Abstract
Increasingly, high-stakes,
large-scale examinations are used to make important decisions
about student achievement. Consequently, it is important
that the scores obtained from these examinations be accurate. This
study compares the estimation accuracy of procedures based on
classical test score theory (CTST) and item response theory (Generalized
Partial Credit Model, GPCM) for examinations consisting of multiple-choice
and extended-response items. Using the British Columbia Scholarship
Examination program, the accuracy of the two procedures was compared
when the scholarship portions of the examinations were removed.
For the subset of examinations investigated, the results indicate
that removing these scholarship portions led to an error rate
of approximately 10%, with approximately 7 out of 10 errors
resulting in the denial of scholarships. The results were similar
for both the CTST and the GPCM, indicating that for mixed-format
examinations the two procedures produce randomly equivalent results.
Implications for policy and future research are discussed.
Copyright © AJER, the Faculty of Education, and the University
of Alberta, 2003.