Bootstrap Standard Errors for Maximum Likelihood Ability Estimates When Item Parameters Are Unknown
Educational and Psychological Measurement
Published online on December 02, 2013
Abstract
When item parameter estimates are used to estimate the ability parameter in item response models, the standard error (SE) of the ability estimate must be corrected to reflect the error carried over from item calibration. For maximum likelihood (ML) ability estimates, a corrected asymptotic SE is available, but it requires a long test and the covariance matrix of item parameter estimates, which may not be available. An alternative SE can be obtained using the bootstrap. The first purpose of this article is to propose a bootstrap procedure for the SE of ML ability estimates when item parameter estimates are used for scoring. The second purpose is to conduct a simulation to compare the performance of the proposed bootstrap SE with the asymptotic SE under different test lengths and different magnitudes of item calibration error. Both SE estimates closely approximated the empirical SE when the test was long (i.e., 40 items) and when the true ability value was close to the mean of the ability distribution. However, neither SE estimate was uniformly superior: the asymptotic SE tended to underpredict the empirical SE, and the bootstrap SE tended to overpredict the empirical SE. The results suggest that the choice of SE depends on the type and purpose of the test. Additional implications of the results are discussed.
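To make the idea of a bootstrap SE that reflects item calibration error concrete, the following is a minimal sketch and not the procedure proposed in the article, whose details are not given in the abstract. It assumes a two-parameter logistic (2PL) model, assumes that calibration error can be mimicked by independent normal perturbations of the item parameter estimates, and bounds the ML ability estimate to [-4, 4] so that all-correct and all-incorrect response patterns yield finite values. All function names, parameter names, and numeric values (`ml_theta`, `bootstrap_se`, `se_a`, `se_b`, and so on) are hypothetical.

```python
import math
import random
import statistics


def p_correct(theta, a, b):
    """2PL probability of a correct response (assumed model)."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def ml_theta(responses, a, b, lo=-4.0, hi=4.0, tol=1e-6):
    """ML ability estimate by bisection on the score function.

    The 2PL log-likelihood is concave in theta, so its derivative
    (the score) is decreasing and has at most one root. The estimate
    is bounded to [lo, hi] (an assumption of this sketch).
    """
    def score(theta):
        return sum(ai * (u - p_correct(theta, ai, bi))
                   for u, ai, bi in zip(responses, a, b))

    if score(lo) <= 0:
        return lo
    if score(hi) >= 0:
        return hi
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if score(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def bootstrap_se(theta_hat, a_hat, b_hat, se_a, se_b, n_boot=500, seed=0):
    """Parametric bootstrap SE of theta_hat (illustrative only).

    Each replicate (1) simulates a response pattern at theta_hat under
    the point estimates, (2) perturbs the item parameters to mimic
    calibration error, and (3) rescores the simulated pattern with the
    perturbed parameters. The SD of the replicates is the bootstrap SE.
    """
    rng = random.Random(seed)
    replicates = []
    for _ in range(n_boot):
        u = [1 if rng.random() < p_correct(theta_hat, ai, bi) else 0
             for ai, bi in zip(a_hat, b_hat)]
        # Keep discriminations positive (assumption of this sketch).
        a_star = [max(0.05, rng.gauss(ai, s)) for ai, s in zip(a_hat, se_a)]
        b_star = [rng.gauss(bi, s) for bi, s in zip(b_hat, se_b)]
        replicates.append(ml_theta(u, a_star, b_star))
    return statistics.stdev(replicates)


if __name__ == "__main__":
    # Hypothetical 10-item test; values are made up for illustration.
    a_hat = [1.0, 1.2, 0.8, 1.5, 1.1, 0.9, 1.3, 1.0, 0.7, 1.4]
    b_hat = [-1.5, -1.0, -0.5, 0.0, 0.0, 0.5, 0.5, 1.0, 1.5, 2.0]
    se_a = [0.15] * 10
    se_b = [0.20] * 10
    print(bootstrap_se(0.0, a_hat, b_hat, se_a, se_b))
```

Setting `se_a` and `se_b` to zero in this sketch removes the calibration-error component, so comparing the two results gives a rough sense of how much item parameter uncertainty contributes to the SE under these assumptions.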