Rater Comparability Scoring and Equating: Does Choice of Target Population Weights Matter in This Context?

Gautam Puhan

Published online on January 23, 2014

Abstract

When a constructed‐response test form is reused, raw scores from the two administrations of the form may not be comparable. The solution to this problem requires a rescoring, at the current administration, of examinee responses from the previous administration. The scores from this “rescoring” can be used as an anchor for equating. In this equating, the choice of weights for combining the samples to define the target population can be critical. In rescored data, the anchor usually correlates very strongly with the new form but only moderately with the reference form. This difference has a predictable impact: the equating results are most accurate when the target population is the reference form sample, least accurate when the target population is the new form sample, and somewhere in the middle when the new form and reference form samples are equally weighted in forming the target population.