Tracing the Origin of Rating Bias in a High‐Stakes Chinese–English Translation Test

Zhiqiang Yang, Qing Li, Tao Zhang, Baohua Dong

Published online on April 27, 2026

Abstract

European Journal of Education, Volume 61, Issue 2, June 2026.

Human ratings can introduce rating bias, which undermines the reliability and fairness of a test. While existing studies have identified rating bias as a construct‐irrelevant factor, its sources remain underexplored. This study therefore explored the sources of rating bias in the translation task of a high‐stakes English language test. Nine raters from three universities were invited to rate the responses of 25 students to the translation task, and their rating data were collected on the first and last days of CET‐4 rating. A many‐facet Rasch measurement model and think‐aloud protocols were employed to examine rating bias and explore its sources. The findings revealed that rater reliability was acceptable but decreased on the last day, and that rating bias stemmed mainly from three areas: rating administration, raters' cognition, and construct‐irrelevant factors. Implications for balancing practicality and validity in human rating are discussed in light of these results.