
Binary Logistic Regression Analysis for Detecting Differential Item Functioning: Effectiveness of R² and Delta Log Odds Ratio Effect Size Measures


Educational and Psychological Measurement

Published online on

Abstract

The authors analyze the effectiveness of the R² and delta log odds ratio effect size measures when using logistic regression analysis to detect differential item functioning (DIF) in dichotomous items. A simulation study was carried out, and the Type I error rate and power estimates under conditions in which only statistical testing was used were compared with the rejection rates obtained when statistical testing was combined with an effect size measure based on recommended cutoff criteria. The manipulated variables were sample size, impact between groups, percentage of DIF items in the test, and amount of DIF. The results showed that false-positive rates were higher when applying only the statistical test than when an effect size decision rule was used in combination with a statistical test. Type I error rates were affected by the number of test items with DIF, as well as by the magnitude of the DIF. With respect to power, when a statistical test was used in conjunction with effect size criteria to determine whether an item exhibited a meaningful magnitude of DIF, the delta log odds ratio effect size measure performed better than R². Power was affected by the percentage of DIF items in the test and also by sample size. The study highlights the importance of using an effect size measure to avoid false identification.
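The combined decision rule described in the abstract (flag an item only when the statistical test is significant and the effect size exceeds a cutoff) might be sketched as follows. This is an illustration, not the authors' implementation. The abstract does not state which cutoffs were used, so the values here are assumptions taken from conventions common in the DIF literature: the -2.35 multiplier that places a logistic regression group coefficient on the ETS delta metric, an absolute delta of 1.0 as the boundary for negligible DIF, and 0.035 as a negligible threshold for the R² difference. The function names and the input values are hypothetical.

```python
# Assumed constants; none of these values appear in the abstract.
DELTA_SCALE = -2.35   # converts a log odds ratio to the ETS delta metric
DELTA_CUTOFF = 1.0    # |delta| below this is treated as negligible DIF
R2_CUTOFF = 0.035     # R-squared difference below this is treated as negligible


def delta_log_odds_ratio(beta_group: float) -> float:
    """Rescale the group coefficient of a logistic regression to the delta metric."""
    return DELTA_SCALE * beta_group


def flag_dif(p_value: float, effect_size: float, cutoff: float,
             alpha: float = 0.05) -> bool:
    """Flag an item only if the test is significant AND the effect size is non-negligible."""
    return p_value < alpha and abs(effect_size) >= cutoff


if __name__ == "__main__":
    # Hypothetical item: significant test, small group coefficient.
    small = delta_log_odds_ratio(0.2)
    print(flag_dif(0.01, small, DELTA_CUTOFF))   # significant but negligible

    # Hypothetical item: significant test, larger group coefficient.
    large = delta_log_odds_ratio(0.6)
    print(flag_dif(0.01, large, DELTA_CUTOFF))   # significant and non-negligible

    # The same rule applied to a hypothetical R-squared difference.
    print(flag_dif(0.01, 0.02, R2_CUTOFF))
```

With large samples a trivially small group coefficient can reach statistical significance, which is why adding the effect size condition lowers the false-positive rate relative to the statistical test alone.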