MetaTOC stay on top of your field, easily

Effectiveness of Combining Statistical Tests and Effect Sizes When Using Logistic Discriminant Function Regression to Detect Differential Item Functioning for Polytomous Items

, ,

Educational and Psychological Measurement

Published online on

Abstract

The objective of this article was to find an optimal decision rule for identifying polytomous items with large or moderate amounts of differential functioning. The effectiveness of combining statistical tests with effect size measures was assessed using logistic discriminant function analysis and two effect size measures: R2 and conditional log odds ratio in delta scale (LR). Four independent variables were manipulated: (a) different sample sizes for the reference and focal groups (1,000/500, 1,000/250, 500/250), (b) impact between reference and focal group (equal-ability distribution, i.e., no impact; or different-ability distribution, i.e., impact), (c) the percentage of differential item functioning (DIF) items in a test (0%, 12%, i.e., only the first three items of the test; 20%, i.e., the first five items of the test; 32%, i.e., the first eight items of the test), and (d) direction of DIF (one-sided and both-sided). The magnitudes of DIF were indirectly manipulated through the percentage of DIF items and DIF direction, and they were simulated to be moderate or large. The results show that the false positive rates were low when an effect size decision rule was used in combination with a statistical test, and they were very low when R2 effect size criteria were applied. With respect to power, when a statistical test was used in conjunction with effect size criteria to determine whether an item exhibited a meaningful magnitude of DIF, we found when using the LRdecision rule that the percentage of meaningful DIF items was higher with greater amounts of DIF. Examining DIF by means of blended statistical tests, in other words, those incorporating both the p value and effect size measures, can be recommended as a procedure for classifying items displaying DIF.