Testing the Difference of Correlated Agreement Coefficients for Statistical Significance
Educational and Psychological Measurement
Published online: July 28, 2015
Abstract
This article addresses the problem of testing the difference between two correlated agreement coefficients for statistical significance. A number of authors have proposed methods for testing the difference between two correlated kappa coefficients, but these methods require either resampling or advanced statistical modeling techniques. In this article, we propose a technique similar to the classical paired t test for means, based on a large-sample linear approximation of the agreement coefficient. We illustrate the use of this technique with several well-known agreement coefficients, including Cohen’s kappa, Gwet’s AC1, Fleiss’s generalized kappa, Conger’s generalized kappa, Krippendorff’s alpha, and the Brennan–Prediger coefficient. The proposed method is very flexible, accommodates several types of correlation structure between coefficients, and requires neither advanced statistical modeling skills nor extensive computer programming experience. The validity of the method is tested with a Monte Carlo simulation.
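As a brief sketch of the idea (the notation here is ours, assumed for illustration; the linearized values are developed formally in the body of the article): suppose each coefficient $\hat{\kappa}_g$, $g = 1, 2$, admits a large-sample linear approximation of the form $\hat{\kappa}_g \approx \frac{1}{n}\sum_{i=1}^{n} \kappa^{\star}_{gi}$, where $\kappa^{\star}_{gi}$ is the linearized contribution of subject $i$. The paired-test-style statistic for the difference is then

\[
t \;=\; \frac{\hat{\kappa}_1 - \hat{\kappa}_2}
             {\sqrt{\widehat{\mathrm{Var}}(\hat{\kappa}_1) + \widehat{\mathrm{Var}}(\hat{\kappa}_2)
                    - 2\,\widehat{\mathrm{Cov}}(\hat{\kappa}_1, \hat{\kappa}_2)}}
  \;=\; \frac{\bar{d}}{s_d/\sqrt{n}},
\qquad d_i = \kappa^{\star}_{1i} - \kappa^{\star}_{2i},
\]

where $\bar{d}$ and $s_d$ are the sample mean and standard deviation of the per-subject differences $d_i$. The covariance term is what accounts for the correlation between the two coefficients, playing the same role as the pairing does in the classical paired t test.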