
The normalization of occurrence and co‐occurrence matrices in bibliometrics using cosine similarities and Ochiai coefficients


Journal of the American Society for Information Science and Technology

Published online on

Abstract

We prove that the Ochiai similarity of the co‐occurrence matrix is equal to the cosine similarity in the underlying occurrence matrix. Neither the cosine nor the Pearson correlation should be used for the normalization of co‐occurrence matrices because the similarity is then normalized twice, and therefore overestimated; the Ochiai coefficient can be used instead. Results are shown using a small matrix (5 cases, 4 variables) for didactic reasons, and also using Ahlgren et al.'s (2003) co‐occurrence matrix of 24 authors in library and information sciences. The overestimation is shown numerically and illustrated using multidimensional scaling and cluster dendrograms. If the occurrence matrix is not available (such as in internet research or author cocitation analysis), using Ochiai for the normalization is preferable to using the cosine.
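The equality stated in the abstract can be checked numerically with a minimal sketch. The 5 x 4 binary matrix below is hypothetical and is not the paper's didactic example. The helper names (`cooccurrence`, `ochiai`, `cosine_matrix`) are illustrative. The sketch also assumes that the diagonal of the co‐occurrence matrix holds each variable's total number of occurrences, which is what the Ochiai denominator needs.

```python
import math

# Hypothetical binary occurrence matrix: 5 cases (rows) x 4 variables (columns).
# Illustrative values only; this is NOT the matrix used in the paper.
A = [
    [1, 0, 1, 0],
    [1, 1, 0, 0],
    [0, 1, 1, 1],
    [1, 1, 1, 0],
    [0, 0, 1, 1],
]

def columns(m):
    """Return the columns of a row-major matrix as lists."""
    return [list(col) for col in zip(*m)]

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def cosine(u, v):
    """Cosine similarity between two vectors."""
    return dot(u, v) / math.sqrt(dot(u, u) * dot(v, v))

def cosine_matrix(m):
    """Pairwise cosine similarities between the columns of m."""
    cols = columns(m)
    return [[cosine(u, v) for v in cols] for u in cols]

def cooccurrence(m):
    """Co-occurrence matrix C = A^T A; C[i][i] is the occurrence total of variable i."""
    cols = columns(m)
    return [[dot(u, v) for v in cols] for u in cols]

def ochiai(c):
    """Ochiai coefficient c_ij / sqrt(c_ii * c_jj) for a co-occurrence matrix."""
    n = len(c)
    return [[c[i][j] / math.sqrt(c[i][i] * c[j][j]) for j in range(n)]
            for i in range(n)]

C = cooccurrence(A)
cos_occurrence = cosine_matrix(A)    # cosine on the occurrence matrix
ochiai_cooccurrence = ochiai(C)      # Ochiai on the co-occurrence matrix
cos_cooccurrence = cosine_matrix(C)  # cosine on the co-occurrence matrix (second normalization)

for i in range(len(C)):
    for j in range(i + 1, len(C)):
        print(i, j,
              round(cos_occurrence[i][j], 3),
              round(ochiai_cooccurrence[i][j], 3),
              round(cos_cooccurrence[i][j], 3))
```

In this toy case the first two printed columns of similarities coincide for every pair, while the third (cosine applied to the co‐occurrence matrix) is larger for every pair, which is consistent with the double normalization described above. The sketch only illustrates one small example and does not replace the proof.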