Topic Modeling: Latent Semantic Analysis for the Social Sciences*
Published online on September 07, 2018
Abstract
Objective
Topic modeling (TM) refers to a group of methods for mathematically identifying latent topics in large corpora of text. Although TM shows promise as a tool for social science research, many researchers are unaware of its utility. Therefore, this article provides a brief overview of TM's logic and processes, offers a simple example, and suggests several possible uses in the social sciences.
Methods
Using latent semantic analysis in our example, we analyzed transcripts of the 2016 U.S. presidential debates between Hillary Clinton and Donald Trump.
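As a rough illustration of the general technique (not the authors' actual pipeline, which the abstract does not describe), latent semantic analysis can be sketched as a truncated singular value decomposition of a term-document matrix. The function name `lsa_topics`, its parameters, and the four placeholder documents below are all hypothetical; the documents are invented keyword strings, not debate transcripts, and raw counts are used where a real analysis would likely apply cleaning and term weighting.

```python
import numpy as np


def lsa_topics(docs, n_topics=2, n_terms=3):
    """Return the highest-loading terms for each latent dimension (hypothetical helper)."""
    # Naive whitespace tokenization; real transcripts would need proper cleaning.
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({tok for doc in tokenized for tok in doc})
    index = {term: i for i, term in enumerate(vocab)}

    # Term-document count matrix: rows are terms, columns are documents.
    counts = np.zeros((len(vocab), len(docs)))
    for j, doc in enumerate(tokenized):
        for tok in doc:
            counts[index[tok], j] += 1

    # SVD of the count matrix; keeping only the largest singular values
    # and their term vectors gives the low-rank "latent" dimensions.
    u, s, _ = np.linalg.svd(counts, full_matrices=False)

    topics = []
    for k in range(n_topics):
        # Rank terms by the magnitude of their loading on dimension k.
        top = np.argsort(-np.abs(u[:, k]))[:n_terms]
        topics.append([vocab[i] for i in top])
    return topics, s[:n_topics]


# Placeholder documents (invented for illustration only).
docs = [
    "taxes jobs economy trade",
    "jobs economy wages taxes",
    "immigration border security",
    "border security immigration policy",
]

topics, singular_values = lsa_topics(docs)
for k, terms in enumerate(topics):
    print(f"dimension {k}: {terms}")
```

Each retained dimension groups terms that tend to co-occur across documents, which is the sense in which the method surfaces "latent topics"; interpreting those dimensions remains a judgment call for the researcher.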
Results
Resulting topics paralleled the most frequent policy‐related Internet searches at the time. When divided by candidate, changes in emergent topics reflected individual policy stances, with nuanced differences between the two.
Conclusion
Findings underscored the utility of TM to identify thematic patterns embedded in large quantities of text. TM, therefore, represents a valuable addition to the social scientist's methodological tool set.
Social Science Quarterly, EarlyView.