Inferring Personality From Social Media Activity Using Large Language Models: Cross‐Model Agreement, Temporal Stability, and Convergent Validity With Self‐Reports

Journal of Personality, EarlyView

Published online.

Abstract

["Journal of Personality, EarlyView. ", "\nABSTRACT\n\nIntroduction\nLarge language models (LLMs) offer a promising approach to infer personality traits unobtrusively from digital footprints. However, the reliability and validity of these inferences remain underexplored.\n\n\nMethod\nGemini 1.5 Pro and GPT‐4o were used to infer Big Five traits from 2 years of Facebook posts by 1214 Italian users. Predictions were compared to self‐reports on the Ten‐Item Personality Inventory.\n\n\nResults\nLLM predictions underestimated Agreeableness and Conscientiousness, overestimated Extraversion, while Neuroticism and Openness closely aligned with self‐report means. On repeated prompting, Gemini 1.5 Pro inferences showed less variability than GPT‐4o, with both models achieving excellent reliability when aggregating inferences. Temporal stability was highest when combining predictions across LLMs, with test–retest correlations over 2 years ranging from 0.44 for Conscientiousness to 0.60 for Openness. Cross‐LLM agreement was highest when combining inferences from multiple time points, with correlations ranging from 0.58 for Neuroticism to 0.83 for Extraversion. Correlations with self‐reports were modest, reaching 0.27 for Extraversion, 0.24 for Agreeableness, 0.23 for Conscientiousness, 0.18 for Neuroticism, and 0.31 for Openness when combining LLM inferences across LLMs and time points.\n\n\nConclusion\nThese findings advance understanding of LLMs' potential for personality inference, highlighting the importance of aggregating inferences to enhance the reliability and validity of such assessments.\n\n"]