A language-model-based approach for subjectivity detection

Journal of Information Science

Published online on

Abstract

The rapid growth of opinionated text on the Web increases the demand for efficient methods of detecting subjective text. In this paper, we propose a subjectivity detection method that uses a language-model-based structure to assign each document a subjectivity score that is not affected by the document's topic relevance. To compensate for the limited content of short documents, we further propose an expansion method that yields better estimates of the language models. Because the lack of linguistic resources makes subjectivity detection difficult in resource-lean languages such as Persian, the method is proposed in two versions: a semi-supervised version for resource-lean languages and a supervised version. Experimental evaluations on five datasets in two languages, English and Persian, demonstrate that the method performs well in distinguishing subjective documents from objective ones in both languages.
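
The abstract does not state the scoring formula, so the following is only a minimal sketch of what a language-model-based subjectivity score could look like, not the authors' method. It assumes two unigram language models (one estimated from subjective text, one from objective text), add-one smoothing, and a length-normalized log-likelihood ratio as the score. The function names and the toy training sentences are hypothetical, and the sketch covers neither the topic-independence property nor the expansion method for short documents.

```python
import math
from collections import Counter


def train_unigram(docs):
    """Count token frequencies over whitespace-tokenized documents."""
    counts = Counter()
    for doc in docs:
        counts.update(doc.lower().split())
    return counts


def log_prob(token, counts, vocab_size):
    """Add-one (Laplace) smoothed log-probability of a token (assumed smoothing)."""
    total = sum(counts.values())
    return math.log((counts[token] + 1) / (total + vocab_size))


def subjectivity_score(doc, subj_counts, obj_counts):
    """Length-normalized log-likelihood ratio; positive values lean subjective."""
    vocab_size = len(set(subj_counts) | set(obj_counts))
    tokens = doc.lower().split()
    if not tokens:
        return 0.0
    ratio = sum(
        log_prob(t, subj_counts, vocab_size) - log_prob(t, obj_counts, vocab_size)
        for t in tokens
    )
    return ratio / len(tokens)


# Toy, made-up training data purely for illustration.
subj = train_unigram(["i love this great movie", "what a terrible awful plot"])
obj = train_unigram(["the movie runs two hours", "the plot follows a detective"])

print(subjectivity_score("i love this plot", subj, obj))
print(subjectivity_score("the movie runs two hours", subj, obj))
```

Under these assumptions, a document whose words are more probable under the subjective model than under the objective one receives a positive score, and a threshold at zero separates the two classes. Dividing by the document length keeps scores comparable across documents of different sizes.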