Classification of news-related tweets
Journal of Information Science
Published online on June 17, 2016
Abstract
Gauging public opinion about news articles is valuable, and microblogs such as Twitter have become a popular and important medium for sharing ideas. A substantial portion of tweets relate to news or events. Our aim is to find tweets about newspaper reports and to measure the popularity of these reports on Twitter. Matching informal, very short tweets with formal news reports is, however, a challenging task. In this study, we formulate the problem as a supervised classification task. We propose building the training set from tweets that contain a link to a news article together with the content of that article. We preprocess tweets by removing unnecessary words and symbols and apply stemming by means of morphological analysers. We apply binary classifiers and anomaly detection to the task, and we also propose a textual similarity-based approach. We observed that preprocessing tweets increases accuracy, and that the textual similarity method achieves the highest recognition rate. In some cases, classification performance improves when the text of the news report is included in the training set alongside the tweets that link to it. We argue that measuring the trends of national newspaper reports on social media directly from tweet text is more informative than Twitter analyses based on hashtags. Given the limited number of scientific studies on Turkish tweets, this study contributes to the literature.
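To make the textual similarity idea concrete, the following is a minimal sketch, not the method used in the study. It assumes bag-of-words cosine similarity as the similarity measure, a small illustrative English stopword list, and an arbitrary decision threshold of 0.2; none of these are specified in the abstract. The function names (`preprocess`, `cosine_similarity`, `is_related`) are hypothetical, and stemming with a morphological analyser is omitted.

```python
import math
import re
from collections import Counter

# Hypothetical stopword list, for illustration only (not from the study).
STOPWORDS = {"the", "a", "an", "of", "in", "on", "to", "and", "is", "rt"}


def preprocess(text):
    """Lowercase, strip links, mentions and symbols, then drop stopwords."""
    text = text.lower()
    text = re.sub(r"https?://\S+", " ", text)  # remove links
    text = re.sub(r"@\w+", " ", text)          # remove user mentions
    tokens = re.findall(r"\w+", text)          # keeps hashtag words, drops '#'
    return [t for t in tokens if t not in STOPWORDS]


def cosine_similarity(tokens_a, tokens_b):
    """Cosine similarity between two token lists using raw term counts."""
    a, b = Counter(tokens_a), Counter(tokens_b)
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def is_related(tweet, news_text, threshold=0.2):
    """Label a tweet as news-related if similarity reaches the assumed threshold."""
    score = cosine_similarity(preprocess(tweet), preprocess(news_text))
    return score >= threshold


news = "Central bank raises interest rates to curb inflation"
related = "RT @user: central bank raises rates again! #inflation http://t.co/abc"
unrelated = "Great match tonight, what a goal!"

print(is_related(related, news))    # shares five content words with the report
print(is_related(unrelated, news))  # shares no content words with the report
```

In practice the threshold would be tuned on labelled data, and the token lists would be stemmed first so that inflected forms of the same word match.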