Summary generation approaches based on semantic analysis for news documents

Journal of Information Science

Abstract

With the exponential growth of the internet, a large volume of online news is published every day. The news stream flows so rapidly that no one has time to read every item, so readers naturally prefer to receive updated information at regular intervals. Update techniques help individuals acquire new information or knowledge by eliminating out-of-date or redundant content. Existing summarization systems identify the most relevant sentences in a text and combine them into a concise initial summary. To identify important sentences, these systems first determine the features that influence sentence relevance; the salience of each sentence is then calculated from these features, and an initial summary is generated from the most salient sentences at different compression rates. Such initial summaries are built from a fixed batch of documents and do not account for documents that arrive later, so the corresponding summaries need to be updated. An update summarization system addresses this issue by taking into account the documents the user has already read and presenting only fresh or different information. The first step is to create an initial summary based on basic and additional features; the next step is to create an update summary based on the basic, additional and update features. In this paper, two approaches are proposed for generating initial and update summaries from multiple documents about a given news topic. The first approach performs semantic analysis by modifying the vector space model with dependency parse relations and applying latent semantic analysis to it to create a summary. The second approach annotates sentences based on aspects, prepositions and named entities to generate a summary. Experimental results show that the proposed approaches generate better initial and update summaries than existing systems.
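
The two-step pipeline described in the abstract (an initial summary from salient sentences at a chosen compression rate, then an update summary that keeps only information the reader has not yet seen) can be illustrated with a minimal sketch. This is not the paper's method: it swaps in a plain term-frequency salience score and a bag-of-words cosine novelty filter in place of the dependency-parse vector space model, latent semantic analysis and sentence annotation. The function names and the `compression_rate` and `max_overlap` parameters are hypothetical, chosen only for illustration.

```python
import math
import re
from collections import Counter


def tokenize(sentence):
    """Lowercase a sentence and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", sentence.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def salience(sentences):
    """Assumed stand-in feature: mean corpus frequency of a sentence's terms."""
    bags = [Counter(tokenize(s)) for s in sentences]
    corpus = Counter()
    for bag in bags:
        corpus.update(bag)
    scores = []
    for bag in bags:
        length = sum(bag.values())
        total = sum(corpus[t] * c for t, c in bag.items())
        scores.append(total / length if length else 0.0)
    return scores


def initial_summary(sentences, compression_rate=0.3):
    """Keep the most salient sentences, preserving their original order."""
    if not sentences:
        return []
    k = max(1, math.ceil(len(sentences) * compression_rate))
    scores = salience(sentences)
    ranked = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
    return [sentences[i] for i in sorted(ranked[:k])]


def update_summary(new_sentences, read_sentences,
                   compression_rate=0.3, max_overlap=0.5):
    """Summarize only sentences that are dissimilar to everything already read."""
    read_bags = [Counter(tokenize(s)) for s in read_sentences]
    fresh = [
        s for s in new_sentences
        if all(cosine(Counter(tokenize(s)), r) <= max_overlap for r in read_bags)
    ]
    return initial_summary(fresh, compression_rate)
```

For example, if the reader has already seen "The storm hit the coast on Monday." and a later batch repeats that sentence alongside "Officials opened three shelters on Tuesday.", `update_summary` drops the repeated sentence and keeps only the new one. A more faithful version would replace `salience` with scores derived from the semantic analysis the paper describes.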