Topic segmentation using word-level semantic relatedness functions
Journal of Information Science
Published online on September 04, 2015
Abstract
Semantic relatedness deals with the problem of measuring how closely two words are related to each other. While there is a large body of research on developing new measures, the use of semantic relatedness (SR) measures in topic segmentation has not been explored. In this research, the performance of different SR measures is evaluated on the topic segmentation problem. To this end, two topic segmentation algorithms that use the difference in SR of words are introduced. Our results indicate that using an SR measure trained on a general-domain corpus achieves better results than topic segmentation algorithms using WordNet or simple word repetition. Furthermore, when compared with computationally more complex algorithms performing global analysis, our local analysis, enhanced with general-domain lexical semantic information, achieves comparable results.
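The idea of local analysis driven by word-level SR can be illustrated with a minimal sketch. This is not either of the two algorithms introduced in the paper, whose details the abstract does not give. It assumes a hypothetical hand-built relatedness table (`TOY_SR`), a hypothetical scoring function (`block_relatedness`) that averages pairwise SR between adjacent sentences, and an arbitrary threshold of 0.3. A real system would replace `toy_sr` with a measure trained on a general-domain corpus.

```python
from itertools import product
from typing import Callable, List, Sequence

# Hypothetical toy relatedness scores, for illustration only.
# A real SR measure would be learned from a general-domain corpus.
TOY_SR = {
    frozenset(("cat", "dog")): 0.8,
    frozenset(("cat", "pet")): 0.7,
    frozenset(("dog", "pet")): 0.7,
    frozenset(("stock", "market")): 0.9,
    frozenset(("stock", "price")): 0.8,
    frozenset(("market", "price")): 0.8,
}


def toy_sr(a: str, b: str) -> float:
    """Return a relatedness score in [0, 1]; identical words score 1.0."""
    if a == b:
        return 1.0
    return TOY_SR.get(frozenset((a, b)), 0.0)


def block_relatedness(
    left: Sequence[str],
    right: Sequence[str],
    sr: Callable[[str, str], float],
) -> float:
    """Average pairwise SR between the words of two adjacent sentences."""
    pairs = list(product(left, right))
    if not pairs:
        return 0.0
    return sum(sr(a, b) for a, b in pairs) / len(pairs)


def segment(
    sentences: Sequence[Sequence[str]],
    sr: Callable[[str, str], float],
    threshold: float = 0.3,  # assumed cut-off, not taken from the paper
) -> List[int]:
    """Return indices i such that a topic boundary falls before sentence i."""
    boundaries = []
    for i in range(len(sentences) - 1):
        # Local analysis: only the two neighbouring sentences are compared.
        if block_relatedness(sentences[i], sentences[i + 1], sr) < threshold:
            boundaries.append(i + 1)
    return boundaries


sentences = [
    ["cat", "dog"],
    ["dog", "pet"],
    ["stock", "market"],
    ["market", "price"],
]
boundaries = segment(sentences, toy_sr)
print(boundaries)
```

Because each decision looks only at neighbouring sentences, the cost grows linearly with the number of sentences, which is the kind of trade-off against global analysis that the abstract refers to.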