A learning approach for email conversation thread reconstruction

Journal of Information Science

Published online on

Abstract

An email conversation thread is a topic-centric discussion unit composed of emails exchanged among the same group of people through replies or forwards. Detecting the conversation threads in an email corpus helps human readers digest the content of discussions and helps automatic methods extract useful information from the conversations. This research explores two new feature-enriched learning approaches, LExLinC and LExTreC, to reconstruct the linear structure and the tree structure of conversation threads in email data. The work relaxes some simplifying assumptions made by previous thread-extraction methods, which makes the proposed methods more powerful in detecting real conversations. Additionally, because the proposed methods are supervised, they can adapt to new environments by automatically adjusting the features and their weights. Experimental results show that the proposed methods are highly effective in detecting conversation threads and outperform existing methods.