Person entity linking in email with NIL detection
Journal of the American Society for Information Science and Technology
Published online on July 04, 2017
Abstract
For each specific mention of an entity found in a text, the goal of entity linking is to determine whether the referenced entity is present in an existing knowledge base (KB), and if so to determine which KB entity is the correct referent. Entity linking has been well explored for dissemination‐oriented sources such as news stories, blogs, and microblog posts, but the limited work to date on “conversational” sources such as email or text chat has not yet attempted to determine when the referent entity is not in the knowledge base (a task known as “NIL detection”). This article presents a supervised machine learning system that links named mentions of people in email messages to a collection‐specific knowledge base and that is also capable of NIL detection. The system learns from manually annotated training examples to leverage a rich set of features. The entity linking accuracy for entities present in the knowledge base is substantially and significantly better than the best previously reported results on the Enron email collection, and comparable accuracy is reported for the challenging NIL detection task. For the first time, these results are also replicated on a second email collection from a different source, with comparable results.
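To make the task definition concrete, the following is a minimal sketch of the decision that entity linking with NIL detection has to make for a single mention. It is not the system described in this article: the `KBEntity` record, the `name_similarity` function, and the `nil_threshold` value are all hypothetical, and a simple string similarity stands in for the supervised model and rich feature set.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class KBEntity:
    # Hypothetical knowledge-base record: an identifier plus known name variants.
    entity_id: str
    names: Tuple[str, ...]


def name_similarity(mention: str, name: str) -> float:
    # Case-insensitive string similarity in [0, 1]; an assumed stand-in
    # for a learned scoring model.
    return SequenceMatcher(None, mention.lower(), name.lower()).ratio()


def link_mention(mention: str, kb: List[KBEntity],
                 nil_threshold: float = 0.8) -> Optional[str]:
    """Return the best-matching entity_id, or None (NIL) when no
    candidate scores at least nil_threshold."""
    best_id: Optional[str] = None
    best_score = 0.0
    for entity in kb:
        # Score an entity by its best-matching name variant.
        score = max((name_similarity(mention, n) for n in entity.names),
                    default=0.0)
        if score > best_score:
            best_id, best_score = entity.entity_id, score
    return best_id if best_score >= nil_threshold else None


# Toy knowledge base with invented names.
kb = [
    KBEntity("e1", ("Alice Smith", "A. Smith")),
    KBEntity("e2", ("Bob Jones",)),
]
linked = link_mention("alice smith", kb)    # exact match up to case
unlinked = link_mention("Zachary Quint", kb)  # no close candidate, so NIL
```

Here a return value of `None` plays the role of NIL: the mention is judged to refer to someone who is absent from the knowledge base. In a supervised setting, the threshold decision would typically be learned from annotated examples rather than fixed by hand.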