
QSem: A novel question representation framework for question matching over accumulated question-answer data


Journal of Information Science

Published online on

Abstract

This paper proposes a novel question representation framework to assist automated question answering by reusing accumulated question–answer data. The framework, named QSem, defines three types of question words (question-target words, user-oriented words and irrelevant words), along with semantic patterns, for representing a question. The question word types are semantically labelled using a pre-defined ontology to enrich the semantic representation of questions. The semantic patterns, through equivalent pattern linking, enhance ordinary structure matching with the aim of improving question matching performance. We trained QSem on 400 randomly selected questions with semantic patterns and obtained optimized parameters. We then tested 5000 questions from our system; the precision of question matching ranged from 0.71 to 0.93 with respect to various generators, indicating the stability of the approach. We further compared our approach with cosine similarity, WordNet-based semantic similarity and the IBM translation model on a standard TREC dataset containing 5536 questions. The results showed that our approach achieved the best performance, with mean reciprocal rank increased by 7.2% and accuracy increased by 7.5% on average, demonstrating the effectiveness of the approach.
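
To make the comparison above concrete, the following minimal sketch shows how a cosine-similarity baseline and the two reported metrics could be computed. It is not QSem and not the authors' evaluation code: the function names are hypothetical, whitespace tokenization is assumed, and accuracy is assumed here to mean the fraction of queries whose top-ranked candidate is the correct match, which the abstract does not specify.

```python
import math
from collections import Counter


def cosine_similarity(a: str, b: str) -> float:
    """Cosine similarity between bag-of-words vectors of two questions.

    Assumption: lowercase whitespace tokenization, raw term counts.
    """
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[w] * vb[w] for w in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_candidates(query: str, candidates: list[str]) -> list[str]:
    """Order stored questions from most to least similar to the query."""
    return sorted(candidates, key=lambda c: cosine_similarity(query, c), reverse=True)


def mean_reciprocal_rank(ranked_lists: list[list[str]], gold: list[str]) -> float:
    """Average of 1/rank of the correct match; a query with no match scores 0."""
    total = 0.0
    for ranked, answer in zip(ranked_lists, gold):
        for position, candidate in enumerate(ranked, start=1):
            if candidate == answer:
                total += 1.0 / position
                break
    return total / len(ranked_lists)


def top1_accuracy(ranked_lists: list[list[str]], gold: list[str]) -> float:
    """Fraction of queries whose top-ranked candidate is the correct match."""
    hits = sum(1 for ranked, answer in zip(ranked_lists, gold)
               if ranked and ranked[0] == answer)
    return hits / len(ranked_lists)
```

Under these assumptions, each test question would be ranked against the stored questions with `rank_candidates`, and the resulting lists passed to `mean_reciprocal_rank` and `top1_accuracy` together with the known correct matches.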