MetaTOC stay on top of your field, easily

Hybrid pseudo-relevance feedback for microblog retrieval

, , ,

Journal of Information Science

Published online on

Abstract

The microblog has become a new global hot spot. Information retrieval (IR) technologies are necessary for accessing the massive amounts of valuable user-generated contents in the microblog sphere. The challenge in searching relevant microblogs is that they are usually very short with sparse vocabulary and may fail to match queries. Pseudo-relevance feedback (PRF) via query-expansion has been proven in previous studies to successfully increase the number of matches in IR. However, a critical problem of PRF is that the pseudo-relevant feedback may not be truly relevant, and thus may introduce noise to query expansion. In this paper, we exploit the dynamic nature of microblogs to address this problem. We first present a novel dynamic PRF technique, which is capable of expanding queries with truly relevant keywords by extracting representative terms based on the query’s temporal profile. Next we present query expansion from external knowledge sources based on negative and positive feedbacks. We further consider that the choice of PRF strategy is query-dependent. A two-level microblog search framework is presented. At the high level, a temporal profile is constructed and categorized for each query; at the low level, hybrid PRF query expansion combining dynamic and external PRF is adopted based on the query category. Experiments on a real data set demonstrate that the proposed method significantly increases the performance of microblog searching, compared with several traditional retrieval models, various query expansion methods and state-of-art recency-based models for microblog searching.