MetaTOC stay on top of your field, easily

Generating query suggestions by exploiting latent semantics in query logs

,

Journal of Information Science

Published online on

Abstract

Search engines assist users in expressing their information needs more accurately by reformulating the issued queries automatically and suggesting the generated formulations to the users. Many approaches to query suggestion draw on the information stored in query logs, recommending recorded queries that are textually similar to the current user’s query or that frequently co-occurred with it in the past. In this paper, we propose an approach that concentrates on deducing the actual information need from the user’s query. The challenge therein lies not only in processing keyword queries, which are often short and possibly ambiguous, but especially in handling the complexity of natural language that allows users to express the same or similar information needs in various differing ways. We expect a higher-level semantic representation of a user’s query to more accurately reflect the information need than the explicit query terms alone can. To this aim, we employ latent Dirichlet allocation as a probabilistic topic model to reveal latent semantics in the query log. Our evaluations show that, whereas purely topic-based query suggestion performs the worst, the interpolation of our proposed topic-based model with the baseline word-based model that generates suggestions based on matching query terms achieves significant improvements in suggestion quality over the already well performing purely word-based approach.