Traditional information retrieval techniques focus mainly on improving the relevance of the documents retrieved for a query. In most informational (or exploratory) search tasks, however, the formulated query often fails to capture the user's real intent precisely, so the retrieved results may not adequately satisfy the user's information need. If the retrieved results are diverse, at least some of them are likely to meet that need. In this paper, we present a novel search result diversification technique that integrates social interest mined from query logs with a probabilistic model based on query-URL bipartite graphs. The social interest is discovered through kernel principal component analysis on related queries and URLs and is combined with a random walk on the bipartite graph, so that a query retrieves results that are both relevant and diverse. We conducted a set of experiments to validate the technique; the results show that it outperforms existing result diversification techniques in terms of both the relevance and the diversity of the retrieved documents.
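To make the random-walk component concrete, the following is a minimal sketch of a walk on a query-URL bipartite graph built from click counts. It is an illustration only, not the model evaluated in the paper: the abstract does not give the transition probabilities or how the kernel PCA social-interest signal enters the walk, so that part is omitted here. The click data, the function names (`build_transitions`, `step`, `url_scores`), and the `round_trips` parameter are all hypothetical.

```python
from collections import defaultdict


def normalize(weights):
    """Turn raw click counts into a probability distribution."""
    total = sum(weights.values())
    return {node: count / total for node, count in weights.items()}


def build_transitions(clicks):
    """Build query->URL and URL->query transition tables.

    `clicks` maps (query, url) pairs to click counts (hypothetical log data).
    """
    q_to_u = defaultdict(dict)
    u_to_q = defaultdict(dict)
    for (query, url), count in clicks.items():
        q_to_u[query][url] = count
        u_to_q[url][query] = count
    return (
        {q: normalize(w) for q, w in q_to_u.items()},
        {u: normalize(w) for u, w in u_to_q.items()},
    )


def step(dist, transitions):
    """Move probability mass one hop across the bipartite graph."""
    out = defaultdict(float)
    for node, mass in dist.items():
        for neighbor, prob in transitions[node].items():
            out[neighbor] += mass * prob
    return dict(out)


def url_scores(clicks, query, round_trips=1):
    """Score URLs by walking query -> URL -> query ... -> URL.

    Each round trip lets mass reach related queries, so URLs clicked only
    for those related queries can receive a non-zero score.
    """
    q_to_u, u_to_q = build_transitions(clicks)
    dist = {query: 1.0}
    for _ in range(round_trips):
        dist = step(step(dist, q_to_u), u_to_q)
    return step(dist, q_to_u)


# Hypothetical click log: an ambiguous query and a related, more specific one.
clicks = {
    ("jaguar", "cars.example/jaguar"): 6,
    ("jaguar", "zoo.example/jaguar"): 2,
    ("jaguar cat", "zoo.example/jaguar"): 5,
    ("jaguar cat", "wildlife.example/big-cats"): 5,
}

direct = url_scores(clicks, "jaguar", round_trips=0)
walked = url_scores(clicks, "jaguar", round_trips=1)
for url, score in sorted(walked.items(), key=lambda item: -item[1]):
    print(f"{score:.3f}  {url}")
```

With `round_trips=0` only URLs clicked directly for the query are scored. One round trip passes mass through the related query "jaguar cat", so a URL never clicked for "jaguar" enters the candidate set, which is the sense in which a walk over the click graph can broaden the results.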