This paper proposes to enhance search query log analysis by taking into account the semantic properties of query terms. We first describe a method for extracting a global semantic representation of a search query log, and then show how this representation can be used to extract user interests. The global representation consists of a taxonomy that organizes query terms by generalization/specialization (“is a”) semantic relations, together with a function that measures the semantic distance between terms. We then define a query-term clustering algorithm that is applied to this representation to extract user interests. The approach is evaluated on large real-life logs of a popular search engine.
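
As a rough illustration of the kind of representation described above, the following minimal Python sketch builds a toy “is a” taxonomy, measures the distance between two terms as the number of edges on the path through their lowest common ancestor, and groups terms whose distance falls under a threshold. Everything here is an assumption made for illustration: the terms, the `PARENT` mapping, the path-length distance, the single-link grouping, and the `max_dist` parameter are hypothetical and are not taken from the paper, which does not specify its distance function or clustering algorithm in this abstract.

```python
from itertools import combinations

# Hypothetical toy taxonomy: each term maps to its "is a" parent (None = root).
PARENT = {
    "entity": None,
    "vehicle": "entity",
    "car": "vehicle",
    "sedan": "car",
    "suv": "car",
    "animal": "entity",
    "dog": "animal",
    "poodle": "dog",
}


def ancestors(term):
    """Return the path from term up to the root, starting with term itself."""
    path = []
    while term is not None:
        path.append(term)
        term = PARENT[term]
    return path


def distance(a, b):
    """Assumed distance: edge count between a and b via their lowest common ancestor."""
    steps_from_b = {t: i for i, t in enumerate(ancestors(b))}
    for steps_from_a, t in enumerate(ancestors(a)):
        if t in steps_from_b:
            return steps_from_a + steps_from_b[t]
    raise ValueError("terms share no common ancestor")


def cluster(terms, max_dist):
    """Single-link grouping: merge two clusters if any cross pair is within max_dist."""
    clusters = [{t} for t in terms]
    merged = True
    while merged:
        merged = False
        for c1, c2 in combinations(clusters, 2):
            if any(distance(a, b) <= max_dist for a in c1 for b in c2):
                clusters.remove(c1)
                clusters.remove(c2)
                clusters.append(c1 | c2)
                merged = True
                break  # restart the scan over the updated cluster list
    return clusters


if __name__ == "__main__":
    print(distance("sedan", "suv"))
    print(cluster(["sedan", "suv", "poodle", "dog"], max_dist=2))
```

In this sketch, sibling terms such as "sedan" and "suv" end up in the same group because they share a close ancestor, while terms from distant branches of the taxonomy stay apart. A real query log would need a much larger taxonomy and probably a distance that also accounts for depth or term frequency.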