It is well known that a user's interests are reflected in his or her web browsing history and can be mined from it. This paper presents an innovative method to extract a user's interests from that history. We first apply an efficient algorithm to extract useful text from the web pages in the user's sequence of browsed URLs. We then propose a Labeled Latent Dirichlet Allocation with Topic Feature (LLDA-TF) model to mine the user's interests from the text. Unlike other approaches, which need large amounts of training data to train a model that incorporates supervised information, we introduce the raw supervised information directly into the LLDA-TF procedure. The experimental results show that the topics produced by LLDA-TF fit the predefined categories well. Furthermore, the LLDA-TF model can label user interests with category words and with a keyword list for each category.
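The text-extraction step can be illustrated with a minimal sketch. It is not the extraction algorithm used in this paper, which the abstract does not specify. It only assumes that "useful text" excludes script and style content, and the names `VisibleTextExtractor` and `extract_text` are hypothetical.

```python
from html.parser import HTMLParser


class VisibleTextExtractor(HTMLParser):
    """Collects text that is not inside <script>, <style>, or <noscript>.

    Illustrative assumption only: the paper's own extraction algorithm
    is not described in the abstract and may differ substantially.
    """

    SKIPPED_TAGS = {"script", "style", "noscript"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0  # > 0 while inside a skipped element
        self._chunks = []     # non-empty text fragments, in page order

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only text that appears outside skipped elements.
        if self._skip_depth == 0:
            text = data.strip()
            if text:
                self._chunks.append(text)

    def text(self):
        return " ".join(self._chunks)


def extract_text(html):
    """Return the visible text of one HTML page as a single string."""
    parser = VisibleTextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()
```

Applying `extract_text` to each page in a user's browsed URL sequence would give one text document per page, which is the kind of input a topic model such as LLDA-TF operates on.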