Document Topic Extraction Based on Wikipedia Category

5 Author(s)
Jiali Yun ; Sch. of Comput. & Inf. Technol., Beijing Jiaotong Univ., Beijing, China ; Liping Jing ; Jian Yu ; Houkuan Huang

Document topic extraction aims to describe the topics of documents with a few key phrases. It can be applied to web document categorization and tagging, topic description for document clusters, and information retrieval tasks. In this paper, we propose a Wikipedia category-based document topic extraction method. Each document is mapped to a set of Wikipedia categories and represented as a graph structure in order to preserve the relationships between those categories. Document topics can then be extracted by clustering the related Wikipedia categories in the document collection. Experiments on real data show that the Wikipedia category-based method achieves better results than latent topic modeling methods such as LDA.
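The pipeline outlined in the abstract (map documents to categories, link the categories in a graph, cluster the graph) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the mapping, the edge definition, or the clustering algorithm. The `TERM_TO_CATEGORIES` lookup, the co-occurrence edges, and the use of connected components as the clustering step are all assumptions made purely for illustration.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical toy lookup from terms to Wikipedia-style categories.
# A real system would query Wikipedia; this table is an assumption.
TERM_TO_CATEGORIES = {
    "python": ["Programming languages"],
    "compiler": ["Programming languages", "Software engineering"],
    "debugger": ["Software engineering"],
    "guitar": ["Musical instruments"],
    "piano": ["Musical instruments", "Keyboard instruments"],
}


def map_to_categories(text):
    """Return the set of categories triggered by terms in the text."""
    categories = set()
    for token in text.lower().split():
        categories.update(TERM_TO_CATEGORIES.get(token, []))
    return categories


def build_category_graph(documents):
    """Link two categories whenever they occur in the same document (assumed edge rule)."""
    graph = defaultdict(set)
    for doc in documents:
        categories = map_to_categories(doc)
        for category in categories:
            graph[category]  # make sure isolated categories still appear as nodes
        for a, b in combinations(sorted(categories), 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph


def cluster_categories(graph):
    """Group categories by connected component, a simple stand-in for real clustering."""
    seen, clusters = set(), []
    for start in sorted(graph):
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            component.add(node)
            stack.extend(graph[node] - seen)
        clusters.append(sorted(component))
    return clusters


docs = ["python compiler debugger", "guitar piano"]
topics = cluster_categories(build_category_graph(docs))
print(topics)
```

Each resulting cluster is a group of related categories that could serve as a topic description for the collection. A closer reproduction of the paper would likely need weighted edges and a proper graph clustering method.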

Published in:

Computational Sciences and Optimization (CSO), 2011 Fourth International Joint Conference on

Date of Conference:

15-19 April 2011