Feature Selection for the Topic-Based Mixture Model in Factored Classification

1 Author(s)
Qiong Chen ; Sch. of Comput. Sci. & Eng., South China Univ. of Technol., Guangzhou

The topic-based mixture model (TBMM) is a learning algorithm for factored classification. In factored classification, the class label is factored into a vector of class features. For example, the class label for a personal Web page at a university might be described by two features: the person's academic discipline and their position (e.g., 'chemistry professor' or 'physics student'). TBMM was proposed as an approach to factored classification of text documents in which each document is assumed to be generated by a mixture of class features. Experiments on factored text classification problems show that TBMM can outperform two other approaches for categories with especially sparse training data. In this paper, we analyze feature selection for TBMM. For TBMM, the feature space can be reduced to a small number of feature terms with a significant improvement in classification accuracy. We present empirical results indicating that TBMM is an adequate method for determining the feature terms for the supervised classification task.
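
To make the idea of a factored class label concrete, here is a minimal Python sketch. It is not the TBMM algorithm from the paper, whose details the abstract does not give. It assumes a two-part label (discipline, position), one add-alpha smoothed word distribution per class-feature value, and fixed mixture weights of 0.5 each; the function names, the toy documents, and those weights are all hypothetical choices made for illustration.

```python
import math
from collections import Counter
from itertools import product


def train(docs, alpha=1.0):
    """Estimate one smoothed word distribution per class-feature value.

    docs: list of (tokens, label), where label is a tuple of class features.
    alpha: add-alpha smoothing constant (an assumption of this sketch).
    """
    vocab = sorted({w for tokens, _ in docs for w in tokens})
    counts = {}  # (slot index, feature value) -> word counts
    for tokens, label in docs:
        for slot, value in enumerate(label):
            counts.setdefault((slot, value), Counter()).update(tokens)
    model = {}
    for key, c in counts.items():
        total = sum(c.values()) + alpha * len(vocab)
        model[key] = {w: (c[w] + alpha) / total for w in vocab}
    return model, vocab


def log_likelihood(tokens, label, model, vocab, weights):
    """Score a document under a weighted mixture of its label's features."""
    known = set(vocab)
    score = 0.0
    for w in tokens:
        if w not in known:
            continue  # ignore words never seen in training
        p = sum(weights[slot] * model[(slot, value)][w]
                for slot, value in enumerate(label))
        score += math.log(p)
    return score


def classify(tokens, model, vocab, weights=(0.5, 0.5)):
    """Return the feature combination with the highest mixture likelihood."""
    values = {}
    for slot, value in model:
        values.setdefault(slot, set()).add(value)
    candidates = product(*(sorted(values[s]) for s in sorted(values)))
    return max(candidates,
               key=lambda label: log_likelihood(tokens, label, model, vocab, weights))


# Toy training set: no 'chemistry student' page is ever observed.
docs = [
    (["reaction", "molecule", "lecture", "tenure"], ("chemistry", "professor")),
    (["quantum", "particle", "homework", "exam"], ("physics", "student")),
    (["quantum", "particle", "lecture", "tenure"], ("physics", "professor")),
]
model, vocab = train(docs)
predicted = classify(["molecule", "reaction", "homework", "exam"], model, vocab)
print(predicted)  # ('chemistry', 'student')
```

Because word statistics are shared across class features rather than stored per full label, the sketch can assign the combination 'chemistry student' even though no training document carries it, which is the kind of sparse-data setting the abstract mentions.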

Published in:

Computational Intelligence and Security, 2006 International Conference on (Volume: 1)

Date of Conference:

Nov. 2006