Combinatorial PCA and SVM methods for feature selection in learning classifications (applications to text categorization)

2 Author(s)
Anghelescu, A.V.; Dept. of Comput. Sci., Rutgers Univ., USA; Muchnik, I.B.

We describe a purely combinatorial approach to obtaining meaningful representations of text data. More precisely, we describe two methods that realize this approach: combinatorial principal component analysis (cPCA) and combinatorial support vector machines (cSVM). These names emphasise the mathematical analogies between the well-known PCA and SVM, on the one hand, and our respective methods, on the other. To evaluate the selected feature spaces, we used the environment set up for TREC 2002 and a very common classifier, 1-nearest neighbour (1-NN). We compared the results obtained on the feature sets computed by cPCA and cSVM with the results obtained on the original feature space. We showed that selecting a feature space on average 50 times smaller than the original decreases the performance of the classifier by no more than 2%.
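
The evaluation protocol described in the abstract (run 1-NN on the original feature space, then on a selected subset, and compare) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the toy term-count vectors, the labels, the Euclidean distance, and in particular the `keep` index list, which is a hand-picked placeholder and not the output of cPCA or cSVM (the abstract does not specify how those methods select features).

```python
from math import sqrt


def project(vec, keep):
    """Restrict a feature vector to the selected feature indices."""
    return [vec[i] for i in keep]


def one_nn_predict(train_X, train_y, x):
    """Return the label of the training vector closest to x.

    Euclidean distance is an assumption; the abstract does not name a metric.
    """
    best_label, best_dist = None, float("inf")
    for vec, label in zip(train_X, train_y):
        dist = sqrt(sum((a - b) ** 2 for a, b in zip(vec, x)))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def accuracy(train_X, train_y, test_X, test_y, keep=None):
    """1-NN accuracy, optionally after restricting to the features in `keep`."""
    if keep is not None:
        train_X = [project(v, keep) for v in train_X]
        test_X = [project(v, keep) for v in test_X]
    hits = sum(
        one_nn_predict(train_X, train_y, x) == y
        for x, y in zip(test_X, test_y)
    )
    return hits / len(test_y)


# Toy term-count vectors (made up for illustration, not TREC 2002 data).
train_X = [[3, 0, 1, 0], [2, 1, 0, 0], [0, 3, 0, 1], [0, 2, 1, 1]]
train_y = ["sports", "sports", "finance", "finance"]
test_X = [[4, 0, 0, 1], [0, 4, 1, 0]]
test_y = ["sports", "finance"]

# Baseline: all four features.
full_acc = accuracy(train_X, train_y, test_X, test_y)

# Reduced space: a hypothetical selection of two features.
reduced_acc = accuracy(train_X, train_y, test_X, test_y, keep=[0, 1])
```

Comparing `full_acc` with `reduced_acc` mirrors the comparison the abstract reports, with the feature selector left as a pluggable step.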

Published in:

Integration of Knowledge Intensive Multi-Agent Systems, 2003. International Conference on

Date of Conference:

30 Sept.-4 Oct. 2003