Support vector machine for customized email filtering based on improving latent semantic indexing

Authors: Qing Yang (Sch. of Inf. Eng., Wuhan Univ. of Technol., China); Fang-min Li

Latent semantic indexing (LSI) is an important method for information retrieval (IR). It automatically transforms the original textual data into a smaller semantic space by taking advantage of the implicit, or latent, higher-order structure in the associations of words with customized objects, and it has also been applied successfully to text classification. LSI can resolve the problems of polysemy and synonymy and can reduce noise in the raw document-term matrix. However, LSI is not an optimal approach to text classification: because it is a completely unsupervised method that ignores category discrimination, it often degrades classification performance when applied to the whole set of training documents. In this paper, to prevent the spread of unsolicited email and harmful messages in a multi-language (Chinese and English) setting, we develop a system that filters email by customized topic. We represent each topic in a latent semantic model and use LSI to extract features from predefined email categories and document categories. The system can filter and recognize customized or special unwanted Chinese and English emails using a supervised learning approach based on positive examples. We propose an improved LSI that raises classification performance by performing a separate singular value decomposition (SVD) on the transformed local region of each category, and we apply a support vector machine (SVM) text classifier to recognize and filter email. Experimental results show that our approach is very effective and has good filtering performance.
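The idea of a separate SVD per category followed by an SVM can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not say how each category's local region is transformed, so the sketch assumes the region is simply that category's own training documents. The toy document-term matrix, the function names `local_lsi_bases` and `train_linear_svm`, and the plain hinge-loss subgradient trainer (a stand-in for a real SVM solver) are all hypothetical and chosen for illustration.

```python
import numpy as np

def local_lsi_bases(X, y, k):
    """Run a separate truncated SVD on the documents of each category.

    Assumption: the "local region" of a category is just its own training
    documents; the abstract does not specify the transformation used.
    """
    bases = []
    for label in np.unique(y):
        # Rows of Vt are directions in term space, strongest first.
        _, _, Vt = np.linalg.svd(X[y == label], full_matrices=False)
        bases.append(Vt[:k].T)           # shape: terms x k
    return np.hstack(bases)              # shape: terms x (k * n_categories)

def train_linear_svm(F, y, lam=0.01, lr=0.1, epochs=500):
    """Linear SVM via full-batch subgradient descent on the hinge loss."""
    w = np.zeros(F.shape[1])
    b = 0.0
    for _ in range(epochs):
        active = y * (F @ w + b) < 1     # points inside or beyond the margin
        grad_w = lam * w - (y[active][:, None] * F[active]).sum(axis=0) / len(y)
        grad_b = -y[active].sum() / len(y)
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Hypothetical toy data: rows are emails, columns are term counts.
# Terms 0-2 occur only in unwanted mail (+1), terms 3-5 only in wanted mail (-1).
X = np.array([
    [2, 1, 0, 0, 0, 0],
    [1, 2, 1, 0, 0, 0],
    [0, 1, 2, 0, 0, 0],
    [0, 0, 0, 2, 1, 0],
    [0, 0, 0, 1, 2, 1],
    [0, 0, 0, 0, 1, 2],
], dtype=float)
y = np.array([1, 1, 1, -1, -1, -1])

B = local_lsi_bases(X, y, k=1)           # one latent direction per category
w, b = train_linear_svm(X @ B, y)        # SVM trained on the local-LSI features

def classify(doc):
    """Project a term-count vector into the local-LSI space and classify it."""
    return int(np.sign(doc @ B @ w + b))

new_docs = np.array([[1, 1, 1, 0, 0, 0],
                     [0, 0, 0, 1, 1, 1]], dtype=float)
predictions = [classify(d) for d in new_docs]
print(predictions)                       # prints [1, -1]
```

Because each category contributes its own latent directions, the projected features keep category-specific structure that a single global SVD over all training documents could blur. In practice the subgradient trainer would likely be replaced by an established SVM library, and `k` would be tuned per category.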

Published in:
Machine Learning and Cybernetics, 2005. Proceedings of 2005 International Conference on (Volume: 6)

Date of Conference: 18-21 Aug. 2005

