Scalable feature mining for sequential data

3 Author(s)
N. Lesh (MERL, Cambridge, MA, USA); M. J. Zaki; M. Ogihara

Many real-world data sets contain irrelevant or redundant attributes. This might be because the data was collected without data mining in mind or without a priori knowledge of the attribute dependences. Many data mining methods, such as classification and clustering, lose prediction accuracy when trained on data sets containing redundant or irrelevant attributes or features. Selecting the right feature set can not only improve accuracy but also reduce the running time of the predictive algorithms and lead to simpler, more understandable models. Good feature selection is thus a fundamental data preprocessing step in data mining. To provide good feature selection for sequential domains, we developed FeatureMine, a scalable feature mining algorithm that combines two powerful data mining paradigms: sequence mining and classification algorithms. Tests on three practical domains demonstrate that FeatureMine can efficiently handle very large data sets with thousands of items and millions of records.

Published in:

IEEE Intelligent Systems and their Applications (Volume: 15, Issue: 2)