Distributed Approach to Feature Selection From Very Large Data Sets Using BLEM2

Authors:

Chien-Chung Chan, Dept. of Comput. Sci., Akron Univ., OH; S. Selvaraj

Abstract:

Feature selection is an important step in preprocessing raw data for data mining. It involves eliminating redundant and irrelevant features from a dataset to obtain a subset of features that performs as efficiently as the complete set. The wrapper approach to feature selection uses an induction algorithm to select the features. Most induction algorithms fail to handle large datasets; the obvious way to cope with such datasets is "divide and conquer". This paper introduces a strategy for finding features from a collection of distributed subsets using the BLEM2 rule-based inductive learning program. Heuristics for determining the proper number of subsets and the proper subset size are proposed. The proposed strategy has been tested on the intrusion detection systems dataset made available by MIT Lincoln Labs.
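The divide-and-conquer idea in the abstract can be sketched as follows. Since BLEM2 itself is not publicly packaged, `induce_features` below is a hypothetical stand-in for a rule-based learner that reports which feature indices its induced rules actually use; taking the union of the per-subset selections is one plausible merge step, not necessarily the paper's exact method.

```python
from typing import List, Sequence, Set

Record = Sequence[float]

def induce_features(subset: List[Record], labels: List[int]) -> Set[int]:
    """Hypothetical stand-in for a rule-based learner such as BLEM2:
    returns the indices of features that would appear in induced rules.
    As a crude proxy, keep any feature that is not constant in the subset."""
    selected: Set[int] = set()
    n_features = len(subset[0])
    for j in range(n_features):
        values = {row[j] for row in subset}
        if len(values) > 1:  # feature varies, so it can discriminate
            selected.add(j)
    return selected

def distributed_feature_selection(data: List[Record],
                                  labels: List[int],
                                  n_subsets: int) -> List[int]:
    """Split the data into n_subsets chunks, select features on each
    chunk independently, then merge the selections by union."""
    size = max(1, len(data) // n_subsets)
    chosen: Set[int] = set()
    for start in range(0, len(data), size):
        chunk = data[start:start + size]
        chunk_labels = labels[start:start + size]
        chosen |= induce_features(chunk, chunk_labels)
    return sorted(chosen)

# Toy example: feature 1 is constant everywhere and is dropped.
data = [(0.0, 5.0, 1.0), (1.0, 5.0, 2.0), (0.0, 5.0, 3.0), (1.0, 5.0, 4.0)]
labels = [0, 1, 0, 1]
print(distributed_feature_selection(data, labels, n_subsets=2))  # [0, 2]
```

The merge-by-union choice errs on the side of keeping features; a wrapper evaluation over the merged subset could then prune further, and the paper's heuristics would govern how `n_subsets` and the chunk size are chosen.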

Published in:

NAFIPS 2006: Annual Meeting of the North American Fuzzy Information Processing Society

Date of Conference:

3-6 June 2006