In a pattern recognition system, irrelevant or redundant features not only degrade classifier performance but also lead to the "curse of dimensionality", so feature selection is important. This thesis proposes a new feature subset selection method based on the discrete binary version of particle swarm optimization (BPSO) and overlap information entropy (OIE). The method does not depend on any classifier. The main idea is as follows. First, a group of particles is generated randomly. The OIE between an attribute set and the class attribute serves as the fitness function of the BPSO algorithm; its magnitude indicates the degree of correlation between the selected attribute set and the class attribute. Next, BPSO optimizes the feature subset. Finally, the feature subset with the largest OIE with respect to the class attribute is selected as the optimal feature subset. Experimental results on the Bio_Train dataset of KDDCUP2004 confirm that the method finds the optimal feature subset effectively and that classification results with the selected subset are no worse than those obtained with all features.
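The procedure outlined above (random particles, an entropy-based fitness, BPSO updates, then keeping the best subset) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the thesis's implementation. The abstract does not define OIE, so the fitness here is ordinary mutual information between the selected attributes and the class, used as a stand-in. The function names (`entropy`, `subset_class_information`, `bpso_select`) are hypothetical, and the inertia weight, acceleration constants, velocity clamp, swarm size, and iteration count are commonly used defaults rather than values from the source. Features are assumed to be discrete.

```python
import math
import random
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a list of hashable values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def subset_class_information(rows, classes, mask):
    """ASSUMPTION: stand-in for OIE, which the abstract does not define.

    Returns the mutual information I(S; C) = H(S) + H(C) - H(S, C), where S
    is the joint value of the features whose mask bit is 1.
    """
    if not any(mask):
        return 0.0  # an empty subset carries no information about the class
    projected = [tuple(v for v, keep in zip(row, mask) if keep) for row in rows]
    joint = list(zip(projected, classes))
    return entropy(projected) + entropy(classes) - entropy(joint)


def bpso_select(rows, classes, n_particles=20, n_iters=30,
                w=0.7, c1=1.5, c2=1.5, v_max=4.0, seed=0):
    """Binary PSO: each particle is a 0/1 mask over the features."""
    rng = random.Random(seed)
    d = len(rows[0])
    # Step 1: generate a group of particles randomly.
    pos = [[rng.randint(0, 1) for _ in range(d)] for _ in range(n_particles)]
    vel = [[0.0] * d for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_fit = [subset_class_information(rows, classes, p) for p in pos]
    g = max(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]

    # Step 2: optimize the feature subset with BPSO updates.
    for _ in range(n_iters):
        for i in range(n_particles):
            for j in range(d):
                v = (w * vel[i][j]
                     + c1 * rng.random() * (pbest[i][j] - pos[i][j])
                     + c2 * rng.random() * (gbest[j] - pos[i][j]))
                v = max(-v_max, min(v_max, v))  # clamp the velocity
                vel[i][j] = v
                # A sigmoid turns the velocity into the probability of a 1 bit.
                pos[i][j] = 1 if rng.random() < 1.0 / (1.0 + math.exp(-v)) else 0
            fit = subset_class_information(rows, classes, pos[i])
            if fit > pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], fit
                if fit > gbest_fit:
                    gbest, gbest_fit = pos[i][:], fit

    # Step 3: the mask with the largest fitness is the selected subset.
    return gbest, gbest_fit


# Toy data (made up for illustration): feature 0 equals the class,
# feature 1 is constant, feature 2 is independent of the class.
toy_rows = [(0, 0, 0), (0, 0, 1), (1, 0, 0), (1, 0, 1)] * 2
toy_classes = [row[0] for row in toy_rows]
best_mask, best_fit = bpso_select(toy_rows, toy_classes)
print(best_mask, best_fit)
```

One caveat about the stand-in: plain mutual information never decreases when a feature is added, so it does not penalize redundant features on its own. The OIE measure used in the thesis may address this, but the abstract does not say how.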