This paper proposes a feature selection strategy based on rough set theory (RST) and discrete particle swarm optimization (DPSO) for use prior to protein function classification. RST is adopted in the first phase to eliminate insignificant features and to pass the reduced feature set to the next phase. In the second phase, the reduced features are optimized using DPSO, a recent evolutionary computation method. The optimal features obtained from these two methods are then classified using a support vector machine (SVM) with optimized RBF kernel parameters. Together, these methods greatly reduce the number of features and achieve higher classification accuracy across the selected datasets than either the full feature set or RST alone. The results also demonstrate that the integration of RST and DPSO is capable of finding optimal features for protein classification and is applicable to other classification problems.
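As an illustration of the second phase only, the following is a minimal sketch of feature selection with a binary particle swarm, under stated assumptions. The abstract does not give the DPSO encoding, update rule, or fitness function, so this sketch assumes the common sigmoid-based binary PSO formulation, in which each particle is a 0/1 mask over the features that survive the RST phase. The names `binary_pso` and `toy_fitness`, and all parameter values, are hypothetical. In particular, `toy_fitness` is a stand-in for what would plausibly be the accuracy of an RBF-kernel SVM trained on the selected features; it is not the authors' fitness function.

```python
import math
import random


def toy_fitness(mask):
    """Hypothetical stand-in for classifier accuracy on the selected features.

    Assumes features 0 and 3 are the informative ones and applies a small
    penalty per selected feature, so smaller subsets score higher.
    """
    informative = (0, 3)
    hits = sum(mask[i] for i in informative)
    return hits / len(informative) - 0.05 * sum(mask)


def binary_pso(fitness, n_features, n_particles=12, n_iters=30,
               w=0.7, c1=1.5, c2=1.5, v_max=4.0, seed=0):
    """Sigmoid-based binary PSO (an assumed variant, not the paper's DPSO).

    Returns the best 0/1 feature mask found and its fitness value.
    """
    rng = random.Random(seed)
    # Positions are bit masks; velocities are real-valued.
    pos = [[rng.randint(0, 1) for _ in range(n_features)]
           for _ in range(n_particles)]
    vel = [[0.0] * n_features for _ in range(n_particles)]

    # Personal bests start at the initial positions.
    pbest = [p[:] for p in pos]
    pbest_fit = [fitness(p) for p in pos]
    best = max(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[best][:], pbest_fit[best]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(n_features):
                r1, r2 = rng.random(), rng.random()
                v = (w * vel[i][d]
                     + c1 * r1 * (pbest[i][d] - pos[i][d])
                     + c2 * r2 * (gbest[d] - pos[i][d]))
                # Clamp the velocity so the sigmoid does not saturate.
                v = max(-v_max, min(v_max, v))
                vel[i][d] = v
                # The sigmoid of the velocity is the probability of a 1 bit.
                prob_one = 1.0 / (1.0 + math.exp(-v))
                pos[i][d] = 1 if rng.random() < prob_one else 0
            fit = fitness(pos[i])
            if fit > pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], fit
                if fit > gbest_fit:
                    gbest, gbest_fit = pos[i][:], fit
    return gbest, gbest_fit


mask, score = binary_pso(toy_fitness, n_features=6)
selected = [i for i, bit in enumerate(mask) if bit]
print("selected feature indices:", selected, "fitness:", score)
```

In a fuller pipeline, the mask would index the columns kept by the RST reduct, and the fitness call would train and validate the classifier on those columns. The per-feature penalty is one possible way to favor smaller subsets and is likewise an assumption.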
Published in:
Data Mining and Optimization, 2009. DMO '09. 2nd Conference on
Date of Conference: 27-28 Oct. 2009