Application of random forest data mining method to the feature selection for female sub-health state

11 Author(s)
Li-min Wang ; Beijing Univ. of Chinese Med., Beijing, China ; Jia-xu Chen ; Min Fan ; Xin Zhao

BACKGROUND: Sub-health state is a low-quality status between health and disease. The aim of this study was to determine which factors, or combinations of factors, are predictive of sub-health state in women, using the random forest method. METHODS: Data were collected through a clinical epidemiology survey, yielding 2992 cases (2507 in sub-health state and 485 in health), of which 1285 were female sub-health cases and 177 were female health cases. Based on associations defined by mutual information, we used a classification technique called Random Forest to predict sub-health state in women through analysis of the clinical data. RESULTS: We obtained a total out-of-bag (OOB) error rate of 20.06%, that is, a correct classification rate of 79.94%. Ten variables were highly effective at discriminating between health state and sub-health state in women. These were the following symptoms: fatigue, myasthenia of limbs, amnesia, dizziness, dysphoria, sighing, hypochondriac distension and pain, constipation, swollen sore throat, and premenstrual distension of breast. CONCLUSIONS: We suggest the random forest data mining method for feature selection in female sub-health state; its main advantage is that it selects important features while retaining high predictive accuracy.
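
The abstract does not say how mutual information was computed or combined with the random forest. As a rough illustration only, the sketch below shows one common way to score the association between a binary symptom and the health/sub-health label using mutual information, and to rank symptoms by that score. The function name `mutual_information`, the symptom columns, and all data values are hypothetical and are not taken from the study.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Mutual information I(X;Y), in bits, between two discrete sequences."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    n = len(xs)
    if n == 0:
        raise ValueError("sequences must be non-empty")
    joint = Counter(zip(xs, ys))  # counts of each (x, y) pair
    px = Counter(xs)              # marginal counts of x
    py = Counter(ys)              # marginal counts of y
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x, y) * log2( p(x, y) / (p(x) * p(y)) ), written with raw counts
        mi += (c / n) * log2(c * n / (px[x] * py[y]))
    return mi


# Hypothetical toy data, NOT from the study: 1 = sub-health, 0 = health.
label = [1, 1, 1, 1, 0, 0, 0, 0]
symptoms = {
    "fatigue": [1, 1, 1, 1, 0, 0, 0, 0],  # tracks the label exactly
    "noise": [1, 0, 1, 0, 1, 0, 1, 0],    # statistically independent of the label
}

# Score each symptom against the label, then rank from most to least informative.
scores = {name: mutual_information(col, label) for name, col in symptoms.items()}
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked, scores)
```

In this toy setup, a symptom that carries more information about the label ranks higher. A ranking of this kind could serve as a screening step before fitting a classifier, although whether the authors used it that way is an assumption.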

Published in:

Bioinformatics and Biomedicine Workshops (BIBMW), 2010 IEEE International Conference on

Date of Conference:

18 Dec. 2010