Abstract:
In the medical domain, data features often contain missing values, which can introduce serious bias into predictive modeling. Standard data mining methods often perform poorly on such data. In this paper, we propose a new method that simultaneously classifies large datasets and reduces the effects of missing values. The proposed method is based on a multilevel framework that combines the cost-sensitive SVM with the expectation-maximization imputation method for missing values, which relies on iterated regression analyses. We compare the classification results of multilevel SVM-based algorithms on public benchmark datasets with imbalanced classes and missing values, as well as on real data from health applications, and show that our multilevel SVM-based method is fast and produces more accurate and robust classification results.
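To make the phrase "iterated regression analyses" concrete, the following is a minimal sketch of regression-based imputation on a single incomplete variable. It is an illustration under simplifying assumptions, not the authors' algorithm: the function names `fit_line` and `impute_by_iterated_regression` are hypothetical, only one feature has missing entries, the variance correction of a full expectation-maximization procedure is omitted, and the multilevel cost-sensitive SVM stage is not shown.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = a + b * x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    return mean_y - b * mean_x, b


def impute_by_iterated_regression(xs, ys, max_iter=200, tol=1e-10):
    """Fill the None entries of ys by repeatedly regressing ys on xs.

    Hypothetical helper for illustration: xs is assumed complete and
    only ys may contain missing values (marked as None).
    """
    missing = [i for i, y in enumerate(ys) if y is None]
    observed = [y for y in ys if y is not None]

    # Start from mean imputation of the missing entries.
    start = sum(observed) / len(observed)
    filled = [start if y is None else y for y in ys]

    for _ in range(max_iter):
        # Refit the regression on the currently completed data.
        a, b = fit_line(xs, filled)

        # Replace each missing entry with its regression prediction.
        change = 0.0
        for i in missing:
            new_value = a + b * xs[i]
            change = max(change, abs(new_value - filled[i]))
            filled[i] = new_value

        # Stop once the imputed values no longer move.
        if change < tol:
            break
    return filled
```

When the observed points lie exactly on a line, the imputed entries converge to the values on that line, and observed entries are never modified. In practice the completed data would then be passed to a classifier; the paper uses a multilevel cost-sensitive SVM for that step.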
Date of Conference: 06-09 July 2015
Date Added to IEEE Xplore: 17 September 2015
Conference Location: Washington, DC, USA
IEEE Keywords (Index Terms):
- Missing Values
- Class Imbalance
- Prediction Model
- Standard Method
- Performance Measures
- Support Vector Machine
- Characteristics Of Data
- Expectation Maximization
- Multilevel Framework
- Public Benchmark Datasets
- Linear Regression
- Logistic Regression
- Machine Learning
- Parameter Estimates
- Maximum Likelihood Estimation
- Classification Problem
- Medical Data
- Large-scale Data
- Kernel Function
- Number Of Data Points
- Imbalanced Data
- Missing Features
- Sequential Minimal Optimization
- Missing At Random
- Positive Class
- Missing Not At Random
- Uniform Design
- Financial Risk
- Standard Support Vector Machine
- Financial Data