Abstract:
Data uncertainty is common in real-world applications due to various causes, including imprecise measurement, network latency, outdated sources and sampling errors. These kinds of uncertainty must be handled carefully, or the mining results may be unreliable or even wrong. In this paper, we propose a new rule-based classification and prediction algorithm called uRule for classifying uncertain data. This algorithm introduces new measures for generating, pruning and optimizing rules. These measures are computed from the interval and probability distribution function of each uncertain data value. Based on the new measures, the optimal splitting attribute and splitting value can be identified and used for classification and prediction. The proposed uRule algorithm can process uncertainty in both numerical and categorical data. Our experimental results show that uRule has excellent performance even when data is highly uncertain.
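As a rough illustration of what computing a measure from an uncertain interval and a probability distribution can look like, the sketch below counts how much of each instance falls on one side of a candidate split point. It is not the uRule measure itself, which the abstract does not spell out. The uniform distribution over each interval and the names `UncertainValue`, `prob_at_most` and `probabilistic_counts` are assumptions made for this example only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UncertainValue:
    """Hypothetical container: a numerical value known only as [low, high]."""
    low: float
    high: float
    label: str


def prob_at_most(value: UncertainValue, split: float) -> float:
    """P(X <= split), assuming X is uniform on [low, high]."""
    if value.high <= value.low:
        # Degenerate interval: treat it as a certain point value.
        return 1.0 if value.low <= split else 0.0
    frac = (split - value.low) / (value.high - value.low)
    return min(1.0, max(0.0, frac))


def probabilistic_counts(values, split: float) -> dict:
    """Expected number of instances per class lying at or below the split."""
    counts = {}
    for v in values:
        counts[v.label] = counts.get(v.label, 0.0) + prob_at_most(v, split)
    return counts


data = [
    UncertainValue(0.0, 10.0, "a"),
    UncertainValue(5.0, 15.0, "b"),
    UncertainValue(20.0, 30.0, "a"),
]
counts = probabilistic_counts(data, split=10.0)
```

Fractional counts of this kind could then stand in for integer class counts when scoring a candidate split; how uRule actually scores, prunes and optimizes rules is described in the paper rather than in this abstract.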
Date of Conference: 29 March 2009 - 02 April 2009
Date Added to IEEE Xplore: 10 April 2009
IEEE Keywords (Index Terms):
- Classification Algorithms
- Algorithm For Data
- Uncertain Data
- Rule-based Algorithm
- Rule-based Classification
- Rule-based Classification Algorithm
- Categorical Data
- Numerical Data
- Probability Distribution Function
- Network Latency
- Uniform Distribution
- Support Vector Machine
- Artificial Neural Network
- Validation Set
- Decision Tree
- Data Mining
- Local Services
- Annual Income
- Set Of Rules
- Numerous Properties
- Split Point
- Categorical Attributes
- Probability Vector
- Test Instances
- Uncertain Model
- Classifier Construction
- Traditional Rules
- Class Probabilities
- Frequent Class