Large high dimensional data handling using data reduction | IEEE Conference Publication | IEEE Xplore

Abstract:

This paper presents recent problems in computer-based applications such as medical diagnosis and prediction, weather prediction, climate prediction, text document classification, and finance and market data prediction. As data availability increases, datasets grow both in the number of samples (instances) and in the number of attributes; such datasets are called large and high dimensional, respectively. In most applications the data grow in both respects and are called Large High Dimensional data. Data mining methods used for data analysis, classification, clustering and prediction degrade in performance on this type of data. Data reduction therefore plays an important role as a pre-processing step for handling such datasets while preserving or improving performance. Data reduction reduces samples through instance reduction (horizontal reduction) and attributes through attribute/feature reduction (vertical reduction). The choice of data reduction method depends on the data mining method being used. In this paper, classification is considered together with data reduction. To demonstrate the importance of data reduction, a weighted k-nearest neighbor classifier is used. Uniform random sampling is used for data reduction in both directions, instances and attributes, and the results show that after data reduction accuracy is preserved or increased in most cases and execution time decreases in all cases.
Date of Conference: 03-05 March 2016
Date Added to IEEE Xplore: 24 November 2016
Conference Location: Chennai, India
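
The pipeline outlined in the abstract (uniform random sampling of instances and attributes, followed by a weighted k-nearest neighbor classifier) can be illustrated with a minimal sketch. The function names `reduce_data` and `weighted_knn_predict`, the sampling fractions, the Euclidean distance, and the inverse-distance weighting are all assumptions made for illustration; the abstract does not state which weighting scheme or parameters the authors used.

```python
import math
import random
from collections import defaultdict


def reduce_data(X, y, row_frac, col_frac, seed=0):
    """Uniform random sampling of instances (rows) and attributes (columns).

    row_frac and col_frac are hypothetical parameters: the fraction of
    rows and columns to keep. Returns the reduced data, the reduced
    labels, and the kept column indices (needed to project queries).
    """
    rng = random.Random(seed)
    n_rows, n_cols = len(X), len(X[0])
    keep_rows = sorted(rng.sample(range(n_rows), max(1, round(row_frac * n_rows))))
    keep_cols = sorted(rng.sample(range(n_cols), max(1, round(col_frac * n_cols))))
    X_red = [[X[i][j] for j in keep_cols] for i in keep_rows]
    y_red = [y[i] for i in keep_rows]
    return X_red, y_red, keep_cols


def weighted_knn_predict(X_train, y_train, query, k=3, eps=1e-9):
    """Distance-weighted k-NN (assumed scheme: each neighbor votes 1/(d + eps))."""
    neighbors = sorted(
        ((math.dist(row, query), label) for row, label in zip(X_train, y_train)),
        key=lambda pair: pair[0],
    )
    votes = defaultdict(float)
    for dist, label in neighbors[:k]:
        votes[label] += 1.0 / (dist + eps)
    return max(votes, key=votes.get)


# Toy data: class "a" lies near 0 and class "b" near 1 in every attribute.
X = [
    [0.0, 0.1, 0.0, 0.1],
    [0.1, 0.0, 0.1, 0.0],
    [0.0, 0.0, 0.1, 0.1],
    [1.0, 0.9, 1.0, 0.9],
    [0.9, 1.0, 0.9, 1.0],
    [1.0, 1.0, 0.9, 0.9],
]
y = ["a", "a", "a", "b", "b", "b"]

# Vertical reduction only: keep every instance but half of the attributes.
X_red, y_red, cols = reduce_data(X, y, row_frac=1.0, col_frac=0.5, seed=42)
query = [0.05, 0.05, 0.05, 0.05]
projected = [query[j] for j in cols]  # apply the same attribute subset
prediction = weighted_knn_predict(X_red, y_red, projected, k=3)
```

A query must be projected onto the same attribute subset that was kept during reduction, which is why `reduce_data` returns the column indices. Because sampling is uniform and ignores class labels, an aggressive `row_frac` can under-represent a class on small datasets.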

I. Introduction

In many applications, data is growing rapidly day by day, and systems hold massive data collections. Finding important information in these massive collections requires data mining methods and techniques, so knowledge extraction has become an important task. Many current applications, such as health care and medical analysis [1] [11], weather and climate prediction, network intrusion detection [3], automatic text data analysis and categorization [4], finance and market prediction, and multimedia data analysis, involve large amounts of data. Manually analyzing and extracting information from this much data quickly and accurately is not an easy task for a human. Therefore, most people and organizations depend on data mining techniques. The next section gives an overview of data mining methods such as clustering, classification and prediction [1] and their behavior, and discusses the k-Nearest Neighbor classifier.
