Traditional clustering methods were developed to analyse complete data sets. Faults during data collection, data transfer, or data cleaning often lead to missing values, so that common clustering methods cannot be used for the data analysis. In these cases, clustering methods that can handle missing values are of great use. In this paper, we discuss different approaches proposed in the literature for adapting partitioning clustering algorithms to deal with missing values. We analyse them on two appropriate data sets and compare them with each other. We give particular attention to how the accuracy of these methods depends on the missing-data mechanism and on the percentage of missing values in the data sets.
Date of Conference: 5-8 July 2010