Variance Based Moving K-Means Algorithm | IEEE Conference Publication | IEEE Xplore

Variance Based Moving K-Means Algorithm


Abstract:

Clustering is a useful data exploration method with wide applicability across many fields. However, clustering results depend heavily on the initialization of cluster centers: a poor initialization can produce large intra-cluster variance and dead centers, leading to sub-optimal solutions. This paper proposes a novel variance-based version of the conventional Moving K-Means (MKM) algorithm, called Variance Based Moving K-Means (VMKM), that can partition data into optimal homogeneous clusters irrespective of cluster initialization. The algorithm uses a novel distance metric and a unique data element selection criterion to transfer the selected elements between clusters, achieving low intra-cluster variance and thereby avoiding dead centers. Quantitative and qualitative comparisons with various clustering techniques are performed on four datasets drawn from image processing, bioinformatics, remote sensing and the stock market, respectively. An extensive analysis highlights the superior performance of the proposed method over the other techniques.
Date of Conference: 05-07 January 2017
Date Added to IEEE Xplore: 13 July 2017
ISBN Information:
Electronic ISSN: 2473-3571
Conference Location: Hyderabad, India

I. Introduction

Clustering aims to group unlabeled data elements with high similarity into clusters, using a similarity measure derived solely from the data. Clustering methods have been widely used in investigative areas such as face detection [10], bioinformatics [9], [1], [14], and market analysis [2]. Clustering has been used extensively to detect faces through skin extraction [5], while bioinformatics researchers have used cluster analysis to build groups of genes with related patterns and to develop homologous sequences of genes [14]. Furthermore, market researchers have used clustering techniques to segment multivariate survey data in order to better understand the relationships between different groups of consumers [2].
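The sensitivity to initialization described in the abstract can be illustrated with a small sketch. The code below is not the proposed VMKM algorithm, whose distance metric and selection criterion are not given in this excerpt; it is a plain Lloyd-style k-means on hypothetical one-dimensional data, written only to show how a poorly placed initial center can become a dead center and inflate intra-cluster variance. The function names, the sample points and the use of the within-cluster sum of squares as the variance measure are all assumptions made for illustration.

```python
def assign(points, centers):
    """Assign each point to its nearest center; return one member list per center."""
    clusters = [[] for _ in centers]
    for p in points:
        nearest = min(range(len(centers)), key=lambda j: (p - centers[j]) ** 2)
        clusters[nearest].append(p)
    return clusters


def kmeans_1d(points, centers, iterations=20):
    """Plain Lloyd-style k-means on 1-D data (illustrative, not VMKM).

    A cluster that receives no points keeps its previous center,
    which is how a dead center persists across iterations.
    """
    centers = list(centers)
    for _ in range(iterations):
        clusters = assign(points, centers)
        new_centers = [
            sum(c) / len(c) if c else centers[j] for j, c in enumerate(clusters)
        ]
        if new_centers == centers:  # converged: no center moved
            break
        centers = new_centers
    return centers, assign(points, centers)


def intra_cluster_variance(clusters):
    """Within-cluster sum of squared deviations (assumed proxy for variance)."""
    total = 0.0
    for c in clusters:
        if c:
            mean = sum(c) / len(c)
            total += sum((p - mean) ** 2 for p in c)
    return total


# Hypothetical data: three well-separated groups of three points each.
points = [1.0, 2.0, 3.0, 11.0, 12.0, 13.0, 21.0, 22.0, 23.0]

good_centers, good_clusters = kmeans_1d(points, [2.0, 12.0, 22.0])
bad_centers, bad_clusters = kmeans_1d(points, [0.0, 10.0, 100.0])

dead = [j for j, c in enumerate(bad_clusters) if not c]
print("good init:", good_centers, intra_cluster_variance(good_clusters))
print("bad init: ", bad_centers, intra_cluster_variance(bad_clusters), "dead:", dead)
```

With the well-placed initialization each group gets its own center and the within-cluster sum of squares is 6.0. With the third center initialized at 100.0, no point is ever assigned to it, two groups are merged under a single center at 17.0, and the sum of squares rises to 156.0. This is the failure mode that moving-type k-means variants aim to avoid by transferring elements between clusters.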

