This paper proposes a novel similarity measure based on spatial overlapping relations, which calculates the similarity between a pair of data points from the mutual overlapping relation between them in a multi-dimensional space. A spatial-overlapping-based hierarchical clustering method, SOHC, was also developed and implemented to evaluate the effectiveness of the proposed similarity measure. SOHC works well on both low-dimensional and high-dimensional datasets and is able to find clusters of arbitrary shape. Moreover, it handles numerical and categorical attributes in a uniform way. Experiments on public datasets from the UCI Machine Learning Repository and from the predictive toxicology domain show that SOHC is a promising clustering method for data mining.
Published in: Fifth International Conference on Fuzzy Systems and Knowledge Discovery, 2008 (FSKD '08), Volume 2
Date of Conference: 18-20 Oct. 2008