Clustering is the process of partitioning a set of patterns into disjoint, homogeneous, and meaningful groups (clusters). The resulting clusters are related to one another by varying degrees of similarity and by hierarchical relationships. When the number of initial clusters is large, users have difficulty interpreting and describing the results and the hierarchies among them. It is therefore valuable to analyze these similarities and to construct a hierarchy over the cluster results based on them. Statistical clustering methods, grid-based and density-based clustering methods, and model-based clustering algorithms are ill-suited to this post-processing problem. The problem becomes more intricate in a data stream environment, where the stream can be scanned only once and clustering must be incremental. Based on multifractal theory, the Fractal-based Cluster Hierarchy Optimization (FCHO) algorithm is proposed, which integrates cluster similarity with cluster shape and cluster distribution to construct a cluster hierarchy tree from the disjoint initial clusters. The proposed algorithm is easy to implement, simple to understand, and self-adaptive in its parameters. A preliminary analysis of its time and space complexity is presented, and experimental results on synthetic and real-life data sets demonstrate the performance and effectiveness of the FCHO algorithm.
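The abstract does not specify how FCHO measures similarity between clusters. As a rough illustration of the general idea behind fractal-based merging, the sketch below estimates a plain box-counting dimension and scores a candidate merge by how much the dimension of the union differs from that of each part. The function names, the cell sizes, and the `merge_cost` measure are all assumptions made for illustration; this is a monofractal stand-in, not the multifractal measure used by FCHO.

```python
import math
import random

def box_count(points, cell):
    """Count the grid cells of side `cell` that contain at least one point."""
    return len({tuple(math.floor(c / cell) for c in p) for p in points})

def box_dimension(points, cells=(0.25, 0.125, 0.0625)):
    """Estimate the box-counting dimension as the least-squares slope of
    log N(r) against log(1/r), where N(r) is the occupied-cell count."""
    xs = [math.log(1.0 / r) for r in cells]
    ys = [math.log(box_count(points, r)) for r in cells]
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den

def merge_cost(a, b):
    """Hypothetical dissimilarity (an assumption, not FCHO's measure):
    how far the dimension of the union drifts from that of each part."""
    d_union = box_dimension(a + b)
    return abs(d_union - box_dimension(a)) + abs(d_union - box_dimension(b))

# Synthetic clusters in the unit square: two samples of the same diagonal
# segment, and one sample that fills the square uniformly.
rng = random.Random(0)
line_a = [(t, t) for t in (rng.random() for _ in range(2000))]
line_b = [(t, t) for t in (rng.random() for _ in range(2000))]
square = [(rng.random(), rng.random()) for _ in range(2000)]

print(box_dimension(line_a), box_dimension(square))
print(merge_cost(line_a, line_b), merge_cost(line_a, square))
```

Under this assumed measure, the two diagonal samples merge at lower cost than a diagonal sample and the filled square, because their union keeps the same shape. A hierarchy could then be built bottom-up by repeatedly merging the pair of clusters with the lowest cost.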
Published in:
Fuzzy Systems and Knowledge Discovery, 2007. FSKD 2007. Fourth International Conference on (Volume: 2)
Date of Conference: 24-27 Aug. 2007