An algorithm for clustering heterogeneous data streams with uncertainty

4 Author(s)
Guo-Yan Huang ; Coll. of Inf. Sci. & Eng., Yanshan Univ., Qinhuangdao, China ; Da-Peng Liang ; Chang-Zhen Hu ; Jia-Dong Ren

Heterogeneous data streams with uncertainty are ubiquitous in many applications, yet the clustering quality of existing methods for such streams is relatively low. This paper proposes an algorithm for clustering heterogeneous data streams with uncertainty, called HU-Clustering. A Heterogeneous Uncertainty Clustering Feature (H-UCF) is presented to describe the features of heterogeneous data streams with uncertainty. Based on H-UCF, a probability frequency histogram is proposed to track the statistics of categorical attributes. The algorithm initially creates n clusters using the k-prototypes algorithm. To improve clustering quality, HU-Clustering applies a two-phase cluster selection process to each newly arriving tuple. First, a set of candidate clusters is selected using a new similarity measure; second, the most similar cluster for the tuple is selected from the candidate set according to clustering uncertainty. Experimental results show that the clustering quality of HU-Clustering is higher than that of UMicro.
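The abstract does not give the definitions of H-UCF, the similarity measure, or clustering uncertainty, so the following Python sketch is only an illustration of the general idea, not the authors' method. It assumes a hypothetical `MicroCluster` summary that keeps probability-weighted numeric sums, one probability frequency histogram per categorical attribute, and an accumulated error term; the `distance` formula, the `gamma` weight, the `uncertainty` definition, and the `n_candidates` parameter are all assumptions made for this example.

```python
import math
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class MicroCluster:
    """Hypothetical cluster summary (an assumption, not the paper's exact H-UCF)."""
    n: float = 0.0                                # total existence probability absorbed
    num_sum: list = field(default_factory=list)   # probability-weighted sums of numeric attributes
    hist: list = field(default_factory=list)      # per categorical attribute: {value: probability mass}
    err_sum: float = 0.0                          # probability-weighted sum of tuple errors

    def add(self, numeric, categorical, prob=1.0, error=0.0):
        """Absorb one uncertain tuple into the summary."""
        if not self.num_sum:
            self.num_sum = [0.0] * len(numeric)
            self.hist = [defaultdict(float) for _ in categorical]
        for i, x in enumerate(numeric):
            self.num_sum[i] += prob * x
        for j, value in enumerate(categorical):
            self.hist[j][value] += prob           # probability frequency histogram update
        self.n += prob
        self.err_sum += prob * error

    def centroid(self):
        # Assumes the cluster was seeded with at least one tuple (n > 0).
        return [s / self.n for s in self.num_sum]

    def uncertainty(self):
        """Assumed definition: mean error of the absorbed tuples."""
        return self.err_sum / self.n

    def distance(self, numeric, categorical, gamma=1.0):
        """Assumed mixed dissimilarity: Euclidean part plus histogram-based mismatch."""
        num = math.sqrt(sum((x - c) ** 2 for x, c in zip(numeric, self.centroid())))
        cat = sum(1.0 - self.hist[j].get(v, 0.0) / self.n
                  for j, v in enumerate(categorical))
        return num + gamma * cat


def assign(clusters, numeric, categorical, prob=1.0, error=0.0, n_candidates=2):
    """Two-phase selection sketch for one newly arriving tuple."""
    # Phase 1: keep the n_candidates clusters most similar to the tuple.
    ranked = sorted(clusters, key=lambda c: c.distance(numeric, categorical))
    candidates = ranked[:n_candidates]
    # Phase 2: among the candidates, pick the one with the lowest uncertainty.
    best = min(candidates, key=lambda c: c.uncertainty())
    best.add(numeric, categorical, prob, error)
    return best
```

In this sketch the initial clusters would be seeded beforehand (the paper uses k-prototypes for that step), and `assign` would then be called once per arriving tuple. Keeping only running sums and histograms means each update costs time proportional to the number of attributes, which is the usual motivation for summary structures in stream clustering.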

Published in:

Machine Learning and Cybernetics (ICMLC), 2010 International Conference on  (Volume:4 )

Date of Conference:

11-14 July 2010