Abstract:
A fundamental assumption often made in unsupervised learning is that the problem is static, i.e., the description of the classes does not change with time. However, many practical clustering tasks involve changing environments. Methods and techniques for analyzing evolving trends in changing environments are therefore of increasing interest and importance. Although the problem of clustering numerical time-evolving data is well explored, the problem of clustering categorical time-evolving data remains a challenging issue. In this paper, we propose a generalized clustering framework for categorical time-evolving data, which is composed of three algorithms: a drifting-concept detecting algorithm that detects the difference between the current sliding window and the last sliding window, a data-labeling algorithm that decides the most appropriate cluster label for each object of the current sliding window based on the clustering results of the last sliding window, and a cluster-relationship-analysis algorithm that analyzes the relationship between clustering results at different time stamps. The time-complexity analysis indicates that the proposed algorithms are effective for large datasets. Experiments on a real dataset show that the proposed framework not only accurately detects the drifting concepts but also attains clustering results of better quality. Furthermore, compared with the other framework, the proposed one needs fewer parameters, which is favorable for specific applications.
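The interplay between data labeling and drifting-concept detection described in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm: the frequency-based similarity, the function names, and the two thresholds (`min_sim`, `max_outlier_ratio`) are all assumptions chosen for illustration. Each object in the current window is labeled with the most similar cluster from the last window, and a drift is flagged when too many objects fit no cluster.

```python
from collections import Counter


def cluster_profile(objects):
    """Per-attribute relative value frequencies of one cluster.

    `objects` is a non-empty list of equal-length tuples of categorical values.
    (Hypothetical cluster representative, not the one defined in the paper.)
    """
    n = len(objects)
    n_attrs = len(objects[0])
    return [
        {v: c / n for v, c in Counter(obj[a] for obj in objects).items()}
        for a in range(n_attrs)
    ]


def similarity(obj, profile):
    """Mean frequency of the object's attribute values within the cluster."""
    return sum(profile[a].get(v, 0.0) for a, v in enumerate(obj)) / len(obj)


def label_window(window, profiles, min_sim=0.5):
    """Assign each object the index of its most similar cluster.

    Objects whose best similarity is below `min_sim` (an assumed threshold)
    are labeled None, i.e. treated as outliers.
    """
    labels = []
    for obj in window:
        scores = [similarity(obj, p) for p in profiles]
        best = max(range(len(scores)), key=scores.__getitem__)
        labels.append(best if scores[best] >= min_sim else None)
    return labels


def drift_detected(labels, max_outlier_ratio=0.3):
    """Flag a drift when the outlier share exceeds an assumed threshold."""
    if not labels:
        return False
    outliers = sum(1 for label in labels if label is None)
    return outliers / len(labels) > max_outlier_ratio


# Clusters obtained from the last sliding window (toy data).
last_clusters = [
    [("a", "x"), ("a", "x"), ("a", "y")],
    [("b", "z"), ("b", "z")],
]
profiles = [cluster_profile(c) for c in last_clusters]

stable_labels = label_window([("a", "x"), ("b", "z")], profiles)
drifted_labels = label_window([("c", "w"), ("c", "w"), ("a", "x")], profiles)
```

With this toy data, the first window is fully explained by the old clusters, while the second contains mostly unseen values and would trigger reclustering in a framework of this kind.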
Published in: IEEE Transactions on Fuzzy Systems ( Volume: 18, Issue: 5, October 2010)
- Index Terms
- Categorical Data
- Categorical Data Clustering
- Time-evolving Data
- Time Complexity
- Clustering Results
- Problem Of Data
- Fewer Parameters
- Time Stamp
- Description Of Classes
- Time Complexity Analysis
- Clustering Algorithm
- Trend Analysis
- Class Labels
- Precision And Recall
- Objective Data
- Data Streams
- Membership Function
- Number Of Objects
- Fuzzy Set
- Outlier Detection
- Cluster Representatives
- Sliding Window Technique
- Runtime Complexity
- Transmission Control Protocol
- Propagation Clustering
- Foregoing Discussion
- Upper Estimate