Clustering is particularly useful in problems where there is little prior information about the data under analysis. This is usually the case when attempting to evaluate a software system's maintainability, as many dimensions must be taken into account in order to reach a conclusion. However, partitional clustering algorithms are sensitive to noise and to the initial partitioning. In this paper we propose a novel partitional clustering algorithm, k-Attractors. It employs maximal frequent itemset discovery and partitioning to define the number of desired clusters and the initial cluster attractors. It then uses a similarity measure adapted to the way the initial attractors are determined. We apply the k-Attractors algorithm to two custom industrial systems and compare it with WEKA's implementation of K-Means. We present preliminary results showing that our approach performs better in terms of clustering accuracy and speed.
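To illustrate the seeding idea, the following is a minimal sketch of maximal frequent itemset discovery on toy data. It is not the k-Attractors implementation: the function name `maximal_frequent_itemsets`, the brute-force search, the toy transactions, and the support threshold are all assumptions made for illustration. Treating each maximal itemset as one candidate cluster seed is likewise an assumption, since the abstract does not specify how itemsets are mapped to attractors.

```python
from itertools import combinations


def maximal_frequent_itemsets(transactions, min_support):
    """Return the maximal frequent itemsets of a list of item sets.

    An itemset is frequent if at least `min_support` transactions contain it,
    and maximal if no frequent proper superset exists.
    Brute-force sketch, only suitable for small toy inputs.
    """
    items = sorted(set().union(*transactions))
    frequent = []
    for size in range(1, len(items) + 1):
        level = [
            frozenset(candidate)
            for candidate in combinations(items, size)
            if sum(1 for t in transactions if set(candidate) <= t) >= min_support
        ]
        if not level:
            # No frequent itemset of this size, so none of a larger size either.
            break
        frequent.extend(level)
    # Keep only itemsets that have no frequent proper superset.
    return [s for s in frequent if not any(s < other for other in frequent)]


# Hypothetical toy data: each transaction is the set of discretized
# properties observed for one entity.
transactions = [
    {"a", "b", "c"},
    {"a", "b"},
    {"a", "b", "c"},
    {"d", "e"},
    {"d", "e"},
    {"d"},
]

seeds = maximal_frequent_itemsets(transactions, min_support=2)
# Assumption for illustration: one candidate cluster per maximal itemset.
print(len(seeds), [sorted(s) for s in seeds])
```

On this toy input the two maximal frequent itemsets are {a, b, c} and {d, e}, so under the stated assumption the sketch would suggest two initial clusters without the number being supplied by the user.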
Published in: Tools with Artificial Intelligence, 2007. ICTAI 2007. 19th IEEE International Conference on (Volume: 1)
Date of Conference: 29-31 Oct. 2007