Advances in computing technology have enabled the collection of large amounts of complex data in many fields, and the volume of commercial and scientific data has grown enormously. Many such datasets consist of sequence data, which are inherently ordered. In this paper, we study how to cluster these sequence datasets. We propose an extended similarity measure and an effective hierarchical clustering algorithm. Using a splice dataset, we show that the clusters generated by our approach are of higher quality than those produced by traditional clustering algorithms.