With advances in speech recognition, speech data mining has become a hot topic in data mining and natural language processing. In this paper, we present a novel clustering algorithm for mining semantic information from speech sequences and for understanding the developing trend of the events they imply. First, the speech sequences are extracted into a Bayesian network that represents the relationships between different speech elements. Then, we use a 3-dimensional space and sequence clustering techniques to uncover the information implied in the speech. Taking the features of speech data into account, we improve the traditional distance-based clustering algorithm to obtain semantic information and enhance performance. The experimental results show that our algorithm is correct and effective.