Several studies have shown that background knowledge of a domain can improve the results of clustering algorithms. In this paper, we illustrate how background knowledge of the medical domain can be used in the clustering process to predict the likelihood of diseases. To find the likelihood of a disease, clustering must be based on the anticipated likelihood attributes together with the core attributes of the disease in each data point. For this purpose, we propose a constraint k-Means-Mode clustering algorithm. Because the attributes of medical data are both continuous and categorical, the algorithm handles both continuous and discrete data. We demonstrate its effectiveness by testing it on a real-world patient data set.
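To make the mixed-data idea concrete, the following is a minimal sketch of clustering over records that carry both continuous and categorical attributes. It is not the constraint k-Means-Mode algorithm itself, whose constraint step on anticipated likelihood attributes is not specified here. It assumes a k-prototypes-style dissimilarity (squared Euclidean distance on continuous attributes plus a weighted count of categorical mismatches) and a center made of per-attribute means and modes. The names `mixed_distance`, `update_center`, `cluster`, and the weight `gamma` are hypothetical.

```python
from collections import Counter


def mixed_distance(point, center, gamma=1.0):
    """Squared Euclidean distance on the continuous part plus
    gamma times the number of categorical mismatches (assumed measure)."""
    (p_num, p_cat), (c_num, c_cat) = point, center
    numeric = sum((a - b) ** 2 for a, b in zip(p_num, c_num))
    mismatches = sum(a != b for a, b in zip(p_cat, c_cat))
    return numeric + gamma * mismatches


def update_center(members):
    """Mean of each continuous attribute, mode of each categorical one."""
    nums = [m[0] for m in members]
    cats = [m[1] for m in members]
    mean = tuple(sum(col) / len(col) for col in zip(*nums))
    mode = tuple(Counter(col).most_common(1)[0][0] for col in zip(*cats))
    return (mean, mode)


def cluster(points, centers, gamma=1.0, iterations=10):
    """Alternate assignment and center updates for a fixed number of rounds."""
    labels = []
    for _ in range(iterations):
        # Assign each point to the nearest center under the mixed distance.
        labels = [
            min(range(len(centers)),
                key=lambda j: mixed_distance(p, centers[j], gamma))
            for p in points
        ]
        # Recompute each center; keep the old one if a cluster is empty.
        new_centers = []
        for j in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == j]
            new_centers.append(update_center(members) if members else centers[j])
        if new_centers == centers:
            break
        centers = new_centers
    return labels, centers


# Made-up records for illustration only: (continuous attrs, categorical attrs).
points = [
    ((1.0, 1.2), ("yes", "high")),
    ((0.9, 1.0), ("yes", "high")),
    ((5.0, 5.1), ("no", "low")),
    ((5.2, 4.9), ("no", "low")),
]
labels, centers = cluster(points, centers=[points[0], points[2]])
```

In practice the continuous attributes would likely need scaling before being combined with the mismatch count, and `gamma` controls how strongly the categorical attributes influence the assignment.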
Published in:
Digital Information Management (ICDIM), 2010 Fifth International Conference on
Date of Conference: 5-8 July 2010