This paper presents a new clustering algorithm for analyzing unordered discrete-valued data. The algorithm consists of a cluster initiation phase and a sample regrouping phase. The first phase is based on a data-directed valley detection process that uses the optimal second-order product approximation of a high-order discrete probability distribution, together with a distance measure for discrete-valued data. The second phase iteratively applies the Bayes' decision rule based on subgroup discrete distributions. Because probability is its major decision criterion, the proposed method reduces the sensitivity of its solutions to the arbitrary choice of distance measure. The performance of the proposed algorithm is evaluated on four sets of simulated data and one set of clinical data. For comparison, the decision-directed algorithm is applied to the same data sets. These experiments demonstrate the validity and operational feasibility of the proposed algorithm and its superiority over the decision-directed algorithm.
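The sample regrouping phase can be illustrated with a minimal sketch. It is not the authors' implementation: for brevity it assumes the features are independent within each subgroup (a first-order product approximation) rather than the second-order product approximation named in the abstract, and it takes the initial labels as given instead of deriving them by valley detection. The function names, the Laplace smoothing constant `alpha`, and the integer coding of feature values as `0..n_values-1` are all illustrative assumptions.

```python
import math
from collections import Counter


def fit_group(samples, n_values, alpha=1.0):
    """Estimate smoothed per-feature value probabilities for one subgroup.

    Assumption: features are treated as independent within the subgroup,
    which is simpler than the second-order approximation in the abstract.
    """
    n_features = len(samples[0])
    total = len(samples) + alpha * n_values
    tables = []
    for j in range(n_features):
        counts = Counter(s[j] for s in samples)
        tables.append([(counts.get(v, 0) + alpha) / total for v in range(n_values)])
    return tables


def log_posterior(x, tables, prior):
    """Unnormalized log posterior: log prior plus summed log likelihoods."""
    return math.log(prior) + sum(math.log(tables[j][v]) for j, v in enumerate(x))


def regroup(data, labels, n_values, max_iter=50):
    """Reassign samples by the Bayes decision rule until labels stop changing."""
    labels = list(labels)
    for _ in range(max_iter):
        groups = sorted(set(labels))
        models = {}
        for g in groups:
            members = [x for x, lab in zip(data, labels) if lab == g]
            models[g] = (fit_group(members, n_values), len(members) / len(data))
        # Bayes decision rule: pick the subgroup with the largest posterior.
        new_labels = [
            max(groups, key=lambda g: log_posterior(x, *models[g])) for x in data
        ]
        if new_labels == labels:
            break
        labels = new_labels
    return labels


# Hypothetical toy data: two obvious groups, with two samples mislabeled at start.
data = [(0, 0, 0), (0, 0, 1), (0, 1, 0), (0, 0, 0),
        (2, 2, 2), (2, 2, 1), (2, 1, 2), (2, 2, 2)]
initial = [0, 0, 0, 1, 1, 1, 1, 0]
final = regroup(data, initial, n_values=3)
```

On this toy input the two mislabeled samples move to the subgroup whose estimated distribution gives them the higher posterior, and the labels then stop changing. Because each assignment compares probabilities rather than distances, no distance measure enters this phase.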