This paper proposes an unsupervised competitive neural network for efficient clustering of Gaussian probability density function (GPDF) data of continuous density hidden Markov models (CDHMMs). The proposed network, called the divergence-based centroid neural network (DCNN), employs the divergence measure as its distance measure and utilizes the statistical characteristics of the observation densities in the HMM for speech recognition problems. While conventional clustering algorithms used for vector quantization (VQ) codebook design utilize only the mean values of the observation densities in the HMM, the proposed DCNN utilizes both the mean and the covariance values. Compared with other conventional unsupervised neural networks, the DCNN allocates more code vectors to regions where GPDF data are densely distributed and fewer code vectors to regions where GPDF data are sparsely distributed. When applied to Korean monophone recognition problems as a tool to reduce the size of the codebook, the DCNN reduced the number of GPDFs used for code vectors by 65.3% while preserving recognition accuracy. Experimental results with a divergence-based k-means algorithm and a divergence-based self-organizing map algorithm are also presented for performance comparison.
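
To illustrate how a divergence measure can use both the mean and the covariance of a GPDF, the following minimal sketch computes a symmetric Kullback-Leibler divergence between two Gaussians and assigns a GPDF to its nearest code vector. The abstract does not state the exact divergence formula, so the symmetric form and the diagonal-covariance restriction are assumptions made here for illustration, and the names `gaussian_divergence` and `nearest_code_vector` are hypothetical, not taken from the paper.

```python
def gaussian_divergence(mu1, var1, mu2, var2):
    """Symmetric KL divergence between two diagonal-covariance Gaussians.

    Assumption: each GPDF is given as a list of means and a list of
    per-dimension variances (the diagonal of the covariance matrix).
    """
    total = 0.0
    for m1, v1, m2, v2 in zip(mu1, var1, mu2, var2):
        # Covariance term: zero only when the variances agree.
        total += 0.5 * (v1 / v2 + v2 / v1 - 2.0)
        # Mean term: squared mean difference weighted by both precisions.
        total += 0.5 * (1.0 / v1 + 1.0 / v2) * (m1 - m2) ** 2
    return total


def nearest_code_vector(gpdf, codebook):
    """Return the index of the code vector with the smallest divergence.

    Assumption: `gpdf` and every codebook entry are (means, variances) pairs.
    """
    mu, var = gpdf
    distances = [gaussian_divergence(mu, var, c_mu, c_var)
                 for c_mu, c_var in codebook]
    return min(range(len(codebook)), key=distances.__getitem__)
```

Under these assumptions, two GPDFs with identical means but different variances still have a nonzero divergence, which a distance based on the means alone would miss. For example, `gaussian_divergence([0.0], [1.0], [0.0], [4.0])` is positive even though the means coincide.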