Decision tree state tying performs divisive clustering that combines the phonetics and acoustics of the speech signal for large vocabulary continuous speech recognition. A tree is built by successively splitting the observation frames of a phonetic unit according to the best phonetic questions. To keep the tree models from growing too large, a stopping criterion is required to halt tree growth. Accordingly, it is crucial to have goodness-of-split criteria that choose the best questions for node splitting and test whether splitting should be terminated. In this paper, we apply Hubert's Γ statistic as the node splitting criterion and the T²-statistic as the stopping criterion. Hubert's Γ statistic characterizes the clustering structure in the given data, and this cluster validity criterion is adopted to select the best questions for splitting tree nodes. Further, we examine the population closeness of the two split nodes at a given significance level. The T²-statistic, expressed in terms of an F distribution, is used to verify whether the mean vectors of the two nodes are close together, and splitting is stopped when this closeness is verified. In experiments on Mandarin speech recognition, the proposed methods achieve better syllable recognition rates with smaller tree models than the conventional maximum likelihood and minimum description length criteria.
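As a rough illustration of the two criteria named in the abstract, the Python sketch below scores a candidate two-way split with a normalized Hubert's Γ and computes a two-sample T² statistic together with its F-scaled value. It is a sketch under stated assumptions, not the paper's implementation. The abstract does not give the exact formulas, so the code assumes the common textbook forms: Γ as the correlation between pairwise frame distances and the distances between the centroids of the clusters the frames fall in, and the two-sample T² with a pooled covariance matrix. The function names (`hubert_gamma`, `hotelling_f`) and the list-of-lists frame representation are hypothetical.

```python
from itertools import combinations
from math import dist, sqrt


def mean_vector(rows):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def hubert_gamma(left, right):
    """Normalized Hubert's Gamma for a two-way split (assumed textbook form).

    Correlates each pairwise frame distance with the distance between the
    centroids of the clusters that the two frames were assigned to.
    """
    points = left + right
    labels = [0] * len(left) + [1] * len(right)
    centroids = [mean_vector(left), mean_vector(right)]
    x, y = [], []
    for i, j in combinations(range(len(points)), 2):
        x.append(dist(points[i], points[j]))
        y.append(dist(centroids[labels[i]], centroids[labels[j]]))
    m = len(x)
    mx, my = sum(x) / m, sum(y) / m
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0.0 or sy == 0.0:
        return 0.0  # degenerate split: no structure to measure
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (sx * sy)


def scatter(rows, mean):
    """Sum of outer products of deviations from the mean."""
    p = len(mean)
    s = [[0.0] * p for _ in range(p)]
    for r in rows:
        d = [a - b for a, b in zip(r, mean)]
        for i in range(p):
            for j in range(p):
                s[i][j] += d[i] * d[j]
    return s


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        pivot = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[pivot] = m[pivot], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def hotelling_f(left, right):
    """Two-sample T^2 with pooled covariance, plus its F-scaled value.

    Returns (t2, f, (df1, df2)); f follows F(df1, df2) under equal means.
    """
    n1, n2, p = len(left), len(right), len(left[0])
    m1, m2 = mean_vector(left), mean_vector(right)
    s1, s2 = scatter(left, m1), scatter(right, m2)
    pooled = [[(s1[i][j] + s2[i][j]) / (n1 + n2 - 2) for j in range(p)]
              for i in range(p)]
    d = [a - b for a, b in zip(m1, m2)]
    w = solve(pooled, d)  # pooled^{-1} (m1 - m2)
    t2 = n1 * n2 / (n1 + n2) * sum(a * b for a, b in zip(d, w))
    f = (n1 + n2 - p - 1) / (p * (n1 + n2 - 2)) * t2
    return t2, f, (p, n1 + n2 - p - 1)
```

In a tree-growing loop, one might evaluate `hubert_gamma` for the split induced by each candidate phonetic question and keep the highest-scoring one, then compare the `f` value from `hotelling_f` with the critical value of the F distribution at the chosen significance level. If `f` falls below that critical value, the two mean vectors are judged close and the split is not made. The critical value itself is left to a statistical table or library.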
Date of Publication: March 2005