The performance of a speech recognizer can be enhanced by carefully selecting the words that form the vocabulary, for example, by maximizing the dissimilarity of the words. This paper presents a method to select a set of words from a given large vocabulary such that the minimum of the distances between all pairs of selected words (the minimum interset distance) is maximized. To achieve speaker independence, and to keep the task manageable, the phonemic content of words is compared by a dynamic programming method. The selection of words, based on a word-distance matrix generated by the dynamic programming comparison, consists of two steps: a vocabulary buildup step and an optimization step. As an example, the method is tested on the vocabulary of the letters of the English alphabet.
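
The abstract does not specify the cost function or the exact form of the two selection steps, so the following is only a minimal sketch under stated assumptions: the phoneme comparison is taken to be a unit-cost edit distance, the buildup step is taken to be a greedy farthest-point addition, and the optimization step is taken to be a single-word swap search. All function names and parameters (`phoneme_distance`, `build_up`, `optimize`, `sub_cost`, `indel_cost`) are hypothetical and are not taken from the paper.

```python
from itertools import combinations


def phoneme_distance(a, b, sub_cost=1.0, indel_cost=1.0):
    """Edit distance between two phoneme sequences by dynamic programming.

    Assumption: unit substitution and insertion/deletion costs. The paper
    may weight phoneme pairs differently.
    """
    m, n = len(a), len(b)
    d = [[0.0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        d[i][0] = i * indel_cost
    for j in range(1, n + 1):
        d[0][j] = j * indel_cost
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0.0 if a[i - 1] == b[j - 1] else sub_cost
            d[i][j] = min(
                d[i - 1][j] + indel_cost,  # deletion
                d[i][j - 1] + indel_cost,  # insertion
                d[i - 1][j - 1] + cost,    # match or substitution
            )
    return d[m][n]


def min_pair_distance(subset, dist):
    """Smallest distance over all pairs of word indices in subset."""
    return min(dist[u][v] for u, v in combinations(subset, 2))


def build_up(dist, k):
    """Greedy buildup (assumed): start from the farthest pair, then keep
    adding the word whose nearest selected neighbour is farthest away.
    Requires 2 <= k <= len(dist)."""
    n = len(dist)
    i, j = max(combinations(range(n), 2), key=lambda p: dist[p[0]][p[1]])
    chosen = [i, j]
    while len(chosen) < k:
        rest = [w for w in range(n) if w not in chosen]
        best = max(rest, key=lambda w: min(dist[w][c] for c in chosen))
        chosen.append(best)
    return chosen


def optimize(dist, chosen):
    """Swap search (assumed): replace one selected word with an unselected
    one whenever that strictly raises the minimum pairwise distance."""
    chosen = list(chosen)
    improved = True
    while improved:
        improved = False
        current = min_pair_distance(chosen, dist)
        for pos in range(len(chosen)):
            for w in range(len(dist)):
                if w in chosen:
                    continue
                trial = chosen[:pos] + [w] + chosen[pos + 1:]
                if min_pair_distance(trial, dist) > current:
                    chosen, improved = trial, True
                    break
            if improved:
                break
    return chosen
```

To use the sketch, build the word-distance matrix as `dist = [[phoneme_distance(a, b) for b in words] for a in words]`, where `words` is a list of phoneme sequences, then call `optimize(dist, build_up(dist, k))`. The swap search stops at a local optimum, so it is not guaranteed to find the best possible subset.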