In this paper, we propose a method for out-of-vocabulary (OOV) word detection and take a step toward open-vocabulary automatic speech recognition. The proposed method uses a hybrid language model that combines words with subword units such as phones or syllables. We describe a detection algorithm based on the posterior count of OOV words given the hybrid model, and compare it with an approach that uses the posterior probability of the best word string given a conventional word-only model. Experimental results on the Switchboard corpus are presented for different vocabulary sizes. The new method yields a gain of over 10% in OOV word detection. In addition, a modest number of the OOV word pronunciations are recovered correctly.
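
As an illustration only, the following sketch shows one way a posterior count of OOV words could be computed from an N-best list produced by a hybrid word/subword decoder. The abstract does not specify the actual algorithm, so everything here is an assumption: the N-best representation, the "+" prefix used to mark subword units, the function names, and the decision threshold are all hypothetical.

```python
import math

def hypothesis_posteriors(log_scores):
    """Normalize hypothesis log scores into posteriors (numerically stable softmax)."""
    m = max(log_scores)
    exps = [math.exp(s - m) for s in log_scores]
    total = sum(exps)
    return [e / total for e in exps]

def count_oov_regions(tokens, is_subword):
    """Count maximal runs of consecutive subword tokens; each run is treated as one OOV word."""
    count = 0
    in_region = False
    for tok in tokens:
        if is_subword(tok):
            if not in_region:
                count += 1
                in_region = True
        else:
            in_region = False
    return count

def posterior_oov_count(nbest, is_subword):
    """Expected number of OOV words under the posterior over the N-best hypotheses.

    nbest: list of (tokens, log_score) pairs (hypothetical format).
    """
    posteriors = hypothesis_posteriors([score for _, score in nbest])
    return sum(
        p * count_oov_regions(tokens, is_subword)
        for (tokens, _), p in zip(nbest, posteriors)
    )

def detect_oov(nbest, is_subword, threshold=0.5):
    """Flag the utterance as containing an OOV word if the expected count reaches the threshold.

    The threshold value is an assumption, not taken from the paper.
    """
    return posterior_oov_count(nbest, is_subword) >= threshold

# Hypothetical convention: subword units (e.g. phones) carry a "+" prefix.
def is_subword(token):
    return token.startswith("+")

example_nbest = [
    (["i", "saw", "+k", "+ae", "+t"], math.log(0.6)),  # subword run stands in for an OOV word
    (["i", "saw", "cat"], math.log(0.4)),              # all in-vocabulary words
]
expected_count = posterior_oov_count(example_nbest, is_subword)
flagged = detect_oov(example_nbest, is_subword)
```

In this toy example the hypothesis containing a subword run carries a posterior of 0.6, so the expected OOV count is 0.6 and the utterance is flagged at a threshold of 0.5. A real system would more likely accumulate posteriors over a lattice than over an N-best list; the list form is used here only to keep the sketch short.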