This paper presents two-pass speech recognition techniques to handle the out-of-vocabulary (OOV) problem in Turkish newspaper content transcription. OOV words are assumed to be replaced by acoustically “similar” in-vocabulary (IV) words during decoding. Therefore, the first-pass recognition lattice is used as prior knowledge to adapt the vocabulary and the search space for the second pass. Vocabulary adaptation and lattice extension are performed with words similar to the words in the hypothesis lattice. These words are selected from a fallback vocabulary using distance functions that take the agglutinative characteristics of Turkish into account. Morphology-based and phonetic-distance-based similarity functions yield 1.9% and 4.6% absolute accuracy improvements, respectively. Statistical sub-word units are also utilized to handle the OOV problem encountered in the word-based system. Using sub-words alleviates the OOV problem and improves recognition accuracy: OOV accuracy improves from 0% to 60.2%. However, this introduces ungrammatical items into the recognition output. Since automatically derived sub-word units do not provide explicit morphological features, the lattice extension strategy is modified to correct these ungrammatical items. Lattice extension for sub-words reduces the word error rate from 33.9% to 32.3%. This improvement is statistically significant at p=0.002 as measured by the NIST MAPSSWE significance test.
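The selection of similar words from a fallback vocabulary can be illustrated with a minimal sketch. The abstract does not specify the distance functions, so this example substitutes plain Levenshtein distance over letters as a rough stand-in for a phonetic distance (an assumption that leans on Turkish spelling being close to phonemic). The names `edit_distance`, `extend_vocabulary`, and `max_distance` are hypothetical and not taken from the paper.

```python
from typing import Dict, List, Sequence, Set


def edit_distance(a: Sequence[str], b: Sequence[str]) -> int:
    """Levenshtein distance between two symbol sequences."""
    prev = list(range(len(b) + 1))
    for i, sym_a in enumerate(a, start=1):
        curr = [i]
        for j, sym_b in enumerate(b, start=1):
            cost = 0 if sym_a == sym_b else 1
            curr.append(min(prev[j] + 1,          # delete sym_a
                            curr[j - 1] + 1,      # insert sym_b
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def extend_vocabulary(lattice_words: Set[str],
                      fallback_vocab: Set[str],
                      max_distance: int = 1) -> Dict[str, List[str]]:
    """For each first-pass lattice word, list the fallback words within
    max_distance edits. Assumption: letters approximate phonemes."""
    extension: Dict[str, List[str]] = {}
    for word in lattice_words:
        near = [cand for cand in fallback_vocab
                if cand not in lattice_words
                and edit_distance(word, cand) <= max_distance]
        extension[word] = sorted(near)
    return extension


# Illustrative data only: "evde" (in the house) appears in the lattice,
# while "evden" (from the house) is available only in the fallback list.
candidates = extend_vocabulary({"evde"}, {"evden", "evler", "okul"})
```

In a two-pass setup of this kind, the returned candidates would be added to the vocabulary and to the lattice as alternatives before the second decoding pass. A realistic system would compare phone sequences with weighted substitution costs rather than raw letters.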