A novel method is proposed for just-in-time adaptation of language models in Chinese speech recognition using Web search engines. Latent semantic analysis (LSA) is employed to adjust the probability distribution of the N-gram language model. The method has two advantages. First, it needs only a relatively small amount of data, which can be obtained from the Web on the fly. Second, compared with the traditional LSA adaptation formula, the proposed approach is more efficient, which allows second-pass decoding to be performed at high speed. Experiments show that the perplexity of the language model is reduced by over 13% after adaptation, and a 4.29% relative reduction in word error rate (WER) is achieved in large-vocabulary Chinese speech recognition on a standard test set.
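For readers unfamiliar with the two reported metrics, the following minimal Python sketch shows how perplexity and a relative reduction are conventionally computed. It is not taken from the paper: the function names and all numeric inputs are hypothetical, chosen only to illustrate the arithmetic, since the abstract does not state the baseline perplexity or WER.

```python
import math

def perplexity(log_probs):
    """Perplexity of a token sequence, given the natural-log probability
    that the language model assigned to each token."""
    return math.exp(-sum(log_probs) / len(log_probs))

def relative_reduction(baseline, adapted):
    """Fractional improvement of `adapted` over `baseline`
    (both are lower-is-better metrics such as perplexity or WER)."""
    return (baseline - adapted) / baseline

# Hypothetical example: a model that assigns probability 0.25 to each of
# four tokens has a perplexity of about 4.
example_ppl = perplexity([math.log(0.25)] * 4)

# Hypothetical WER values (in percent), chosen so that the relative
# reduction matches the 4.29% figure quoted in the abstract.
baseline_wer = 20.0
adapted_wer = 19.142
wer_gain = relative_reduction(baseline_wer, adapted_wer)

print(f"perplexity: {example_ppl:.2f}")
print(f"relative WER reduction: {wer_gain:.2%}")
```

Note that a relative reduction is measured against the baseline value, so a 4.29% relative reduction corresponds to a smaller absolute change in WER percentage points.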