Advances in speech recognition technology have made it possible for some intelligent query systems to offer a voice interface. In this paper, we develop a pop-song music retrieval system for telecom carriers that facilitates interaction between end users and the music database. While trying to improve system performance, however, we found that some typical recognition optimization techniques for large vocabulary continuous speech recognition (LVCSR) are not practical for such a real-time application, in which accuracy and speed are both critical. We therefore turn to model optimization techniques. Feature discriminative analysis and minimum phone error discriminative training, both proposed in recent years, have achieved great success in LVCSR; however, there are few reports of their practical application to online grammar-constrained recognition tasks. In this paper, these techniques are applied and evaluated on such a real-time recognition task. The experimental results show that they can be implemented effectively in our practical application system, yielding a remarkable error rate reduction of 13.3%.