Abstract:
Traditional hybrid DNN-HMM ASR systems for keyword spotting (KWS), which model HMM states, are not flexible enough to optimize for a specific language. In this paper, we construct an ASR system for keyword spotting in Mandarin based on an end-to-end acoustic model. The model is built from an LSTM-RNN and trained with the connectionist temporal classification (CTC) objective. The input of the network is feature sequences, and the output is the probabilities of the initials and finals of Mandarin syllables. Compared with hybrid ASR systems, the end-to-end system achieves a significant relative improvement of 6.32% in ATWV. The best result of our system is an ATWV of 0.8310 on the RASC863 data set. The proposed CTC-based method applies to KWS in a specific language.
Date of Conference: 17-20 October 2016
Date Added to IEEE Xplore: 04 May 2017