Key technologies of pre-processing and post-processing methods for embedded automatic speech recognition systems

4 Author(s)
Dongzhi He ; Inst. of Embedded Software & Syst., Beijing Univ. of Technol., Beijing, China ; Yibin Hou ; Yuanyuan Li ; Zhi-Hao Ding

Signal pre-processing and post-processing are becoming two key factors in moving embedded speech recognition systems from the laboratory to practical application. Speech endpoint detection and out-of-vocabulary rejection are the most important parts of speech pre-processing and post-processing, respectively. The performance of traditional speech endpoint detection based on short-term energy and zero-crossing rate degrades dramatically in noisy environments. Frequency-domain methods require complex computation and are poorly suited to embedded systems. In this paper, we present a new endpoint detection algorithm for isolated words that is based on statistical theory. The correct endpoint detection rate reaches 97.40% using this method. One-class support vector machine theory is also introduced to solve out-of-vocabulary rejection. Using this algorithm, the true recognition fraction (TRF) is up to 96%, and the false recognition fraction (FRF) is about 95%.
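For readers unfamiliar with the traditional baseline the abstract refers to, the following is a minimal sketch of endpoint detection from short-term energy and zero-crossing rate. It is not the statistical algorithm proposed in the paper, whose details the abstract does not give. The function names, the 160-sample frame length, and both thresholds are illustrative assumptions.

```python
import math

def frame_features(samples, frame_len=160):
    """Return (short-term energy, zero-crossing rate) for each non-overlapping frame."""
    feats = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        # Mean squared amplitude of the frame.
        energy = sum(x * x for x in frame) / frame_len
        # Fraction of adjacent sample pairs whose signs differ.
        crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
        feats.append((energy, crossings / (frame_len - 1)))
    return feats

def detect_endpoints(samples, frame_len=160, energy_thresh=0.01, zcr_thresh=0.25):
    """Return (start_sample, end_sample) of the detected speech region, or None.

    Thresholds are hypothetical; a real system would calibrate them
    from the background noise at the start of the recording.
    """
    feats = frame_features(samples, frame_len)
    loud = [i for i, (energy, _) in enumerate(feats) if energy > energy_thresh]
    if not loud:
        return None
    start, end = loud[0], loud[-1]
    # Extend outward over high-ZCR frames to catch weak unvoiced sounds.
    while start > 0 and feats[start - 1][1] > zcr_thresh:
        start -= 1
    while end < len(feats) - 1 and feats[end + 1][1] > zcr_thresh:
        end += 1
    return start * frame_len, (end + 1) * frame_len
```

With a clean signal (silence, a tone, silence) the energy threshold alone locates the boundaries. With additive noise, the silent frames acquire both nonzero energy and a high zero-crossing rate, so fixed thresholds of this kind misplace the endpoints, which is consistent with the degradation the abstract describes.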

Published in:

Mechatronics and Embedded Systems and Applications (MESA), 2010 IEEE/ASME International Conference on

Date of Conference:

15-17 July 2010