Hardware Implementation of Real-Time Speech Recognition System Using TMS320C6713 DSP

5 Author(s)
Manikandan, J. ; Dept. of Electron. & Commun. Eng. (ECE), Nat. Inst. of Technol., Trichy (NITT), Trichy, India ; Venkataramani, B. ; Girish, K. ; Karthic, H.

Continuous, real-time speech recognition is required for various mobile and hands-free applications. In this paper, a hardware implementation of a real-time speech recognition system is proposed using two approaches, and their performance is evaluated. The first approach uses Mel filter banks with Mel Frequency Cepstrum Coefficients (MFCC) as the feature input, and the second uses cochlear filter banks with zero-crossings (ZC) as the feature input. The features extracted from the input speech are fed to a multi-class Support Vector Machine (SVM) classifier for recognition. Both systems are implemented on a Texas Instruments TMS320C6713 floating-point digital signal processor to recognize isolated digits (0-9), and their performance is compared. The program memory required for MFCC feature extraction is 44.42% higher than that required for feature extraction using cochlear filters. Recognition accuracies of 93.33% and 98.67% are achieved for feature inputs from the Mel filter banks and the cochlear filter banks, respectively. The computational complexity of feature extraction using cochlear filters is 1.53 times that of MFCC feature extraction. Recognition performance is also studied for different combinations of test and training utterances; training with 15 utterances of each digit gives the best recognition accuracy. The proposed techniques can be adapted for other hands-free consumer applications, such as washing machines and hands-free cordless devices.
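
The two feature types named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the Hz-to-mel formula is the commonly used 2595 log10(1 + f/700) mapping, which the abstract does not specify, and the function names, frame length, and sample values are hypothetical. The zero-crossing count here is taken on a single raw signal, whereas the paper applies it to the outputs of a cochlear filter bank.

```python
import math


def hz_to_mel(freq_hz):
    """Map a frequency in Hz to the mel scale.

    Assumption: the widely used 2595*log10(1 + f/700) form; the
    abstract does not say which mel mapping the authors used.
    """
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)


def zero_crossings(frame):
    """Count sign changes between consecutive samples in one frame."""
    count = 0
    for prev, cur in zip(frame, frame[1:]):
        # A crossing occurs when the two samples fall on opposite
        # sides of zero (zero itself is treated as non-negative).
        if (prev >= 0) != (cur >= 0):
            count += 1
    return count


def frame_zero_crossings(signal, frame_len):
    """Split a signal into non-overlapping frames and count the
    zero crossings in each complete frame (hypothetical framing)."""
    last_start = len(signal) - frame_len
    return [
        zero_crossings(signal[start:start + frame_len])
        for start in range(0, last_start + 1, frame_len)
    ]


if __name__ == "__main__":
    # Toy samples: an alternating frame followed by a constant frame.
    samples = [1, -1, 1, -1, 2, 2, 2, 2]
    print(frame_zero_crossings(samples, 4))
    print(round(hz_to_mel(1000.0), 2))
```

In a full recognizer, per-frame values like these would be assembled into a feature vector and passed to the multi-class SVM classifier; that stage is omitted here.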

Published in:
2011 24th International Conference on VLSI Design (VLSI Design)

Date of Conference: 2-7 Jan. 2011
