Acoustic Feature Comparison of MFCC and CZT-Based Cepstrum for Speech Recognition | IEEE Conference Publication | IEEE Xplore
Abstract:

Cepstral features are important parameters in Automatic Speech Recognition (ASR) because they reflect properties of the human auditory system (HAS). Mel-Frequency Cepstral Coefficients (MFCC) are the most widely used features in the speech recognition field. This paper discusses the Chirp Z-Transform (CZT) algorithm and proposes CZT-based cepstral coefficients along with the corresponding feature extraction method. The experiments were performed in MATLAB. Simulation results show the correctness and effectiveness of both the MFCC and the CZT-based cepstrum for Mandarin digit recognition, and the recognition rate of the MFCC algorithm is compared with that of the CZT-based approach. Including CZT-based cepstral features in the parameter space may improve the recognition rate.
Date of Conference: 14-16 August 2009
Date Added to IEEE Xplore: 28 December 2009
Print ISBN: 978-0-7695-3736-8

Conference Location: Tianjin, China

1. Introduction

Feature extraction is one of the most important issues in speech recognition, because the extracted features serve as the representation of the speech signal. In the design of any speech recognition system, extracting and selecting the best parametric representation of the acoustic signal is an important task, since it significantly affects recognition performance. A set of mel-frequency cepstral coefficients (MFCC) provides a compact representation: the coefficients are the result of a cosine transform of the real logarithm of the short-term energy spectrum expressed on a mel-frequency scale [1]. In recent studies of speech recognition systems, MFCC parameters have achieved better recognition accuracy than other features [2], [3], and MFCC are the most widely used features in the field. Because MFCC exploit properties of the human auditory system (HAS), they can provide nearly ideal speech feature parameters in almost all voice transform spaces. In recent years, the Chirp Z-Transform (CZT) has been applied increasingly in signal processing because it allows the analyzed frequency range and resolution to be selected freely. In particular, the spectral resolution of the cepstrum can be improved by applying the CZT. In this paper, we report experiments whose results show the correctness and effectiveness of the MFCC and the CZT-based cepstrum for Mandarin digit recognition.
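
To make the idea of a CZT-based cepstrum concrete, the following is a minimal Python sketch, not the authors' MATLAB implementation, whose details are not given in this excerpt. It assumes the standard CZT definition, X(z_k) = sum over n of x[n] * z_k^(-n) with z_k = a * w^(-k), evaluated directly rather than with the fast chirp formulation. The function names (`czt`, `zoom_spectrum`, `cepstrum_from_spectrum`), the band limits, the frame length, and the number of coefficients are all hypothetical choices made for illustration, and the frame is assumed to be already windowed by the caller.

```python
import cmath
import math


def czt(x, m, w, a=1.0):
    """Evaluate X(z_k) = sum_n x[n] * z_k**(-n) at z_k = a * w**(-k), k = 0..m-1.

    Direct O(N*m) evaluation for clarity; practical code would normally use
    the fast Bluestein (chirp) formulation instead.
    """
    out = []
    for k in range(m):
        z_inv = 1.0 / (a * w ** (-k))
        out.append(sum(xn * z_inv ** n for n, xn in enumerate(x)))
    return out


def zoom_spectrum(frame, fs, f_lo, f_hi, m):
    """Sample the spectrum at m evenly spaced frequencies in [f_lo, f_hi)."""
    a = cmath.exp(2j * math.pi * f_lo / fs)                  # start of the arc
    w = cmath.exp(-2j * math.pi * (f_hi - f_lo) / (m * fs))  # step along the arc
    return czt(frame, m, w, a)


def cepstrum_from_spectrum(spectrum, n_ceps, floor=1e-10):
    """Log magnitude followed by a DCT-II, keeping the first n_ceps terms."""
    m = len(spectrum)
    # The floor (an assumed value) avoids log(0) on empty spectral samples.
    log_mag = [math.log(max(abs(s), floor)) for s in spectrum]
    return [
        sum(log_mag[k] * math.cos(math.pi * q * (k + 0.5) / m) for k in range(m))
        for q in range(n_ceps)
    ]


# Illustrative input: a 64-sample frame of a 1 kHz tone sampled at 8 kHz.
fs = 8000.0
frame = [math.cos(2 * math.pi * 1000.0 * n / fs) for n in range(64)]

# Zoom into 900-1100 Hz with 20 points, i.e. 10 Hz spacing.
band = zoom_spectrum(frame, fs, 900.0, 1100.0, 20)
peak = max(range(len(band)), key=lambda k: abs(band[k]))
ceps = cepstrum_from_spectrum(band, 8)
```

In this sketch the 64-sample frame would give a 125 Hz bin spacing with an ordinary DFT, whereas the CZT samples the same band every 10 Hz, which illustrates the selectable frequency range and resolution described above. An MFCC front end follows the same log-then-cosine-transform pattern, but applies a mel-scale filterbank to the energy spectrum before taking the logarithm.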
