Automatic speech recognition (ASR) based on multi-band decomposition provides high recognition rates, especially in noisy environments. The discrete wavelet transform (DWT) is an efficient tool for decomposing signals into frequency sub-bands. This paper proposes applying feature recombination (FC) to the recognition of spoken Arabic numerals. Utterances are decomposed using the DWT, and cepstral coefficients are then calculated for each of the resulting sub-bands. These coefficients are concatenated into a single feature vector that serves as the input to the speech classifier, e.g. a hidden Markov model (HMM), which computes the likelihood. Simulation results show that the correct recognition rates achieved with the proposed method are comparable to those of the conventional full-band ASR system.
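The front end described above (DWT decomposition, per-band cepstral coefficients, concatenation) can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not name the wavelet, the number of decomposition levels, or the cepstrum variant, so the Haar wavelet, two levels, and a DCT of the log magnitude spectrum are all assumptions made here for illustration. The function names (`haar_dwt`, `decompose`, `cepstral_coeffs`, `multiband_features`) are hypothetical, and the HMM classifier stage is omitted.

```python
import math


def haar_dwt(signal):
    """One level of the Haar DWT (assumed wavelet): returns (approx, detail)."""
    n = len(signal) - len(signal) % 2  # ignore a trailing odd sample
    s = 1.0 / math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) * s for i in range(0, n, 2)]
    detail = [(signal[i] - signal[i + 1]) * s for i in range(0, n, 2)]
    return approx, detail


def decompose(signal, levels):
    """Split a frame into sub-bands: [detail_1, ..., detail_L, approx_L]."""
    bands = []
    current = list(signal)
    for _ in range(levels):
        current, detail = haar_dwt(current)
        bands.append(detail)
    bands.append(current)
    return bands


def cepstral_coeffs(band, n_coeffs):
    """Cepstral coefficients of one sub-band (assumed variant):
    DCT-II of the log magnitude spectrum, computed with a naive DFT."""
    n = len(band)
    half = n // 2 + 1  # non-redundant bins of a real signal's spectrum
    log_mag = []
    for k in range(half):
        re = sum(band[t] * math.cos(2 * math.pi * k * t / n) for t in range(n))
        im = -sum(band[t] * math.sin(2 * math.pi * k * t / n) for t in range(n))
        log_mag.append(math.log(math.hypot(re, im) + 1e-10))  # avoid log(0)
    return [
        sum(log_mag[m] * math.cos(math.pi * q * (m + 0.5) / half)
            for m in range(half))
        for q in range(n_coeffs)
    ]


def multiband_features(frame, levels=2, n_coeffs=4):
    """Concatenate the per-band cepstral coefficients into one feature vector."""
    features = []
    for band in decompose(frame, levels):
        features.extend(cepstral_coeffs(band, n_coeffs))
    return features


# Synthetic 64-sample frame standing in for a speech frame (illustrative only).
frame = [math.sin(2 * math.pi * 5 * t / 64) + 0.5 * math.sin(2 * math.pi * 20 * t / 64)
         for t in range(64)]
features = multiband_features(frame, levels=2, n_coeffs=4)
print(len(features))  # 12 = (levels + 1) * n_coeffs
```

In a complete recognizer, one such vector would be computed per frame of an utterance, and the resulting sequence would be passed to the classifier in place of the full-band cepstral features.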
Published in:
Proceedings of the Nineteenth National Radio Science Conference (NRSC 2002)
Date of Conference: 2002