This paper addresses the noise robustness of automatic speech recognition (ASR) systems using a hybrid technique that combines a speech enhancement pre-processing step with the use of syllables as the acoustic units for recognition. Speech enhancement is performed with the Ephraim-Malah filter. Recognition relies on an HMM-based statistical engine, and the HTK hidden Markov model toolkit was used throughout our experiments. We tested the system on a database of spoken Arabic names in noisy environments. Comparative experiments show that syllable-based recognition outperforms monophone- and triphone-based recognition in noisy environments. Results show that the recognition rate obtained in noisy environments using syllables exceeded the rates obtained using triphones and monophones by 5.79% and 39.72%, respectively. With the Ephraim-Malah filter integrated into the front end of the syllable-based ASR system, the recognition rate using syllables exceeded the rates obtained using triphones and monophones by 6.58% and 39.72%, respectively.
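As an illustration of the enhancement step, the following is a minimal sketch of the per-frequency-bin gain of the MMSE short-time spectral amplitude estimator commonly associated with the Ephraim-Malah filter. It assumes this variant is the one meant; the abstract does not say which form of the filter was used or how it was configured. The function names, the a priori SNR `xi`, and the a posteriori SNR `gamma` are illustrative choices, not taken from the paper, and the Bessel functions are computed with a plain power series that is only suitable for moderate SNR values.

```python
import math


def bessel_i(order, x, max_terms=200):
    """Modified Bessel function of the first kind I_order(x), order 0 or 1,
    via its power series. Adequate for moderate x (illustrative only)."""
    half = x / 2.0
    term = half ** order / math.factorial(order)  # k = 0 term
    total = term
    for k in range(1, max_terms):
        # Ratio of consecutive series terms: (x/2)^2 / (k * (k + order))
        term *= half * half / (k * (k + order))
        total += term
        if term < 1e-17 * total:
            break
    return total


def mmse_stsa_gain(xi, gamma):
    """Spectral gain of the MMSE-STSA estimator for one frequency bin.

    xi    -- a priori SNR (linear scale), assumed already estimated
    gamma -- a posteriori SNR (linear scale), must be > 0
    """
    v = xi / (1.0 + xi) * gamma
    bracket = (1.0 + v) * bessel_i(0, v / 2.0) + v * bessel_i(1, v / 2.0)
    return (math.sqrt(math.pi) / 2.0) * (math.sqrt(v) / gamma) \
        * math.exp(-v / 2.0) * bracket


def wiener_gain(xi):
    """Wiener gain, which the MMSE-STSA gain approaches at high SNR."""
    return xi / (1.0 + xi)


if __name__ == "__main__":
    for xi, gamma in [(0.1, 1.0), (1.0, 2.0), (10.0, 11.0), (100.0, 101.0)]:
        print(xi, gamma, mmse_stsa_gain(xi, gamma), wiener_gain(xi))
```

In a complete front end, this gain would multiply the noisy spectral magnitude of each bin in each frame, with `xi` typically tracked by a decision-directed estimator before the enhanced signal is passed to feature extraction; those stages are omitted here.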