In this paper, we prepare a medium-sized Bangla speech corpus and compare the performance of different acoustic features for Bangla word recognition. Most Bangla automatic speech recognition (ASR) systems use a small number of speakers; here, 40 speakers are involved, selected from a wide area of Bangladesh, where Bangla is a native language. In the experiments, mel-frequency cepstral coefficients (MFCCs) and local features (LFs) are input to hidden Markov model (HMM) based classifiers to obtain word recognition performance. The experiments show that the 39-dimensional MFCC-based method (MFCC39) provides a higher word correct rate (WCR) than the other methods investigated. Moreover, the MFCC39-based method obtains a higher WCR with fewer mixture components in the HMM.
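As a point of reference for the WCR metric, the following is a minimal sketch of how a word correct rate is commonly computed. The abstract does not define WCR, so the formula here is an assumption: the conventional "percent correct" measure, (N - S - D) / N x 100, where N is the number of reference words, S the substitutions, and D the deletions. The function name and its parameters are hypothetical and chosen only for illustration.

```python
def word_correct_rate(n_words: int, substitutions: int, deletions: int) -> float:
    """Return the word correct rate in percent.

    Assumed definition (not taken from the paper): the share of reference
    words that were recognized correctly, ignoring insertion errors.
    """
    if n_words <= 0:
        raise ValueError("n_words must be positive")
    if substitutions < 0 or deletions < 0:
        raise ValueError("error counts must be non-negative")
    hits = n_words - substitutions - deletions
    if hits < 0:
        raise ValueError("substitutions + deletions exceed n_words")
    return 100.0 * hits / n_words


if __name__ == "__main__":
    # Illustrative numbers only, not results from the paper.
    print(word_correct_rate(n_words=100, substitutions=5, deletions=5))
```

Under this assumed definition, insertion errors do not lower the score, which is what distinguishes a correct rate from a word accuracy measure.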