Abstract:
This paper examines the feasibility of a Hidden Markov Model (HMM) based speech recognition system as a Bangla transcription device for doctors dictating patient case histories. The experiments are performed using the Hidden Markov Model Toolkit (HTK). The features are the Mel Frequency Cepstral Coefficients (MFCCs) of the audio signal, 39 features in total. The audio data is collected from ten male speakers, and the train-test split is 50-50. The system consists of a word parser program followed by an isolated word recognizer. The word parser takes discretely spoken sentences and outputs one audio segment per word. Each word segment is passed to the word recognizer, and the output words are concatenated. Five experiments were each run twice because some words were recognized poorly in the first run; for the second run, more training data was added for the low-accuracy words. The final sentence recognition accuracy was 80%, and recognition accuracy was above 90% for most words. In conclusion, HMM-based recognition systems are feasible for transcription devices.
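The parse-then-recognize pipeline described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not say how the word parser finds word boundaries, so short-time energy thresholding is an assumption here, and `split_words`, `transcribe`, `recognize_word`, and all default parameter values are hypothetical names and settings. The isolated word recognizer (HTK-trained HMMs in the paper) is represented only by a caller-supplied function.

```python
from typing import Callable, List, Sequence, Tuple


def split_words(
    samples: Sequence[float],
    frame_len: int = 400,          # assumed: 25 ms frames at 16 kHz
    threshold: float = 1e-3,       # assumed energy threshold for "speech"
    min_silence_frames: int = 10,  # assumed pause length that ends a word
) -> List[Tuple[int, int]]:
    """Return (start, end) sample indices of word segments.

    A frame counts as speech when its mean squared amplitude exceeds
    `threshold`. A word ends once `min_silence_frames` consecutive
    non-speech frames follow it.
    """
    n_frames = len(samples) // frame_len
    voiced = []
    for i in range(n_frames):
        frame = samples[i * frame_len:(i + 1) * frame_len]
        energy = sum(x * x for x in frame) / frame_len
        voiced.append(energy > threshold)

    segments: List[Tuple[int, int]] = []
    start = None       # frame index where the current word began
    silence_run = 0    # consecutive silent frames seen inside a word
    for i, is_speech in enumerate(voiced):
        if is_speech:
            if start is None:
                start = i
            silence_run = 0
        elif start is not None:
            silence_run += 1
            if silence_run >= min_silence_frames:
                end_frame = i - silence_run + 1  # first silent frame
                segments.append((start * frame_len, end_frame * frame_len))
                start, silence_run = None, 0
    if start is not None:
        # Audio ended before a long enough pause; close the last word.
        end_frame = n_frames - silence_run
        segments.append((start * frame_len, end_frame * frame_len))
    return segments


def transcribe(
    samples: Sequence[float],
    recognize_word: Callable[[Sequence[float]], str],
    **parser_kwargs,
) -> str:
    """Split the audio into words, recognize each, and concatenate."""
    words = [recognize_word(samples[s:e])
             for s, e in split_words(samples, **parser_kwargs)]
    return " ".join(words)
```

Because the recognizer is passed in as a function, the parser can be checked on synthetic audio before any acoustic model is trained. This design also relies on speakers pausing between words, which matches the "discretely spoken sentences" the abstract specifies.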
Published in: 2018 4th International Conference on Electrical Engineering and Information & Communication Technology (iCEEiCT)
Date of Conference: 13-15 September 2018
Date Added to IEEE Xplore: 31 January 2019