Word-Level Speech Dataset Creation for Sourashtra and Recognition System Using Kaldi


Abstract:

For a nation such as India, with more than 20 formally identified regional languages, it is essential to overcome language barriers to ensure smooth communication. Researchers have been working on intelligent engines capable of bridging the gap between different natural languages through machine perception. This paper proposes and describes an Automatic Speech Recognition (ASR) system for the Sourashtra language, built using the Kaldi toolkit. A custom speech dataset was created with the help of native Sourashtra speakers. Because no phoneme representation currently exists for this language, the Devanagari script was used for transliteration and language modelling, following the ILSL12 convention. With a total of 2000 word utterances, we achieve a word error rate (WER) of 5.5 and a character error rate (CER) of 0.2, using a GMM-HMM based acoustic model trained for monophones.
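
For readers unfamiliar with the two metrics, the sketch below shows how WER and CER are conventionally computed from an edit distance between reference and recognised transcripts. It is an illustration only, not the paper's evaluation code: the function names, the sample strings, and the choice to return a fraction rather than a percentage are all assumptions, and the abstract does not state the scale of its reported figures.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance: minimum substitutions, insertions, and deletions."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def error_rate(refs, hyps, unit="word"):
    """Total edits divided by total reference length, as a fraction.

    unit="word" gives WER; unit="char" gives CER. Characters are split by
    Unicode code point, which is a simplification for Devanagari text
    (a grapheme cluster may span several code points).
    """
    edits = 0
    length = 0
    for ref, hyp in zip(refs, hyps):
        r = ref.split() if unit == "word" else list(ref)
        h = hyp.split() if unit == "word" else list(hyp)
        edits += edit_distance(r, h)
        length += len(r)
    if length == 0:
        raise ValueError("reference transcripts are empty")
    return edits / length


# Hypothetical transcripts, not taken from the paper's dataset.
refs = ["a b c d"]
hyps = ["a x c d"]
print(error_rate(refs, hyps, unit="word"))
```

For a word-level dataset in which each utterance is a single word, the WER under this definition reduces to the fraction of utterances recognised incorrectly, while the CER still credits partially correct words.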
Date of Conference: 24-26 November 2022
Date Added to IEEE Xplore: 16 February 2023
Conference Location: Kochi, India
