A Visual Inspection Tool for Evaluation of ASR Model Using PyKaldi and PyCHAIN | IEEE Conference Publication

Abstract:

We developed a tool to create and evaluate a transcript and an alignment of an utterance. The tool displays the speech waveform and MFCC features on an HTML5 canvas. It also shows the transcript and phoneme alignment, which are produced using PyKaldi and PyCHAIN. Maintainers of medical dictation systems will use this tool during evaluation to examine the speech waveform, MFCC features, transcription results, and phoneme alignment of an utterance. PyKaldi is a Python scripting layer for the Kaldi speech recognition toolkit. PyCHAIN is a fully parallelized PyTorch implementation of end-to-end lattice-free maximum mutual information (LF-MMI) training for the chain models in the Kaldi speech recognition toolkit. As a user guide, this paper demonstrates a use case in which the tool's features are used to analyze the performance of a model by inspecting the transcript and the alignment of an utterance.
Date of Conference: 25-26 August 2022
Date Added to IEEE Xplore: 25 October 2022
Conference Location: Semarang, Indonesia
