Comparison of Large Pre-trained Models and Adaptation Methods for Japanese Dialects ASR


Abstract:

In recent years, pre-training methods that use large spoken language resources have greatly improved the accuracy of automatic speech recognition (ASR) for major languages. However, practical ASR technology that covers the large and rich variety of regional dialects of Japanese has not yet been realized. This study focuses on the adaptability of two state-of-the-art large pre-trained models for building a unified ASR model for Japanese dialects. We present results from adapting these models using a total of several dozen hours of Japanese dialect speech. We compare models optimized for each dialect region, including dialect region identification, against models adapted without distinguishing between dialect regions. By comparing these two learning processes, we investigate how various adaptation methods affect ASR performance for Japanese dialects.
Date of Conference: 29 October 2024 - 01 November 2024
Date Added to IEEE Xplore: 28 November 2024
Conference Location: Kitakyushu, Japan
