Comparison of Large Pre-trained Models and Adaptation Methods for Japanese Dialects ASR (IEEE Conference Publication)


Abstract:

In recent years, pre-training on large spoken language resources has greatly improved the accuracy of automatic speech recognition (ASR) for major languages. However, practical ASR technology that covers the large and rich variety of regional dialects of Japanese has not yet been realized. This study focuses on the adaptability of two state-of-the-art large pre-trained models for building a unified ASR model for Japanese dialects. We present results from adapting these models using a total of several dozen hours of Japanese dialect speech. We compare models optimized for each dialect region, including dialect region identification, with models adapted without distinguishing between dialect regions. By comparing these two learning processes, we investigate how various adaptation methods affect ASR performance for Japanese dialects.
Date of Conference: 29 October 2024 - 01 November 2024
Date Added to IEEE Xplore: 28 November 2024
Conference Location: Kitakyushu, Japan

I. Introduction

In recent years, building ASR systems on large-scale pre-trained models has become mainstream. Models such as XLSR [1] and Whisper [2], trained on tens of thousands to hundreds of thousands of hours of multilingual speech, have driven rapid advances in multilingual speech processing. However, for low-resource languages and dialects that are absent from the pre-training data, or present only in small quantities, recognition accuracy is often too low for practical use.
