Cross-lingual Automatic Speech Recognition Exploiting Articulatory Features | IEEE Conference Publication | IEEE Xplore

Abstract:

Articulatory features (AFs) provide language-independent attributes by exploiting knowledge of speech production. This paper proposes a cross-lingual automatic speech recognition (ASR) approach based on AFs. Several neural network (NN) architectures are explored for extracting cross-lingual AFs, and their performance is compared: the multilayer perceptron (MLP), the convolutional NN (CNN), and the long short-term memory recurrent NN (LSTM). In our cross-lingual setup, only the source language (English, representing a well-resourced language) is used to train the AF extractors. The trained extractors then generate AFs for the target language (Mandarin, representing an under-resourced language). Frame-classification accuracy indicates that the LSTM can transfer knowledge from the well-resourced to the under-resourced language through robust cross-lingual AFs. The final ASR system is built using traditional approaches (e.g. hybrid models), combining AFs with conventional MFCCs. The results demonstrate that cross-lingual AFs improve performance on the under-resourced ASR task even though the source and target languages come from different language families. Overall, the proposed cross-lingual ASR approach provides a slight improvement over the monolingual LF-MMI and cross-lingual (acoustic model adaptation-based) ASR systems.
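
The abstract does not say how the AFs and MFCCs are combined or how frame-classification accuracy is computed. As a minimal sketch, assuming frame-level concatenation of AF posteriors with MFCC vectors and accuracy measured as the fraction of frames whose most probable AF class matches the reference label, the two steps might look like the following. The function names, feature dimensions, and random toy data are hypothetical and are not taken from the paper.

```python
import numpy as np


def frame_accuracy(posteriors, labels):
    """Fraction of frames whose argmax class equals the reference label."""
    predicted = np.argmax(posteriors, axis=1)
    return float(np.mean(predicted == labels))


def fuse_features(mfcc, af_posteriors):
    """Append AF posteriors to MFCCs frame by frame (assumed fusion scheme)."""
    if mfcc.shape[0] != af_posteriors.shape[0]:
        raise ValueError("MFCC and AF streams must have the same number of frames")
    return np.concatenate([mfcc, af_posteriors], axis=1)


# Toy data standing in for real features (all sizes are illustrative).
rng = np.random.default_rng(0)
num_frames, num_mfcc, num_af_classes = 5, 13, 4
mfcc = rng.standard_normal((num_frames, num_mfcc))

# Softmax over random logits stands in for the output of a trained AF extractor.
logits = rng.standard_normal((num_frames, num_af_classes))
af_posteriors = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)

fused = fuse_features(mfcc, af_posteriors)
print(fused.shape)  # (5, 17): 13 MFCC dimensions plus 4 AF posteriors per frame
```

In this sketch the fused matrix would be the input to the acoustic model of the target-language system; other fusion schemes (for example, bottleneck features instead of posteriors) are equally compatible with the abstract's wording.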
Date of Conference: 18-21 November 2019
Date Added to IEEE Xplore: 05 March 2020
Conference Location: Lanzhou, China
