Fine-Grained Textual Knowledge Transfer to Improve RNN Transducers for Speech Recognition and Understanding

Abstract:

RNN Transducer (RNN-T) technology is very popular for building deployable models for end-to-end (E2E) automatic speech recognition (ASR) and spoken language understanding (SLU). Since these are E2E models operating directly on speech, there remains potential to improve their performance using purely text-based models like BERT, which have strong language understanding capabilities. In this paper, we propose new training criteria for RNN-T based E2E ASR and SLU to transfer BERT’s knowledge into these systems. In the first stage of our proposed mechanism, we improve ASR performance by using a fine-grained, tokenwise knowledge transfer from BERT. In the second stage, we fine-tune the ASR model for SLU such that the above knowledge is explicitly utilized by the RNN-T model for improved performance. Our techniques improve ASR performance on the Switchboard and CallHome test sets of the NIST Hub5 2000 evaluation and on the recently released SLURP dataset, on which we achieve new state-of-the-art performance. For SLU, we show significant improvements on the SLURP slot filling task, outperforming HuBERT-base and reaching performance close to HuBERT-large. Compared to large Transformer-based speech models like HuBERT, our model is significantly more compact and uses only 300 hours of speech pretraining data.
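The abstract does not spell out the form of the tokenwise transfer objective, so the following is only a minimal sketch of one common way such a term could be set up, not the authors' formulation. It assumes the student (for example, a projected RNN-T representation) and the teacher (BERT) each yield one vector per token, that the two are compared with a mean squared error, and that this term is added to the usual ASR loss with a fixed weight. The function names, the choice of MSE, and the weight of 0.5 are all hypothetical.

```python
from typing import Sequence

Vector = Sequence[float]


def tokenwise_transfer_loss(student: Sequence[Vector], teacher: Sequence[Vector]) -> float:
    """Mean squared error between student and teacher vectors, averaged over tokens.

    Assumption: one student vector and one teacher vector per token,
    both already projected to the same dimension.
    """
    if len(student) != len(teacher):
        raise ValueError("student and teacher must cover the same number of tokens")
    if not student:
        return 0.0
    total = 0.0
    for s_vec, t_vec in zip(student, teacher):
        if len(s_vec) != len(t_vec):
            raise ValueError("student and teacher vectors must share a dimension")
        # Per-token MSE over the embedding dimension.
        total += sum((s - t) ** 2 for s, t in zip(s_vec, t_vec)) / len(s_vec)
    # Average over token positions.
    return total / len(student)


def combined_loss(asr_loss: float, transfer_loss: float, weight: float = 0.5) -> float:
    """Hypothetical total objective: ASR loss plus a weighted transfer term."""
    return asr_loss + weight * transfer_loss


# Toy example: two tokens, two-dimensional vectors.
teacher = [[1.0, 0.0], [0.0, 1.0]]  # stand-in for BERT token embeddings
student = [[1.0, 0.0], [0.0, 0.0]]  # stand-in for student token representations
kt = tokenwise_transfer_loss(student, teacher)
total = combined_loss(asr_loss=2.0, transfer_loss=kt, weight=0.5)
```

In this toy case the first token matches the teacher exactly and only the second contributes to the transfer term, which illustrates the "fine-grained" idea of supervising each token position separately instead of a single utterance-level summary.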

Published in: ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Date of Conference: 04-10 June 2023

Date Added to IEEE Xplore: 05 May 2023

DOI: 10.1109/ICASSP49357.2023.10094997

Conference Location: Rhodes Island, Greece
