Abstract:
Although end-to-end (E2E) automatic speech recognition (ASR) systems excel in general tasks, they frequently struggle to accurately recognize personal rare words. Leveraging contextual information to bias the internal states of an E2E ASR model has proven to be an effective solution. However, most existing work focuses on biasing for a single domain, and it remains challenging to extend such contextualization mechanisms to many domains. To address this limitation, in this work we propose a hierarchical attention architecture that scales contextual biasing to a wide range of domains simultaneously. Given multiple catalogs of contextual information, the high-level attention determines which catalog to focus on, and the low-level attention learns to attend to the most relevant entity within the focused catalog. Experiments on diverse domains demonstrate that the proposed architecture yields 35% to 60% relative WER improvements on personal rare words and outperforms existing approaches.
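The two-level mechanism described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes plain scaled dot-product attention for both levels, random entity embeddings in place of learned ones, and a single decoder-state query vector; the paper's actual scoring functions, projections, and integration into the ASR decoder are not specified in the abstract.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a 1-D score vector.
    e = np.exp(x - x.max())
    return e / e.sum()

def attend(query, keys):
    # Scaled dot-product attention of one query over a set of key vectors.
    # Returns the attention-weighted context vector and the weights.
    scores = keys @ query / np.sqrt(len(query))
    weights = softmax(scores)
    return weights @ keys, weights

def hierarchical_bias(query, catalogs):
    # Low-level attention: summarize each catalog by attending over its entities.
    summaries = np.stack([attend(query, entities)[0] for entities in catalogs])
    # High-level attention: decide which catalog summary to focus on.
    bias, catalog_weights = attend(query, summaries)
    return bias, catalog_weights

# Hypothetical setup: a decoder-state query and three catalogs of entity
# embeddings (e.g. contacts, app names, song titles) of varying sizes.
rng = np.random.default_rng(0)
d = 8
query = rng.normal(size=d)
catalogs = [rng.normal(size=(n, d)) for n in (5, 3, 7)]

bias, catalog_weights = hierarchical_bias(query, catalogs)
print(bias.shape)             # biasing vector, same dimension as the query
print(catalog_weights.sum())  # high-level weights form a distribution
```

The biasing vector produced this way would then be combined with the decoder's internal state; the abstract does not say how that combination is performed, so it is omitted here.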
Date of Conference: 16-20 December 2023
Date Added to IEEE Xplore: 19 January 2024