Extending Parrotron: An End-to-End, Speech Conversion and Speech Recognition Model for Atypical Speech

Abstract:

We present an extended Parrotron model: a single, end-to-end network that enables voice conversion and recognition simultaneously. Input spectrograms are transformed to output spectrograms in the voice of a predetermined target speaker while also generating hypotheses in a target vocabulary. We study the performance of this novel architecture, which jointly predicts speech and text, on atypical (e.g. dysarthric) speech. We show that with as little as an hour of atypical speech, speaker adaptation can yield a 77% relative reduction in Word Error Rate (WER), measured by ASR performance on the converted speech. We also show that data augmentation using a customized synthesizer built on atypical speech can provide an additional 10% relative improvement over the best speaker-adapted model. Finally, we show how these methods generalize across 8 types of atypical speech for a range of speech impairment severities.
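The abstract reports its gains as relative reductions in WER. As a minimal sketch of how such a figure is typically computed, the Python below uses the standard word-level edit-distance definition of WER and the usual relative-reduction formula. The function names `word_error_rate` and `relative_reduction` and the example numbers are illustrative assumptions; they are not the authors' evaluation code and not results from the paper.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between ref[:i-1] and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # transforming ref[:i] into an empty hypothesis costs i deletions
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


def relative_reduction(baseline_wer: float, new_wer: float) -> float:
    """Fraction of the baseline WER removed by the new system."""
    if baseline_wer <= 0:
        raise ValueError("baseline WER must be positive")
    return (baseline_wer - new_wer) / baseline_wer


if __name__ == "__main__":
    # Hypothetical transcripts, chosen only to exercise the functions.
    wer = word_error_rate("the cat sat on the mat", "the cat sit on mat")
    print(f"WER: {wer:.3f}")

    # Hypothetical WER values, not taken from the paper.
    print(f"Relative reduction: {relative_reduction(0.40, 0.10):.0%}")
```

Under this definition, a relative reduction is measured against the baseline error rate, so the same absolute drop in WER counts for more when the baseline is already low.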

Published in: ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Date of Conference: 06-11 June 2021

Date Added to IEEE Xplore: 13 May 2021

DOI: 10.1109/ICASSP39728.2021.9414644

Conference Location: Toronto, ON, Canada
