ASGAN-VC: One-Shot Voice Conversion with Additional Style Embedding and Generative Adversarial Networks

Abstract:

In this paper, we present a voice conversion (VC) system that significantly improves the quality of the generated voice and its similarity to the target voice style. Many VC systems use feature-disentanglement-based learning techniques to separate a speaker's voice characteristics from the linguistic content so that speech can be translated into another style, and we take the same approach. To prevent speaker-style information from obscuring the content embedding, some previous works quantize the embedding or reduce its dimension. However, imperfect disentanglement damages the quality and similarity of the converted speech. To further improve quality and similarity in voice conversion, we propose a novel style transfer method within an autoencoder-based VC system that involves generative adversarial training. The conversion process was objectively evaluated using a fair third-party speaker verification system, and the results show that ASGAN-VC outperforms VQVC+ and AGAINVC in terms of speaker similarity. A subjective evaluation likewise shows that our proposal outperforms VQVC+ and AGAINVC in terms of naturalness and speaker similarity.

Published in: 2022 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC)

Date of Conference: 07-10 November 2022

Date Added to IEEE Xplore: 21 December 2022

DOI: 10.23919/APSIPAASC55919.2022.9979975

Conference Location: Chiang Mai, Thailand
