Loading [MathJax]/extensions/MathZoom.js
ExcitGlow: Improving a WaveGlow-based Neural Vocoder with Linear Prediction Analysis | IEEE Conference Publication | IEEE Xplore

ExcitGlow: Improving a WaveGlow-based Neural Vocoder with Linear Prediction Analysis


Abstract:

In this paper, we propose ExcitGlow, a vocoder that incorporates the source-filter model of voice production theory into a flow-based deep generative model. By targeting ...Show More

Abstract:

In this paper, we propose ExcitGlow, a vocoder that incorporates the source-filter model of voice production theory into a flow-based deep generative model. By targeting the distribution of the excitation signal instead of the speech waveform itself, we significantly reduce the size of the flow-based generative model. To further reduce the number of parameters, we apply a parameter sharing technique in which a single affine coupling layer is used for several flow layers. To avoid quality degradation, we also introduce a closed-loop training framework to optimize the flow model for both the speech and excitation signal generation processes. Specifically, we choose negative log-likelihood (NLL) loss for the excitation signal and multi-resolution spectral distance for the speech signal. As a result, we are able to reduce the model size from 87. 73M to 15. 60M parameters while maintaining the perceptual quality of synthesized speech.
Date of Conference: 07-10 December 2020
Date Added to IEEE Xplore: 31 December 2020
ISBN Information:

ISSN Information:

Conference Location: Auckland, New Zealand

Funding Agency:


References

References is not available for this document.