Spectrum-Aware Neural Vocoder Based on Self-Supervised Learning for Speech Enhancement | IEEE Conference Publication | IEEE Xplore

Abstract:

Self-supervised learning (SSL) models for speech provide an efficient way to utilise raw, real-world data for acquiring versatile representations. Here, we investigate the benefits of employing such pre-trained SSL models for speech enhancement. Our approach involves customising a neural vocoder that produces enhanced speech using embeddings extracted from the noisy input by pre-trained SSL models. Specifically, we investigate how best to incorporate the noisy spectrogram into the network to address the possible loss of acoustic detail in the embeddings. By exploring different fusion techniques, we find that effectively incorporating both the SSL embeddings and the noisy spectrogram into the neural vocoder yields a model that relies more on the noisy spectrogram for acoustic details and on the SSL embeddings for semantic information. Experimental results show that our proposed model yields a significant improvement in speech quality compared to baseline models that rely solely on embeddings.
Date of Conference: 26-30 August 2024
Date Added to IEEE Xplore: 23 October 2024
Conference Location: Lyon, France
