Gated Residual Networks With Dilated Convolutions for Monaural Speech Enhancement

IEEE Journals & Magazine | IEEE Xplore


Abstract:

For supervised speech enhancement, contextual information is important for accurate mask estimation or spectral mapping. However, commonly used deep neural networks (DNNs) are limited in capturing temporal contexts. To leverage long-term contexts for tracking a target speaker, we treat speech enhancement as a sequence-to-sequence mapping, and present a novel convolutional neural network (CNN) architecture for monaural speech enhancement. The key idea is to systematically aggregate contexts through dilated convolutions, which significantly expand receptive fields. The CNN model additionally incorporates gating mechanisms and residual learning. Our experimental results suggest that the proposed model generalizes well to untrained noises and untrained speakers. It consistently outperforms a DNN, a unidirectional long short-term memory (LSTM) model, and a bidirectional LSTM model in terms of objective speech intelligibility and quality metrics. Moreover, the proposed model has far fewer parameters than DNN and LSTM models.
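The abstract does not include code, but the core ideas it names (dilated convolutions to enlarge the receptive field, a gating mechanism, and residual learning) can be illustrated together. Below is a minimal NumPy sketch of one hypothetical gated residual block; the layer shapes, weight names, and the tanh-filter/sigmoid-gate pairing are illustrative assumptions, not the paper's exact architecture:

```python
import numpy as np

def dilated_conv1d(x, w, dilation):
    """'Same'-padded dilated 1-D convolution over time.
    x: (T, C_in) feature frames; w: (K, C_in, C_out) kernel.
    A dilation of d spaces kernel taps d frames apart, so the
    receptive field grows to 1 + (K - 1) * d without extra weights."""
    K, C_in, C_out = w.shape
    T = x.shape[0]
    pad = (K - 1) * dilation // 2          # zero-pad so output keeps length T
    xp = np.pad(x, ((pad, pad), (0, 0)))
    y = np.zeros((T, C_out))
    for t in range(T):
        for k in range(K):
            y[t] += xp[t + k * dilation] @ w[k]
    return y

def gated_residual_block(x, w_filter, w_gate, dilation):
    """One gated residual block (illustrative): a tanh 'filter' branch is
    multiplied elementwise by a sigmoid 'gate' branch, and the input is
    added back as a residual connection (C_in must equal C_out)."""
    f = np.tanh(dilated_conv1d(x, w_filter, dilation))
    g = 1.0 / (1.0 + np.exp(-dilated_conv1d(x, w_gate, dilation)))
    return x + f * g
```

Stacking such blocks with exponentially increasing dilations (e.g. 1, 2, 4, ...) is what "systematically aggregates contexts": with kernel size 3, three stacked blocks already cover 1 + 2·(1 + 2 + 4) = 15 frames of context.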
Page(s): 189 - 198
Date of Publication: 15 October 2018

PubMed ID: 31355300
