Multi-Scale Temporal Frequency Convolutional Network With Axial Attention for Speech Enhancement | IEEE Conference Publication | IEEE Xplore

Abstract:

Speech quality is often degraded by acoustic echoes, background noise, and reverberation. In this paper, we propose a system that combines deep learning and signal processing to suppress echoes, noise, and reverberation simultaneously. For the deep learning component, we design a novel speech dense-prediction backbone. For the signal processing component, a linear acoustic echo canceller is used, and its output serves as conditional information for the deep learning model. To improve the performance of the speech dense-prediction backbone, we design strategies such as a microphone and reference phase encoder, multi-scale time-frequency processing, and streaming axial attention. The proposed system ranked first in both the AEC Challenge and the DNS Challenge (non-personal track) of ICASSP 2022. In addition, the backbone has been extended to the multi-channel speech enhancement task, where it placed second in the ICASSP 2022 L3DAS22 Challenge.
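The abstract does not say which linear acoustic echo canceller is used. As a rough illustration of what such a stage does, the sketch below implements a normalized least-mean-squares (NLMS) adaptive filter, a standard textbook choice that is assumed here and is not necessarily the authors' method. The function name, filter length, step size, and the toy echo path are all hypothetical.

```python
import random

def nlms_echo_canceller(mic, ref, taps=8, mu=0.5, eps=1e-8):
    """Return the residual (mic minus estimated echo) from an NLMS filter.

    Assumption: NLMS stands in for the unspecified linear echo canceller.
    """
    w = [0.0] * taps      # adaptive estimate of the echo path
    buf = [0.0] * taps    # most recent reference samples, newest first
    residual = []
    for d, x in zip(mic, ref):
        buf = [x] + buf[:-1]
        y = sum(wi * bi for wi, bi in zip(w, buf))   # estimated echo
        e = d - y                                    # echo-suppressed sample
        norm = sum(b * b for b in buf) + eps         # input power normalizer
        w = [wi + mu * e * bi / norm for wi, bi in zip(w, buf)]
        residual.append(e)
    return residual

def energy(signal):
    return sum(s * s for s in signal)

# Toy data: the microphone hears only the far-end reference through a
# short, hypothetical echo path (no near-end speech, no noise).
random.seed(0)
ref = [random.uniform(-1.0, 1.0) for _ in range(4000)]
echo_path = [0.6, -0.3, 0.1]
mic = [
    sum(h * ref[n - k] for k, h in enumerate(echo_path) if n - k >= 0)
    for n in range(len(ref))
]
residual = nlms_echo_canceller(mic, ref)
```

In a system like the one described, the residual from this kind of linear stage would be passed to the neural network together with the microphone and reference signals, so the network only has to remove what the linear filter leaves behind.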
Date of Conference: 23-27 May 2022
Date Added to IEEE Xplore: 27 April 2022
Conference Location: Singapore, Singapore
