Imposing Correlation Structures for Deep Binaural Spatio-Temporal Wiener Filtering

Abstract:

To improve speech quality and intelligibility in environments with noise and interfering sounds, binaural speech enhancement algorithms use the microphone signals from both the left and the right hearing device to generate an enhanced output signal for each ear. As a multi-frame extension of the binaural multi-channel Wiener filter, in this paper we consider the binaural spatio-temporal Wiener filter (STWF) in the short-time Fourier transform domain, which requires estimates of the highly time-varying spatio-temporal correlations of the speech and interference components. To this end, the binaural STWF is embedded into an end-to-end supervised learning framework, where temporal convolutional networks estimate the required quantities, i.e., the inverse spatio-temporal correlation matrices of the interference component and the spatio-temporal correlation vectors and power spectral densities of the speech components. In this paper, we impose spatio-temporal correlation structure on these quantities and relate them between the left and the right hearing device, aiming to reduce computational complexity while maintaining speech enhancement and interaural cue preservation performance. Assuming that the spatial correlation of the speech component is stationary over a small number of frames, we propose to decompose the spatio-temporal correlation vectors as the Kronecker product of a relative transfer function vector and a temporal correlation vector, either considering a global reference microphone or a reference microphone for each hearing device. In addition, we consider a deep bilateral STWF by neglecting the spatio-temporal correlations of the speech and interference components between both devices. The imposed spatio-temporal correlation structures greatly differ in the number of parameters that need to be estimated. 
The performance of causal versions of the deep binaural and bilateral STWF algorithms is evaluated based on both simulated and measured binaural room imp...
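
The Kronecker decomposition described in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: the sizes `M` and `N`, the stacking order (frame by frame), the random placeholder values, and the function name `kronecker_correlation_vector` are all assumptions made for illustration, and the parameter counts are only a rough tally of complex-valued entries under those assumptions.

```python
import numpy as np


def kronecker_correlation_vector(rtf, temporal):
    """Combine an RTF vector (length M) and a temporal correlation vector
    (length N) into one structured spatio-temporal vector of length M * N.

    Assumed stacking: the M microphone entries of each frame are contiguous,
    so the result equals kron(temporal, rtf).
    """
    return np.kron(temporal, rtf)


M = 4  # number of microphones (hypothetical)
N = 3  # number of frames in the multi-frame filter (hypothetical)

rng = np.random.default_rng(0)

# Placeholder RTF vector, normalized so the reference microphone entry is 1.
rtf = rng.standard_normal(M) + 1j * rng.standard_normal(M)
rtf = rtf / rtf[0]

# Placeholder temporal correlation vector across the N frames.
temporal = rng.standard_normal(N) + 1j * rng.standard_normal(N)

gamma = kronecker_correlation_vector(rtf, temporal)

# Rough count of complex-valued entries to estimate (illustrative only):
unstructured_params = M * N      # every entry estimated separately
structured_params = (M - 1) + N  # RTF without its fixed reference entry, plus temporal part

print(gamma.shape, unstructured_params, structured_params)
```

Under this assumed stacking, reshaping `gamma` to `(N, M)` gives the rank-one outer product of `temporal` and `rtf`, which is the sense in which the spatial correlation is taken to be the same in every frame.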

Published in: IEEE Transactions on Audio, Speech and Language Processing (Volume: 33)

Page(s): 1278 - 1292

Date of Publication: 05 March 2025

Electronic ISSN: 2998-4173

DOI: 10.1109/TASLPRO.2025.3548454
