Gesper: A Unified Framework for General Speech Restoration


Abstract:

This paper describes the legends-tencent team’s real-time General Speech Restoration (Gesper) system submitted to the ICASSP 2023 Speech Signal Improvement (SSI) Challenge. The proposed system is a two-stage architecture in which speech restoration is performed first, followed by speech enhancement. We propose, for the first time, a complex spectral mapping-based generative adversarial network (CSM-GAN) as the speech restoration module. For noise suppression and dereverberation, the enhancement module uses fullband-wideband parallel processing. On the blind test set of the ICASSP 2023 SSI Challenge, the proposed Gesper system, which satisfies the real-time condition, achieves an overall P.804 mean opinion score (MOS) of 3.27 and an overall P.835 MOS of 3.35, ranking 1st in both track 1 and track 2.
Date of Conference: 04-10 June 2023
Date Added to IEEE Xplore: 05 May 2023
Conference Location: Rhodes Island, Greece

1. INTRODUCTION

Real-time communication (RTC) systems, such as teleconferencing systems, smartphones, and telephones, have become a necessity in people's life and work. However, owing to the influence of acoustic capture, corruption by noise and reverberation, and network congestion, the speech quality of current RTC systems is still deficient. The ICASSP 2023 SSI Challenge [1] focuses on improving speech signal quality in RTC systems, which involves tackling noise, coloration, discontinuity, loudness, and reverberation of speech in a variety of complex acoustic conditions.

Cites in Papers - IEEE (10)

1. Tushar Dhyani, Florian Lux, Michele Mancusi, Giorgio Fabbro, Fritz Hohl, Ngoc Thang Vu, "High-Resolution Speech Restoration with Latent Diffusion Model", ICASSP 2025 - 2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp.1-5, 2025.
2. Kai Li, Yi Luo, "Apollo: Band-sequence Modeling for High-Quality Audio Restoration", ICASSP 2025 - 2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp.1-5, 2025.
3. Nicolae-Cătălin Ristea, Babak Naderi, Ando Saabas, Ross Cutler, Sebastian Braun, Solomiya Branets, "ICASSP 2024 Speech Signal Improvement Challenge", IEEE Open Journal of Signal Processing, vol.6, pp.238-246, 2025.
4. Chang Zeng, Chunhui Wang, Xiaoxiao Miao, Jian Zhao, Zhonglin Jiang, Yong Chen, "Instructsing: High-Fidelity Singing Voice Generation Via Instructing Yourself", 2024 IEEE Spoken Language Technology Workshop (SLT), pp.675-681, 2024.
5. Zining Liang, Hucheng Wang, Yichen Yang, Wen Zhang, Thushara D. Abhayapala, "Active Road Noise Control Based on Data-Driven Predictions of Passenger Ear Noise Signal", 2024 18th International Workshop on Acoustic Signal Enhancement (IWAENC), pp.424-428, 2024.
6. Guochen Yu, Runqiang Han, Chenglin Xu, Haoran Zhao, Nan Li, Chen Zhang, Xiguang Zheng, Chao Zhou, Qi Huang, Bing Yu, "Ks-Net: Multi-Band Joint Speech Restoration and Enhancement Network for 2024 ICASSP SSI Challenge", 2024 IEEE International Conference on Acoustics, Speech, and Signal Processing Workshops (ICASSPW), pp.33-34, 2024.
7. Fengyuan Hao, Huiyong Zhang, Lingling Dai, Xiaoxue Luo, Xiaodong Li, Chengshi Zheng, "Renet: A Time-Frequency Domain General Speech Restoration Network for ICASSP 2024 Speech Signal Improvement Challenge", 2024 IEEE International Conference on Acoustics, Speech, and Signal Processing Workshops (ICASSPW), pp.37-38, 2024.
8. Julius Richter, Simon Welker, Jean-Marie Lemercier, Bunlong Lay, Tal Peer, Timo Gerkmann, "Causal Diffusion Models for Generalized Speech Enhancement", IEEE Open Journal of Signal Processing, vol.5, pp.780-789, 2024.
9. Ross Cutler, Ando Saabas, Babak Naderi, Nicolae-Cătălin Ristea, Sebastian Braun, Solomiya Branets, "ICASSP 2023 Speech Signal Improvement Challenge", IEEE Open Journal of Signal Processing, vol.5, pp.662-674, 2024.
10. Takaaki Saeki, Shinnosuke Takamichi, Tomohiko Nakamura, Naoko Tanji, Hiroshi Saruwatari, "SelfRemaster: Self-Supervised Speech Restoration for Historical Audio Resources", IEEE Access, vol.11, pp.144831-144843, 2023.