1. INTRODUCTION
Real-time communication (RTC) systems, such as teleconferencing systems, smartphones, and telephones, have become a necessity in people's daily life and work. However, owing to the effects of acoustic capture, noise and reverberation corruption, and network congestion, the speech quality of current RTC systems remains deficient. The ICASSP 2023 SSI Challenge [1] focuses on improving speech signal quality in RTC systems, which involves tackling problems of noise, coloration, discontinuity, loudness, and reverberation in speech under a variety of complex acoustic conditions.