This letter proposes a computational auditory scene analysis (CASA) model for monaural speech separation. The model integrates three biologically inspired approaches: auditory spectrogram generation, analysis of the spectrogram's spectro-temporal content, and tracking of its harmonic structure. In a top-down process, the estimated ideal binary mask (EIBM) is calculated from the spectral amplitude of the extracted spectrograms in order to enhance the harmonic filters for separation. Experimental results show that the model outperforms the harmonic magnitude suppression technique in both signal-to-interference ratio and percentage of crosstalk. Moreover, its results are comparable with those of a current state-of-the-art system.
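To make the binary-mask idea concrete, the following is a minimal sketch of how a binary time-frequency mask could be derived from two estimated magnitude spectrograms and applied to a mixture. It is an illustration only, not the letter's EIBM procedure: the function names `binary_mask` and `apply_mask`, the `threshold_db` parameter, and the toy 2x2 spectrograms are all assumptions introduced here.

```python
import math

EPS = 1e-12  # assumed small constant to avoid log(0) and division by zero


def binary_mask(target_mag, interferer_mag, threshold_db=0.0):
    """Return a 0/1 mask: 1 where the target magnitude exceeds the
    interferer magnitude by more than threshold_db (hypothetical criterion)."""
    mask = []
    for t_row, i_row in zip(target_mag, interferer_mag):
        row = []
        for t, i in zip(t_row, i_row):
            # Local target-to-interferer ratio in dB for this time-frequency unit.
            local_ratio_db = 20.0 * math.log10((t + EPS) / (i + EPS))
            row.append(1 if local_ratio_db > threshold_db else 0)
        mask.append(row)
    return mask


def apply_mask(mixture_mag, mask):
    """Keep mixture units where the mask is 1 and zero out the rest."""
    return [
        [m * k for m, k in zip(m_row, k_row)]
        for m_row, k_row in zip(mixture_mag, mask)
    ]


# Toy example: rows are frequency channels, columns are time frames.
target = [[1.0, 0.2], [0.5, 0.9]]
interferer = [[0.3, 0.8], [0.5, 0.1]]
mixture = [[1.3, 1.0], [1.0, 1.0]]

mask = binary_mask(target, interferer)
separated = apply_mask(mixture, mask)
```

In this sketch, a unit where target and interferer are equally strong is assigned to the interferer because the comparison is strict; a different threshold would shift that boundary.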