Skip to Main Content
In this work, we present a model-based approach to enhance noisy speech using an analysis-synthesis framework. Target speech is reconstructed with model parameters estimated from noisy observations. In particular, spectral envelope is estimated by tracking its temporal trajectories in order to improve the noise-distorted short-time spectral amplitude. Initially, we propose an analysis-synthesis framework for speech enhancement based on harmonic noise model (HNM). Acoustic parameters such as pitch, spectral envelope, and spectral gain are extracted from HNM analysis. Spectral envelope estimation is improved by tracking its line spectrum frequency trajectories through Kalman filtering. System identification of Kalman filter is achieved via a combined design of codebook mapping scheme and maximum-likelihood estimator with parallel training data. Complete system design and experimental validations are given in details. Through performance evaluation based on a study of spectrogram, objective measures and a subjective listening test, it is demonstrated that the proposed approach achieves significant improvement over conventional methods in various conditions. A distinct advantage of the proposed method is that it successfully tackles the “musical tones” problem.