Time-domain approach using multiple Kalman filters and EM algorithm to speech enhancement with nonstationary noise

Authors: Ki Yong Lee (Sch. of Electron. Eng., Soongsil Univ., Seoul, South Korea); Souhwan Jung

A time-domain approach is developed for enhancing speech signals degraded by statistically independent, additive, nonstationary noise, with no a priori information about the noise. The autoregressive (AR) hidden filter model (HFM) with gain contour is proposed for modeling the statistical characteristics of the clean speech signal. Given the HFM parameter set of the speech, speech enhancement becomes a joint problem: signal estimation for the clean speech, and system identification for the gain contour and the time-varying noise parameters. The expectation-maximization (EM) algorithm is then applied to both the signal estimation and the system identification. In the E-step, the signal estimate is a weighted sum of conditional mean estimators, obtained from multiple Kalman filters with Markovian switching coefficients, where the weights equal the a posteriori probabilities of the specific state sequence history given the noisy speech. These probabilities are computed by the Viterbi algorithm (VA). In the M-step, the gain contour and noise parameters are recursively updated by an adaptive algorithm modified from the gradient-based algorithm. The proposed method does not require framing of the speech signal in the training and enhancement procedures. The method is tested on speech signals degraded by nonstationary noise at various input signal-to-noise ratios. An improvement of approximately 4.5-6.0 dB in signal-to-noise ratio (SNR) is achieved at input SNRs of 10 and 15 dB.
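The E-step idea of a weighted sum of Kalman filter estimates can be illustrated with a much simplified sketch. This is not the authors' algorithm: it assumes scalar AR(1) signal models, a fixed and known noise variance `r`, and per-sample filtered model probabilities in place of the Viterbi-based state sequence probabilities. It also omits the gain contour and the M-step entirely. All names (`kalman_bank_step`, `enhance`, `snr_db`) and all parameter values are hypothetical, chosen only for illustration.

```python
import math


def kalman_bank_step(y, states, variances, weights, models, r, trans):
    """Advance a bank of scalar Kalman filters by one observation y.

    models: list of (a, q) pairs, one assumed AR(1) model per filter.
    r: assumed observation-noise variance (fixed here, unlike the paper).
    trans[i][j]: assumed probability of switching from model i to model j.
    Returns updated states, variances, weights, and the weighted estimate.
    """
    k = len(models)
    # Propagate the model probabilities through the assumed Markov chain.
    prior = [sum(weights[i] * trans[i][j] for i in range(k)) for j in range(k)]
    new_states, new_vars, likes = [], [], []
    for (a, q), x, p in zip(models, states, variances):
        x_pred = a * x                    # predicted state
        p_pred = a * a * p + q            # predicted error variance
        s = p_pred + r                    # innovation variance
        innov = y - x_pred
        gain = p_pred / s                 # Kalman gain
        new_states.append(x_pred + gain * innov)
        new_vars.append((1.0 - gain) * p_pred)
        # Gaussian likelihood of the innovation under this model.
        likes.append(math.exp(-0.5 * innov * innov / s)
                     / math.sqrt(2.0 * math.pi * s))
    post = [pr * lk for pr, lk in zip(prior, likes)]
    total = sum(post)
    # Fall back to uniform weights if every likelihood underflowed to zero.
    post = [p / total for p in post] if total > 0.0 else [1.0 / k] * k
    # Signal estimate: probability-weighted sum of per-filter estimates.
    estimate = sum(w * x for w, x in zip(post, new_states))
    return new_states, new_vars, post, estimate


def enhance(noisy, models, r, trans):
    """Run the filter bank sample by sample (no framing) over a noisy signal."""
    k = len(models)
    states = [0.0] * k
    variances = [1.0] * k
    weights = [1.0 / k] * k
    estimates = []
    for y in noisy:
        states, variances, weights, est = kalman_bank_step(
            y, states, variances, weights, models, r, trans)
        estimates.append(est)
    return estimates, weights


def snr_db(clean, observed):
    """SNR in dB of `observed` measured against the `clean` reference."""
    signal = sum(s * s for s in clean)
    error = sum((s - o) ** 2 for s, o in zip(clean, observed))
    return 10.0 * math.log10(signal / error)
```

Under these assumptions, a call such as `enhance(noisy, [(0.95, 0.1), (0.2, 0.1)], 1.0, [[0.95, 0.05], [0.05, 0.95]])` returns the per-sample estimates and the final model weights. The SNR improvement is then `snr_db(clean, estimates) - snr_db(clean, noisy)`, which is the kind of quantity the abstract reports in dB.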

Published in: IEEE Transactions on Speech and Audio Processing (Volume: 8, Issue: 3)