In frequency-domain blind source separation (BSS) for speech with independent component analysis (ICA), a practical parametric Pearson distribution system is used to model the distribution of frequency-domain source signals. ICA adaptation rules have a score function determined by an approximated signal distribution, so an approximation fitted to the data may yield better separation performance than conventional ICA can achieve. Previously, the conventional hyperbolic tangent (tanh) or generalized Gaussian distribution (GGD) was applied uniformly to the score function for all frequency bins, even though a wideband speech signal has different distributions at different frequencies. To deal with this, we propose modeling the signal distribution at each frequency by adopting a parametric Pearson distribution and employing it to optimize the separation matrix in the ICA learning process. The score function is estimated from the appropriate Pearson distribution parameters for each frequency bin. We devised three methods for Pearson distribution parameter estimation and conducted separation experiments with real speech signals convolved with actual room impulse responses (T60 = 130 ms). Our experimental results show that the proposed frequency-domain Pearson-ICA (FD-Pearson-ICA) adapted well to the characteristics of frequency-domain source signals. Applying FD-Pearson-ICA improved the signal-to-interference ratio (SIR) significantly, by around 2-3 dB, compared with conventional nonlinear functions. Even when the SIR values of FD-Pearson-ICA were poor, the performance based on a disparity measure between the true score function and the estimated parametric score function clearly showed the advantage of FD-Pearson-ICA. Furthermore, we optimized the proposed approach with regard to separation performance.
By combining individual distribution parameters directly estimated at low frequencies with the appropriate parameters optimized at high frequencies, it was possible to reasonably improve the FD-Pearson-ICA performance without any significant increase in the computational burden compared with conventional nonlinear functions.
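To make the idea of a per-bin Pearson score function concrete, here is a minimal sketch in Python. It is not the paper's implementation: the abstract does not say which three estimation methods were devised, so this sketch assumes the standard method-of-moments fit of the Pearson system, in which the density satisfies f'(x)/f(x) = (x - a) / (b0 + b1 x + b2 x^2) and the score function is its negative. The function names and the choice to work on centered, real-valued samples per bin (for example, the real part of each frequency bin) are illustrative assumptions.

```python
def central_moments(samples):
    """Return the 2nd, 3rd and 4th central moments of a list of real samples."""
    n = len(samples)
    mean = sum(samples) / n
    dev = [s - mean for s in samples]
    mu2 = sum(d ** 2 for d in dev) / n
    mu3 = sum(d ** 3 for d in dev) / n
    mu4 = sum(d ** 4 for d in dev) / n
    return mu2, mu3, mu4


def pearson_coeffs(mu2, mu3, mu4):
    """Method-of-moments Pearson coefficients (a, b0, b1, b2).

    Assumes the variable is centered on its mean, in which case a == b1.
    The denominator c vanishes for some degenerate moment combinations;
    a real implementation would need to guard against that.
    """
    c = 10.0 * mu4 * mu2 - 12.0 * mu3 ** 2 - 18.0 * mu2 ** 3
    b1 = -mu3 * (mu4 + 3.0 * mu2 ** 2) / c
    b0 = -mu2 * (4.0 * mu2 * mu4 - 3.0 * mu3 ** 2) / c
    b2 = -(2.0 * mu2 * mu4 - 3.0 * mu3 ** 2 - 6.0 * mu2 ** 3) / c
    return b1, b0, b1, b2


def pearson_score(x, coeffs):
    """Score function phi(x) = -f'(x)/f(x) for the fitted Pearson density."""
    a, b0, b1, b2 = coeffs
    return -(x - a) / (b0 + b1 * x + b2 * x ** 2)


def per_bin_coeffs(bins):
    """Fit one set of Pearson coefficients per frequency bin.

    `bins` is a list of lists: one list of real-valued samples per bin
    (hypothetical layout; how complex bins are handled is an assumption).
    """
    return [pearson_coeffs(*central_moments(frames)) for frames in bins]
```

With Gaussian moments (variance 1, zero skewness, fourth moment 3) the fitted score reduces to phi(x) = x, the linear score of a Gaussian, which is a convenient sanity check. Heavier-tailed bins give a nonzero b2 and therefore a score that saturates for large |x|, which is the behavior a fixed tanh nonlinearity only approximates.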
Audio, Speech, and Language Processing, IEEE Transactions on (Volume: 17, Issue: 4)
Date of Publication: May 2009