This paper presents a theoretical analysis of a criterion for complex-valued independent component analysis (ICA), with a focus on blind speech extraction (BSE) of a spatio-temporally nonstationary speech source. The proposed criterion, denoted KSICA, is related to the well-known FastICA method with the kurtosis contrast function. The proposed method is shown to share the important fixed-point feature with the FastICA method. Unlike FastICA, however, it does not exhibit divergent behavior for Gaussian-only sources, and it shows better performance in online implementations. Compared to FastICA, the KSICA method provides a 10 dB higher source extraction performance and a 10 dB lower standard deviation in a data batch approach when the data batch size is less than 100 samples. For larger batch sizes, the KSICA method performs as well as FastICA. In an online application with spatially stationary sources, the KSICA method provides around 10 dB higher interference suppression and 1 MOS-unit lower speech distortion compared to FastICA for a 0.15 s time constant in the algorithm update parameter. The FastICA performance matches the KSICA performance for a time constant above 1 s. Finally, in an online application with a moving speech source, the KSICA method provides 10 dB higher interference suppression compared to FastICA for the same algorithm settings. All in all, the proposed KSICA method is shown to be a viable alternative for online BSE of complex-valued signal mixtures.
Audio, Speech, and Language Processing, IEEE Transactions on (Volume: 16, Issue: 8)
Date of Publication: Nov. 2008