Discriminating common environmental background noise sources, such as train, airport, car, restaurant, street and exhibition noise, when they are mixed with speech signals is required in many applications. These signals are stochastic, non-stationary, non-Gaussian and non-linear, and their spectral content is distributed non-uniformly over their duration. In this paper, the signal under test is decomposed into separate sequences by passing it through a filter bank with bands 0-500 Hz, 500-1000 Hz, 1000-1500 Hz, 1500-2000 Hz, 2000-2500 Hz, 2500-3000 Hz and above 3000 Hz. For each noise source, the feature vector contains features from only those decomposed sequences that discriminate that source from the other noise sources in the same frequency band. Higher order statistics (HOS) based parameters, namely the third-order autocumulant, fourth-order autocumulant, skewness and kurtosis, are found to be efficient features for this purpose. The cumulant-based features are modified here: each is taken as the ratio of its value for the decomposed noisy speech signal to its value for the decomposed clean speech signal (without background noise) in the same frequency range, which is found to give better results. The feature vectors extracted from some of the decomposed sequences are observed to discriminate between noise sources better than feature vectors extracted without decomposition. Finally, the noise sources are classified by separating the corresponding feature vectors with a Gaussian mixture model (GMM) classifier.
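The decomposition and feature steps described above can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the abstract does not specify the filter design, the cumulant lags, or the kurtosis convention, so the FFT brick-wall masks, the zero-lag cumulants, the non-excess kurtosis, and the function names `decompose`, `hos_features` and `ratio_features` are all assumptions made for this example.

```python
import numpy as np

# Lower band edges in Hz, taken from the abstract; the last band is "above 3000 Hz".
BAND_EDGES_HZ = [0, 500, 1000, 1500, 2000, 2500, 3000]


def decompose(signal, fs):
    """Split a signal into 7 band-limited sequences.

    Assumption: ideal (brick-wall) FFT masks stand in for the paper's
    filter bank, whose actual design is not given in the abstract.
    """
    spectrum = np.fft.rfft(signal)
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fs)
    uppers = BAND_EDGES_HZ[1:] + [np.inf]
    bands = []
    for lo, hi in zip(BAND_EDGES_HZ, uppers):
        mask = (freqs >= lo) & (freqs < hi)
        bands.append(np.fft.irfft(spectrum * mask, n=len(signal)))
    return bands


def hos_features(x):
    """Zero-lag HOS features of one sequence (the lag choice is an assumption)."""
    x = x - np.mean(x)
    m2 = np.mean(x ** 2)
    m3 = np.mean(x ** 3)
    m4 = np.mean(x ** 4)
    return {
        "c3": m3,                   # third-order cumulant at zero lag
        "c4": m4 - 3.0 * m2 ** 2,   # fourth-order cumulant at zero lag
        "skewness": m3 / m2 ** 1.5,
        "kurtosis": m4 / m2 ** 2,   # non-excess form (3 for a Gaussian)
    }


def ratio_features(noisy, clean, fs):
    """Per-band features: cumulants as noisy/clean ratios, plus skewness and kurtosis.

    Returns an array of shape (7, 4), one row per frequency band.
    """
    rows = []
    for nb, cb in zip(decompose(noisy, fs), decompose(clean, fs)):
        fn, fc = hos_features(nb), hos_features(cb)
        rows.append([fn["c3"] / fc["c3"], fn["c4"] / fc["c4"],
                     fn["skewness"], fn["kurtosis"]])
    return np.array(rows)


if __name__ == "__main__":
    # Synthetic stand-ins for clean and noisy speech (not real data).
    rng = np.random.default_rng(0)
    fs = 8000
    clean = rng.laplace(size=fs)
    noisy = clean + 0.5 * rng.standard_normal(fs)
    print(ratio_features(noisy, clean, fs).shape)
```

The rows selected for each noise source could then be passed to a Gaussian mixture model, for example one `sklearn.mixture.GaussianMixture` fitted per noise class and compared by log-likelihood. That choice of library is also an assumption, and the ratios would need guarding against near-zero clean-speech cumulants in practice.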
Note: As originally published there was an error in this document. The manuscript submitted had equations that did not display correctly. A corrected PDF is now provided.