Unlike other audio or speech signals, sound events are relatively short in duration and are usually distinguished by their unique spectro-temporal signatures. This paper proposes a novel classification method based on probabilistic distance support vector machines (SVMs). We study a parametric approach to characterizing sound signals using the distribution of the subband temporal envelope (STE), and kernel techniques for the subband probabilistic distance (SPD) under the SVM framework. We show that generalized gamma modeling is well suited to sound characterization, and that the probabilistic distance kernel provides a closed-form solution for computing the divergence distance, which greatly reduces computational cost. We conducted experiments on a database of ten types of sound events. The results show that the proposed classification method significantly outperforms conventional SVM classifiers with Mel-frequency cepstral coefficients (MFCCs). The rapid computation of probabilistic distance also makes the proposed method a natural choice for online sound event recognition.
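As a rough illustration of why a closed-form divergence keeps the kernel cheap, the sketch below evaluates the Kullback-Leibler divergence between two generalized gamma densities analytically and combines the per-subband values into an exponential kernel. Several things here are assumptions, not details taken from the paper: the Stacy parameterization f(x) = (p / a^d) x^(d-1) exp(-(x/a)^p) / Gamma(d/p), the use of a symmetrized KL divergence as the probabilistic distance, the kernel form exp(-rho * distance), and the hypothetical names `gg_kl`, `spd_kernel`, and `rho`.

```python
import math


def digamma(x):
    """Digamma function for x > 0: upward recurrence, then an asymptotic series."""
    result = 0.0
    while x < 10.0:
        result -= 1.0 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    series = inv2 * (1.0 / 12 - inv2 * (1.0 / 120 - inv2 / 252))
    return result + math.log(x) - 0.5 / x - series


def gg_kl(a1, d1, p1, a2, d2, p2):
    """Closed-form KL(f1 || f2) between two generalized gamma densities.

    Assumed (Stacy) parameterization: scale a, shape parameters d and p.
    """
    k1 = d1 / p1
    k2 = d2 / p2
    # Difference of the log normalizing constants.
    log_norm = (math.log(p1) - d1 * math.log(a1) - math.lgamma(k1)) \
             - (math.log(p2) - d2 * math.log(a2) - math.lgamma(k2))
    # E[ln x] under f1.
    mean_log = math.log(a1) + digamma(k1) / p1
    # E[(x / a2)^p2] under f1.
    cross = math.exp(math.lgamma((d1 + p2) / p1) - math.lgamma(k1)) * (a1 / a2) ** p2
    return log_norm + (d1 - d2) * mean_log + cross - k1


def spd_kernel(params_x, params_y, rho=1.0):
    """Illustrative kernel: exp(-rho * sum of symmetrized KL over subbands).

    params_x and params_y are lists of (a, d, p) tuples, one per subband.
    This kernel form is an assumption, not the paper's stated definition.
    """
    total = 0.0
    for gx, gy in zip(params_x, params_y):
        total += gg_kl(*gx, *gy) + gg_kl(*gy, *gx)
    return math.exp(-rho * total)


# Two hypothetical sounds, each described by three subband parameter triples.
sound_a = [(1.0, 2.0, 1.5), (0.8, 1.2, 1.0), (2.0, 3.0, 2.0)]
sound_b = [(1.1, 2.2, 1.4), (0.7, 1.0, 1.0), (1.8, 2.5, 2.0)]
print(spd_kernel(sound_a, sound_b, rho=0.5))
```

Under these assumptions, comparing two sounds costs only a few gamma-function evaluations per subband, with no numerical integration over the envelope distributions. The resulting kernel values could be assembled into a Gram matrix and passed to any SVM implementation that accepts precomputed kernels.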
Audio, Speech, and Language Processing, IEEE Transactions on (Volume: 19, Issue: 6)
Date of Publication: Aug. 2011