This paper presents a method for recognizing environmental sounds in surveillance and security applications. We propose to apply one-class support vector machines (1-SVMs) together with a sophisticated dissimilarity measure to address audio classification and, more specifically, sound recognition. We illustrate the performance of this method on an audio database consisting of 1015 sounds belonging to nine classes. The database presents high intraclass diversity in terms of signal properties, as well as some interclass similarities. A large discrepancy in the number of items in each class implies a nonuniform probability of sound appearance. The study proceeds as follows. First, we study the use of a set of state-of-the-art audio features. Then, we introduce a set of novel features obtained by combining elementary features. Experiments conducted on a nine-class classification problem show the superiority of this novel sound recognition method. The best recognition accuracy (96.89%) is obtained when combining wavelet-based features, MFCCs, and individual temporal and frequency features. Our 1-SVM-based multiclass classification approach outperforms the conventional hidden Markov model-based system in the experiments conducted; the improvement in the error rate can reach 50%. In addition, we provide empirical results showing that the single-class SVM outperforms a combination of binary SVMs. Additional experiments demonstrate that our method is robust to environmental noise.
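As a rough illustration of the general idea of building a multiclass classifier from one model per class, the sketch below trains one scorer per sound class and assigns a new feature vector to the class whose scorer responds most strongly. It is not the paper's method: the scorer is a simple stand-in (mean Gaussian-kernel similarity to the class's training vectors) rather than a 1-SVM, the paper's dissimilarity measure is not reproduced, and the class names, feature vectors, and the `gamma` value are hypothetical.

```python
import math


def rbf(x, y, gamma=0.5):
    """Gaussian (RBF) kernel between two feature vectors (gamma is an assumed value)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))


class OneClassScorer:
    """Stand-in for a one-class model (NOT a 1-SVM): mean kernel similarity
    between a query vector and the training vectors of a single class."""

    def __init__(self, samples, gamma=0.5):
        self.samples = list(samples)
        self.gamma = gamma

    def score(self, x):
        total = sum(rbf(x, s, self.gamma) for s in self.samples)
        # Averaging keeps scores comparable across classes of different sizes.
        return total / len(self.samples)


def fit_per_class(train):
    """Build one scorer per class from a {label: [feature vectors]} mapping."""
    return {label: OneClassScorer(samples) for label, samples in train.items()}


def classify(models, x):
    """Assign x to the class whose one-class scorer gives the highest score."""
    return max(models, key=lambda label: models[label].score(x))


# Hypothetical 2-D feature vectors for three made-up classes of unequal size.
train = {
    "class_a": [(0.0, 0.1), (0.2, 0.0), (0.1, 0.2)],
    "class_b": [(3.0, 3.1), (2.9, 3.0)],
    "class_c": [(0.0, 4.0), (0.1, 3.8), (-0.1, 4.1), (0.0, 3.9)],
}
models = fit_per_class(train)
print(classify(models, (0.1, 0.1)))
```

Because each class is modeled only from its own examples, adding a class means fitting one more scorer without retraining the others, which is one commonly cited motivation for one-class approaches when class sizes are unbalanced.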