Abstract:
This paper evaluates the potential of convolutional neural networks in classifying short audio clips of environmental sounds. A deep model consisting of 2 convolutional layers with max-pooling and 2 fully connected layers is trained on a low-level representation of audio data (segmented spectrograms) with deltas. The accuracy of the network is evaluated on 3 public datasets of environmental and urban recordings. The model outperforms baseline implementations relying on mel-frequency cepstral coefficients and achieves results comparable to other state-of-the-art approaches.
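As an illustration of the input representation described above (segmented spectrograms with deltas), the following is a minimal sketch in plain Python. The abstract does not specify how the deltas are computed or how clips are segmented, so everything here is an assumption: the regression-based delta formula is a common convention rather than the paper's stated method, and the names `delta`, `segment`, `width`, `seg_len`, and `hop` are hypothetical.

```python
from typing import List, Tuple

# A spectrogram is stored as [frames][bands]; values are (log-)energies.
Spectrogram = List[List[float]]


def delta(spec: Spectrogram, width: int = 2) -> Spectrogram:
    """Regression-based delta along the time axis (assumed formula).

    d[t] = sum_{n=1..width} n * (c[t+n] - c[t-n]) / (2 * sum_{n} n^2),
    with edge frames repeated where t+n or t-n falls outside the clip.
    """
    n_frames = len(spec)
    n_bands = len(spec[0])
    denom = 2 * sum(n * n for n in range(1, width + 1))
    out: Spectrogram = []
    for t in range(n_frames):
        row = []
        for b in range(n_bands):
            acc = 0.0
            for n in range(1, width + 1):
                ahead = spec[min(t + n, n_frames - 1)][b]
                behind = spec[max(t - n, 0)][b]
                acc += n * (ahead - behind)
            row.append(acc / denom)
        out.append(row)
    return out


def segment(
    spec: Spectrogram, seg_len: int, hop: int
) -> List[Tuple[Spectrogram, Spectrogram]]:
    """Cut a clip into fixed-length segments, each a (static, delta) pair.

    The pair plays the role of a 2-channel network input. seg_len and hop
    are hypothetical parameters, not values taken from the paper.
    """
    d = delta(spec)
    segments = []
    for start in range(0, len(spec) - seg_len + 1, hop):
        stop = start + seg_len
        segments.append((spec[start:stop], d[start:stop]))
    return segments


# Toy clip: 10 frames, 3 bands, energy rising linearly with time.
toy_clip = [[float(t)] * 3 for t in range(10)]
toy_segments = segment(toy_clip, seg_len=4, hop=2)
```

Each element of `toy_segments` pairs a slice of the static spectrogram with the matching slice of its delta, which is one plausible way to form the two input channels that a convolutional layer would consume.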
Published in: 2015 IEEE 25th International Workshop on Machine Learning for Signal Processing (MLSP)
Date of Conference: 17-20 September 2015
Date Added to IEEE Xplore: 12 November 2015
Electronic ISBN: 978-1-4673-7454-5