Investigation of Different CNN-Based Models for Improved Bird Sound Classification


The flow diagram of our proposed approach. The value K denotes the number of selected CNN-based models for the fusion, where it is set to 2, 3, or 4 in this study.

Abstract:

Automatic bird sound classification plays an important role in monitoring and protecting biodiversity. Recent advances in acoustic sensor networks and deep learning techniques provide a novel way to monitor birds continuously. Previous studies have proposed various deep-learning-based frameworks for recognizing and classifying birds. In this study, we compare different classification models and selectively fuse them to further improve bird sound classification performance. Specifically, we construct the fused model not only from the same deep learning architecture fed with different inputs but also from two different deep learning architectures. Three types of time-frequency representations (TFRs) of bird sounds are investigated to characterize different acoustic components of bird sounds: the Mel-spectrogram, the harmonic-component-based spectrogram, and the percussive-component-based spectrogram. In addition to the different TFRs, a second deep learning architecture, SubSpectralNet, is employed to classify bird sounds. Experimental results on classifying 43 bird species show that fusing selected deep learning models effectively increases classification performance. Our best fused model achieves a balanced accuracy of 86.31% and a weighted F1-score of 93.31%.
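The abstract does not state how the selected models are fused, so the following is only a minimal sketch under an assumed rule: the per-class probabilities of K models are averaged (late fusion), and the result is scored with balanced accuracy (mean per-class recall) and support-weighted F1. All function names and the toy probabilities are hypothetical and are not taken from the paper.

```python
from collections import defaultdict

def fuse_probabilities(model_probs):
    """Average class probabilities over K models (assumed fusion rule).

    model_probs: list of K lists; each inner list holds one probability
    vector per audio clip, all with the same number of classes.
    """
    k = len(model_probs)
    n_clips = len(model_probs[0])
    fused = []
    for i in range(n_clips):
        vectors = [probs[i] for probs in model_probs]
        fused.append([sum(col) / k for col in zip(*vectors)])
    return fused

def predict(probs):
    """Pick the index of the most probable class for each clip."""
    return [max(range(len(p)), key=p.__getitem__) for p in probs]

def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall, so rare species count as much as common ones."""
    hits, totals = defaultdict(int), defaultdict(int)
    for t, p in zip(y_true, y_pred):
        totals[t] += 1
        hits[t] += int(t == p)
    return sum(hits[c] / totals[c] for c in totals) / len(totals)

def weighted_f1(y_true, y_pred):
    """Per-class F1 averaged with weights proportional to class support."""
    n = len(y_true)
    score = 0.0
    for c in set(y_true):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        score += f1 * (tp + fn) / n
    return score

# Toy data (hypothetical): 4 clips, 3 species, K = 2 models.
y_true = [0, 0, 1, 2]
model_a = [[0.6, 0.3, 0.1], [0.4, 0.5, 0.1], [0.2, 0.7, 0.1], [0.5, 0.2, 0.3]]
model_b = [[0.7, 0.2, 0.1], [0.5, 0.3, 0.2], [0.3, 0.6, 0.1], [0.1, 0.2, 0.7]]

y_fused = predict(fuse_probabilities([model_a, model_b]))
print("model A alone:", balanced_accuracy(y_true, predict(model_a)))
print("fused:", balanced_accuracy(y_true, y_fused), weighted_f1(y_true, y_fused))
```

Balanced accuracy and weighted F1 can differ noticeably on imbalanced data, which may explain why the two reported figures (86.31% and 93.31%) are not the same: the first treats every species equally, while the second is dominated by species with many recordings.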
Published in: IEEE Access (Volume: 7)
Page(s): 175353 - 175361
Date of Publication: 04 December 2019
Electronic ISSN: 2169-3536