In this paper, we present new probabilistic models for identifying bird species from audio recordings. We introduce the independent syllable model and consider two ways of aggregating frame-level features within a syllable. We characterize each syllable as a probability distribution over its frame-level features. The independent frame independent syllable (IFIS) model allows us to distinguish syllables whose feature distributions differ from one another. The Markov chain frame independent syllable (MCFIS) model is introduced for scenarios where the temporal structure within the syllable provides a significant amount of discriminative information. We derive the Bayes risk minimizing classifier for each model and show that it can be approximated as a nearest neighbour classifier. Our experiments indicate that the IFIS and MCFIS models achieve correct classification rates of 88.26% and 90.61%, respectively, while the equivalent SVM implementation achieves 86.15%.
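To make the two syllable representations concrete, the following is a minimal sketch and not the implementation evaluated in the paper. It assumes that frame-level features have already been quantized into integer codes, that distributions are Laplace-smoothed, and that the nearest neighbour search uses the Kullback-Leibler divergence as its dissimilarity; the abstract does not state any of these choices. All function names (`frame_histogram`, `transition_matrix`, `kl`, `nearest_neighbour`) are hypothetical.

```python
import numpy as np

def frame_histogram(codes, n_bins, alpha=1.0):
    """IFIS-style summary: smoothed distribution of frame codes, order ignored."""
    counts = np.bincount(codes, minlength=n_bins).astype(float) + alpha
    return counts / counts.sum()

def transition_matrix(codes, n_bins, alpha=1.0):
    """MCFIS-style summary: smoothed first-order transition probabilities."""
    counts = np.full((n_bins, n_bins), alpha)
    # Count each consecutive pair (previous frame code, next frame code).
    np.add.at(counts, (codes[:-1], codes[1:]), 1.0)
    return counts / counts.sum(axis=1, keepdims=True)

def kl(p, q):
    """KL divergence; for matrices this is the unweighted sum of row-wise KLs
    (an assumption made for this sketch)."""
    return float(np.sum(p * np.log(p / q)))

def nearest_neighbour(query, train_models, train_labels):
    """Return the label of the training syllable closest to the query."""
    dists = [kl(query, model) for model in train_models]
    return train_labels[int(np.argmin(dists))]

# Toy syllables with identical code histograms but opposite temporal order.
rising = np.array([0, 0, 1, 1, 2, 2, 3, 3])
falling = np.array([3, 3, 2, 2, 1, 1, 0, 0])
query = np.array([0, 1, 1, 2, 3, 3])

n_bins = 4
labels = ["rising", "falling"]
ifis_models = [frame_histogram(s, n_bins) for s in (rising, falling)]
mcfis_models = [transition_matrix(s, n_bins) for s in (rising, falling)]

mcfis_label = nearest_neighbour(transition_matrix(query, n_bins), mcfis_models, labels)
print(mcfis_label)
```

In this toy case the two training syllables have the same frame histogram, so an order-free summary cannot separate them, while the transition-based summary can. This mirrors the motivation given above for using the MCFIS model when temporal structure within the syllable is informative.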