We address the problem of audio analytics, specifically the efficient modeling of audio classes and the continuous decoding of an audio stream to automatically segment and label it, as required in audio indexing. We propose left-to-right HMMs to model definite-duration audio classes and ergodic HMMs to model indefinite-duration audio classes, together with Viterbi decoding using these HMMs with non-emitting states for continuous decoding of audio streams. We quantify decoding performance using detection and false-alarm rates and show that the proposed HMM-based modeling and Viterbi decoding achieve an average (%Hit, %False-alarm) of (79.2%, 1.6%), significantly better than VQ-, GMM- and template-based decoding, indicating the viability of the proposed modeling and decoding technique for practical surveillance audio analytics.
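As a rough illustration of the decoding idea, the sketch below runs log-domain Viterbi decoding over a small composite HMM and collapses the resulting state path into labeled segments. It is not the authors' implementation. The state layout (one self-looping "background" state standing in for an ergodic model, followed by a two-state left-to-right "event" model), the discrete observation symbols, and all probabilities are hypothetical values chosen only for the example. For brevity, the class models are linked by direct transitions rather than by non-emitting states.

```python
import math

NEG_INF = float("-inf")

def log(p):
    """Natural log that maps zero probability to -inf instead of raising."""
    return math.log(p) if p > 0 else NEG_INF

def viterbi(obs, start, trans, emit):
    """Return the most likely state path for a discrete-observation HMM."""
    n = len(start)
    delta = [log(start[s]) + log(emit[s][obs[0]]) for s in range(n)]
    back = []  # back[t][j] = best predecessor of state j at step t + 1
    for o in obs[1:]:
        prev, delta, ptr = delta, [], []
        for j in range(n):
            best = max(range(n), key=lambda i: prev[i] + log(trans[i][j]))
            ptr.append(best)
            delta.append(prev[best] + log(trans[best][j]) + log(emit[j][o]))
        back.append(ptr)
    # Trace back from the best final state.
    state = max(range(n), key=lambda s: delta[s])
    path = [state]
    for ptr in reversed(back):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path

def segments(path, labels):
    """Collapse a state path into (label, start_frame, end_frame) tuples."""
    out = []
    for t, s in enumerate(path):
        name = labels[s]
        if out and out[-1][0] == name:
            out[-1] = (name, out[-1][1], t + 1)
        else:
            out.append((name, t, t + 1))
    return out

# Hypothetical composite model (all values are assumptions for illustration).
# State 0: background, self-looping (indefinite duration).
# States 1 -> 2: event, left-to-right (definite duration), then back to 0.
LABELS = ["background", "event", "event"]
START = [1.0, 0.0, 0.0]
TRANS = [
    [0.9, 0.1, 0.0],  # background: stay, or enter the event model
    [0.0, 0.5, 0.5],  # event state 1: stay, or advance
    [0.5, 0.0, 0.5],  # event state 2: leave to background, or stay
]
EMIT = [
    [0.8, 0.1, 0.1],  # each state favors one of three quantized symbols
    [0.1, 0.8, 0.1],
    [0.1, 0.1, 0.8],
]

obs = [0, 0, 1, 1, 2, 2, 0, 0]  # quantized feature frames (made up)
path = viterbi(obs, START, TRANS, EMIT)
print(segments(path, LABELS))
```

The zero entries in `TRANS` are what enforce the left-to-right ordering inside the event model, while the background state can persist for any number of frames. A real system would use continuous feature vectors and trained emission densities in place of the toy symbol table.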
Date of Conference: 22-27 May 2011