In this paper, a new feature set based on sinusoidal modeling of audio signals is presented and evaluated. The duration of the longest frequency track in the sinusoidal model, taken as a measure of harmony, is used as input to an audio classifier and compared with typical features. The performance of this sinusoidal model feature is evaluated by classifying audio into speech and music using both Gaussian mixture model (GMM) and support vector machine (SVM) classifiers. The classification results show that the proposed feature, which is used here, possibly for the first time, in this kind of audio classification, is quite successful in speech/music classification. Experimental comparisons with popular audio classification features, such as the high zero-crossing rate ratio (HZCRR) and the low short-time energy ratio (LSTER), are presented and discussed. Using a set of three features extracted from 1-second segments of the signal, we achieved 94.32% accuracy in the audio classification.
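To make the features named above concrete, the following is a minimal sketch of how they might be computed for one segment, not the implementation used in this work. The HZCRR and LSTER thresholds (1.5 times the mean zero-crossing rate, 0.5 times the mean short-time energy) follow their commonly cited definitions and are assumed here. The track-length function uses a simple greedy peak continuation with a hypothetical tolerance `max_jump_hz`; the abstract does not specify the tracking rule, the frame length, or how spectral peaks are picked, so all of those are illustrative assumptions.

```python
def frame_signal(samples, frame_len):
    """Split samples into non-overlapping frames, dropping a trailing partial frame."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, frame_len)]


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


def short_time_energy(frame):
    """Mean squared amplitude of one frame."""
    return sum(x * x for x in frame) / len(frame)


def hzcrr(frames):
    """Share of frames whose ZCR exceeds 1.5x the segment mean (assumed threshold)."""
    zcr = [zero_crossing_rate(f) for f in frames]
    mean = sum(zcr) / len(zcr)
    return sum(1 for z in zcr if z > 1.5 * mean) / len(zcr)


def lster(frames):
    """Share of frames whose energy is below 0.5x the segment mean (assumed threshold)."""
    ste = [short_time_energy(f) for f in frames]
    mean = sum(ste) / len(ste)
    return sum(1 for e in ste if e < 0.5 * mean) / len(ste)


def longest_track_frames(peaks_per_frame, max_jump_hz=20.0):
    """Length, in frames, of the longest frequency track.

    peaks_per_frame holds one list of peak frequencies (Hz) per analysis frame.
    A track continues to the nearest unclaimed peak in the next frame if that
    peak lies within max_jump_hz (hypothetical tolerance); otherwise it ends.
    """
    active = []   # (last frequency, length so far) for each live track
    longest = 0
    for peaks in peaks_per_frame:
        claimed = set()
        next_active = []
        for freq, length in active:
            best = None
            for i, p in enumerate(peaks):
                if i in claimed or abs(p - freq) > max_jump_hz:
                    continue
                if best is None or abs(p - freq) < abs(peaks[best] - freq):
                    best = i
            if best is not None:
                claimed.add(best)
                next_active.append((peaks[best], length + 1))
        # Every unclaimed peak starts a new track of length 1.
        for i, p in enumerate(peaks):
            if i not in claimed:
                next_active.append((p, 1))
        active = next_active
        for _, length in active:
            longest = max(longest, length)
    return longest
```

For a 1-second segment, one would frame the samples with `frame_signal`, pass the frames to `hzcrr` and `lster`, and pass per-frame spectral peak frequencies (obtained from any sinusoidal analysis front end) to `longest_track_frames`. Multiplying the returned frame count by the frame hop in seconds gives a duration. The resulting three numbers could then form the feature vector for a GMM or SVM classifier.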