The paper investigates the use of acoustic-based features for music information retrieval. Two specific problems are studied: similarity search (searching for music sound files similar to a given music sound file) and emotion detection (detecting emotion in music sounds). The Daubechies wavelet coefficient histograms (Li, T. et al., SIGIR'03, p.282-9, 2003), which consist of moments of the coefficients calculated by applying the Db8 wavelet filter, are combined with the timbral features extracted using the MARSYAS system of G. Tzanetakis and P. Cook (see IEEE Trans. on Speech and Audio Process., vol.10, no.5, p.293-8, 2002) to generate compact music features. For similarity search, the distance between two sound files is defined as the Euclidean distance between their normalized feature representations, and the sound files closest to an input sound file under this measure are returned. Experiments on jazz vocal and classical sound files achieve a very high level of accuracy. Emotion detection is cast as a multiclass classification problem, decomposed into multiple binary classification problems, and solved using support vector machines trained on the extracted features. Our experiments on emotion detection achieved reasonably accurate performance and provided some insights for future work.
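The similarity search described above can be illustrated with a minimal sketch. The abstract does not say how the representations are normalized, so per-dimension z-scoring is an assumption here, as are the function names `normalize` and `closest` and the toy three-dimensional feature vectors (stand-ins for the combined wavelet histogram and timbral features, which are not computed in this example).

```python
import math

def normalize(vectors):
    """Z-score each feature dimension across the collection.

    Assumption: the abstract only says 'normalized'; z-scoring is one
    plausible choice, not necessarily the one used in the paper.
    """
    n = len(vectors)
    dims = len(vectors[0])
    means = [sum(v[d] for v in vectors) / n for d in range(dims)]
    stds = []
    for d in range(dims):
        var = sum((v[d] - means[d]) ** 2 for v in vectors) / n
        stds.append(math.sqrt(var) or 1.0)  # avoid dividing by zero on constant dimensions
    return [[(v[d] - means[d]) / stds[d] for d in range(dims)] for v in vectors]

def closest(features, query_index, k=3):
    """Return indices of the k files nearest to the query by Euclidean distance."""
    normed = normalize(features)
    query = normed[query_index]
    dists = [
        (math.dist(query, vec), i)
        for i, vec in enumerate(normed)
        if i != query_index  # exclude the query file itself
    ]
    return [i for _, i in sorted(dists)[:k]]

# Hypothetical feature vectors: two loose groups of two files each.
features = [
    [0.10, 200.0, 3.0],
    [0.12, 210.0, 3.1],
    [0.90, 800.0, 9.0],
    [0.88, 790.0, 8.7],
]
nearest = closest(features, query_index=0, k=2)
```

Normalizing before measuring distance keeps a feature with a large numeric range (the second column in the toy data) from dominating the comparison.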