Motivated by the rich, psycho-physiologically grounded properties of auditory cortical representations and the power of sparse representation-based classifiers, we propose a robust music genre classification framework. Its first pillar is a novel multilinear subspace analysis method that reduces the dimensionality of cortical representations of music signals while preserving their topology. Its second pillar is sparse representation-based classification, which models any test cortical representation as a sparse weighted sum of dictionary atoms. The atoms stem from training cortical representations of known genre, under the assumption that the representations of music recordings of the same genre lie close to one another in the tensor space. Accordingly, the dimensionality reduction is performed in a manner compatible with the working principle of sparse representation-based classification. Music genre classification accuracies of 93.7% and 94.93% are reported on the GTZAN and the ISMIR2004 Genre datasets, respectively. Both accuracies exceed any previously reported for state-of-the-art music genre classification algorithms applied to these datasets.
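The sparse representation-based classification step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name a sparse solver, so a plain greedy orthogonal matching pursuit is swapped in here, and the function names `omp` and `src_classify`, the parameter `n_nonzero`, and the synthetic toy data are all assumptions made for illustration. Ordinary vectors stand in for the dimensionality-reduced cortical representations.

```python
import numpy as np


def omp(D, y, n_nonzero):
    """Greedy orthogonal matching pursuit (an assumed stand-in solver).

    Approximates y as a combination of at most n_nonzero columns of D.
    """
    residual = y.copy()
    support = []
    sol = np.zeros(0)
    for _ in range(n_nonzero):
        # Pick the unused atom most correlated with the current residual.
        correlations = np.abs(D.T @ residual)
        if support:
            correlations[support] = 0.0
        support.append(int(np.argmax(correlations)))
        # Refit the coefficients of all selected atoms by least squares.
        sol, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        residual = y - D[:, support] @ sol
        if np.linalg.norm(residual) < 1e-10:
            break
    coef = np.zeros(D.shape[1])
    coef[support] = sol
    return coef


def src_classify(D, labels, y, n_nonzero=5):
    """Return the label whose atoms reconstruct y with the smallest residual."""
    D = D / np.linalg.norm(D, axis=0, keepdims=True)  # unit-norm atoms
    coef = omp(D, y, n_nonzero)
    best_label, best_residual = None, np.inf
    for c in np.unique(labels):
        mask = labels == c
        # Keep only the coefficients that belong to class c.
        r = np.linalg.norm(y - D[:, mask] @ coef[mask])
        if r < best_residual:
            best_label, best_residual = c, r
    return best_label


# Synthetic toy data: two "genres" occupying different coordinate subspaces.
rng = np.random.default_rng(0)
atoms_a = np.vstack([rng.normal(size=(3, 4)), np.zeros((3, 4))])
atoms_b = np.vstack([np.zeros((3, 4)), rng.normal(size=(3, 4))])
dictionary = np.hstack([atoms_a, atoms_b])
genre_labels = np.array([0] * 4 + [1] * 4)

test_sample = 0.7 * atoms_a[:, 0] + 0.3 * atoms_a[:, 1]
predicted = src_classify(dictionary, genre_labels, test_sample)
```

The class-wise residual comparison reflects the assumption stated in the abstract: if recordings of the same genre lie close together, a test sample should be reconstructed best by atoms of its own genre.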
Date of Conference: 14-19 March 2010