Skip to Main Content
This paper explores the automatic classification of audio tracks into musical genres. Our goal is to achieve human-level accuracy with fast training and classification. This goal is achieved with radial basis function (RBF) networks by using a combination of unsupervised and supervised initialization methods. These initialization methods yield classifiers that are as accurate as RBF networks trained with gradient descent (which is hundreds of times slower). In addition, feature subset selection further reduces training and classification time while preserving classification accuracy. Combined, our methods succeed in creating an RBF network that matches the musical classification accuracy of humans. The general algorithmic contribution of this paper is to show experimentally that RBF networks initialized with a combination of methods can yield good classification performance without relying on gradient descent. The simplicity and computational efficiency of our initialization methods produce classifiers that are fast to train as well as fast to apply to novel data. We also present an improved method for initializing the k-means clustering algorithm, which is useful for both unsupervised and supervised initialization methods.