Skip to Main Content
Polyphonic music transcription is a fundamental problem in computer music and over the last decade many sophisticated and application-specific methods have been proposed for its solution. However, most techniques cannot make fully use of all the available training data efficiently and do not scale well beyond a certain size. In this study, we develop an approach based on matrix factorization that can easily handle very large training corpora encountered in real applications. We evaluate and compare four different techniques that are based on randomized approaches to SVD and CUR decompositions. We demonstrate that by only retaining the relevant parts of the training data via matrix skeletonization based on CUR decomposition, we maintain comparable transcription performance with only 2% of the training data. The method seems to compete with the state-of-the-art techniques in the literature. Furthermore, it is very efficient in terms of time and space complexities, can work even in real time without compromising the success rate.