In this paper we present a transcription method for polyphonic music. The short-time Fourier transform is first used to decompose an acoustic signal into sonic partials in a time-frequency representation. In general, the segmented partials exhibit distinguishable features if they originate from different "voices" in the polyphonic mix. We define feature vectors and use a max-margin classification algorithm to produce classification labels that serve as grouping cues, i.e., that decide which partials should be assigned to each voice. These classification labels are then used to make statistically optimal grouping decisions, and a confidence level is assigned to each decision. The classification algorithm shows promising results for musical source separation.
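To make the classification step concrete, the following is a minimal sketch of a linear max-margin classifier that assigns partials to one of two voices. It is an illustration under stated assumptions, not the method of the paper: the two-dimensional feature vectors are invented toy values, the hinge-loss subgradient training loop is one common way to fit a max-margin classifier, and the names `train_max_margin`, `decision_value`, and `assign_voice` are hypothetical. The size of the decision value is used here only as a rough stand-in for a confidence level.

```python
# Toy sketch: a linear max-margin classifier (hinge loss plus L2 penalty)
# trained by per-sample subgradient steps. All feature values are invented
# for illustration and are not taken from the paper.

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def train_max_margin(features, labels, lam=0.01, lr=0.1, epochs=500):
    """Approximately minimise lam/2 * ||w||^2 + hinge loss."""
    w = [0.0] * len(features[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(features, labels):
            # A sample violates the margin if it is misclassified or lies
            # closer than 1 to the separating hyperplane.
            violated = y * (dot(w, x) + b) < 1.0
            # Gradient of the L2 penalty shrinks w slightly on every step.
            w = [wj * (1.0 - lr * lam) for wj in w]
            if violated:
                # Hinge-loss subgradient step moves the hyperplane away
                # from the violating sample.
                w = [wj + lr * y * xj for wj, xj in zip(w, x)]
                b += lr * y
    return w, b

def decision_value(w, b, x):
    """Signed score: the sign gives the voice, the size a rough confidence."""
    return dot(w, x) + b

def assign_voice(w, b, x):
    return 1 if decision_value(w, b, x) >= 0.0 else -1

# Hypothetical standardised feature vectors, one per segmented partial.
partials = [[-1.0, 0.2], [-1.5, -0.3], [-0.8, 0.1],
            [1.0, -0.2], [1.4, 0.3], [0.9, 0.0]]
voices = [-1, -1, -1, 1, 1, 1]  # voice label of each training partial

w, b = train_max_margin(partials, voices)
predicted = [assign_voice(w, b, x) for x in partials]
```

In this sketch a partial far from the separating hyperplane receives a larger absolute decision value than one near it, which is the intuition behind attaching a confidence to each grouping decision. A real system would need features extracted from the time-frequency partials and, for more than two voices, a multi-class extension.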