A multiple fundamental frequency estimator is a key building block in music transcription and indexing tasks. However, systems performing this task tend to be very complex, since music transcription requires an analysis that accounts for both physical and psycho-acoustic phenomena. In this work, we propose a physically motivated audio signal analysis followed by an auditory-based selection. The audio signal model allows for a better time/frequency resolution tradeoff, while the auditory distance discards redundant or non-relevant information. No prior information on the musical instrument, musical genre, or maximum polyphony is needed. Simulations show that the proposed technique achieves good transcription results for a variety of string and wind instruments. The proposed scheme is also shown to be robust in the presence of noise and percussive sounds, and under unbalanced signal-to-interference ratio (SIR) conditions.
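The abstract does not detail the proposed estimator, so as a point of reference only, the following is a minimal sketch of a classical baseline for the same task: iterative harmonic-sum salience over an FFT magnitude spectrum, with spectral subtraction of each detected note. This is not the paper's physically motivated/auditory method; all function names, parameter values, and thresholds below are illustrative assumptions.

```python
import numpy as np

def multi_f0_estimate(signal, sr, max_voices=3, f0_range=(80.0, 1000.0),
                      n_harmonics=5, threshold=0.1):
    """Toy multiple-F0 estimator (illustrative baseline, not the paper's method).

    Iteratively picks the candidate F0 whose harmonic comb collects the most
    spectral magnitude, then zeroes the explained partials and repeats.
    """
    n = len(signal)
    # Hann-windowed magnitude spectrum
    spec = np.abs(np.fft.rfft(signal * np.hanning(n)))
    total = spec.sum()  # reference energy for the stopping threshold
    candidates = np.arange(f0_range[0], f0_range[1], 1.0)  # 1 Hz grid (assumed)
    found = []
    for _ in range(max_voices):
        # harmonic-sum salience of each candidate F0
        salience = np.zeros(len(candidates))
        for i, f0 in enumerate(candidates):
            for h in range(1, n_harmonics + 1):
                b = int(round(h * f0 * n / sr))
                if b < len(spec):
                    salience[i] += spec[b]
        best = int(np.argmax(salience))
        if salience[best] < threshold * total:
            break  # nothing salient enough left: stop (implicit polyphony limit)
        f0 = candidates[best]
        found.append(f0)
        # spectral subtraction: zero the bins explained by this note
        for h in range(1, n_harmonics + 1):
            b = int(round(h * f0 * n / sr))
            spec[max(b - 2, 0):min(b + 3, len(spec))] = 0.0
    return sorted(found)
```

Such greedy harmonic-sum schemes illustrate the difficulties the abstract alludes to: shared partials between notes, octave ambiguities, and the fixed time/frequency resolution of a single FFT, which the paper's signal model and auditory-based selection are designed to address.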