Multipitch estimation techniques are widely used for music transcription and for the acquisition of musical data from digital signals. In this paper, we propose a flexible harmonic temporal timbre model that decomposes the spectral energy of a signal in the time-frequency domain into individual pitched notes. Each note is modeled with a two-dimensional Gaussian mixture. Unlike previous approaches, the proposed model can represent not only the harmonic partials but also the inharmonic attack of each note. We derive an Expectation-Maximization (EM) algorithm to estimate the parameters of this model, and we show that the proposed algorithm outperforms the NMF and HTC algorithms on the task of multipitch estimation on both synthetic and real-world data.
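As a rough illustration of the kind of fitting described above, the following sketch runs a generic energy-weighted EM for a two-dimensional Gaussian mixture on a synthetic time-frequency grid. It is not the proposed harmonic temporal timbre model: it imposes no harmonic structure and has no attack component. The function names (`gaussian_pdf`, `weighted_em`), the 40-by-40 grid, the two synthetic energy blobs, and the initial values are all assumptions made for the example.

```python
import numpy as np

def gaussian_pdf(points, mean, cov):
    """Density of a 2-D Gaussian evaluated at each row of `points`."""
    diff = points - mean
    inv = np.linalg.inv(cov)
    norm = 1.0 / (2.0 * np.pi * np.sqrt(np.linalg.det(cov)))
    expo = -0.5 * np.einsum("ni,ij,nj->n", diff, inv, diff)
    return norm * np.exp(expo)

def weighted_em(points, energy, init_means, n_iter=50, init_var=25.0, reg=1e-6):
    """Fit a 2-D Gaussian mixture to bins `points` weighted by `energy`.

    Hypothetical helper: each time-frequency bin counts in proportion
    to its spectral energy instead of being a unit-weight sample.
    """
    means = np.array(init_means, dtype=float)
    k = len(means)
    covs = np.array([init_var * np.eye(2) for _ in range(k)])
    mix = np.full(k, 1.0 / k)
    total = energy.sum()
    for _ in range(n_iter):
        # E-step: posterior probability of each component for each bin.
        dens = np.stack(
            [mix[j] * gaussian_pdf(points, means[j], covs[j]) for j in range(k)],
            axis=1,
        )
        resp = dens / np.maximum(dens.sum(axis=1, keepdims=True), 1e-300)
        # M-step: energy-weighted updates of weights, means, covariances.
        wr = resp * energy[:, None]
        mass = np.maximum(wr.sum(axis=0), 1e-12)
        mix = mass / total
        for j in range(k):
            means[j] = wr[:, j] @ points / mass[j]
            diff = points - means[j]
            covs[j] = (wr[:, j, None] * diff).T @ diff / mass[j] + reg * np.eye(2)
    return mix, means, covs

# Synthetic "spectrogram": two energy blobs on a (time, frequency) grid.
t, f = np.meshgrid(np.arange(40), np.arange(40), indexing="ij")
points = np.column_stack([t.ravel(), f.ravel()]).astype(float)
blob_a = np.exp(-0.5 * ((points - [10.0, 10.0]) ** 2).sum(axis=1) / 3.0**2)
blob_b = np.exp(-0.5 * ((points - [30.0, 25.0]) ** 2).sum(axis=1) / 2.0**2)
energy = blob_a + 0.5 * blob_b

mix, means, covs = weighted_em(points, energy, [[8.0, 12.0], [28.0, 22.0]])
print("mixture weights:", mix)
print("component means:", means)
```

In this sketch the spectral energy acts as a per-bin weight in the M-step, which is one common way to fit a mixture to a non-negative time-frequency distribution. A note model of the kind proposed here would additionally tie the component means to harmonic frequencies and add components for the attack.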