This paper describes a novel approach to audio stream segregation of a multi-pitch music signal. We propose a parameter-constrained time-frequency spectrum model that uses Gaussian kernels to express both the harmonic spectral structure and the temporal curve of the power envelope. Maximum a posteriori (MAP) estimation of the model parameters using the EM algorithm provides the fundamental frequency, onset and offset times, spectral envelope, and power envelope of every underlying audio stream. The proposed method achieved high accuracy in a pitch-name estimation task on several pieces of real music performance data.
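To make the model description above concrete, the following is a minimal sketch of how one audio stream could be rendered from such parameters. It rests on assumptions not stated in the abstract: the stream is taken to be separable into a frequency part (Gaussians at harmonic positions on a log-frequency axis) and a time part (equally spaced Gaussians starting at the onset). The function `stream_model` and all of its parameter names are hypothetical, and the EM/MAP estimation step is not shown.

```python
import numpy as np

def gaussian(x, mean, std):
    """Normalized Gaussian density evaluated at x."""
    return np.exp(-0.5 * ((x - mean) / std) ** 2) / (std * np.sqrt(2.0 * np.pi))

def stream_model(log_freq, time, f0_hz, onset, kernel_spacing,
                 harmonic_weights, envelope_weights,
                 freq_std=0.02, total_power=1.0):
    """Illustrative time-frequency model of a single stream (an assumption,
    not the paper's exact formulation).

    Returns an array of shape (len(log_freq), len(time)).
    """
    log_freq = np.asarray(log_freq, dtype=float)
    time = np.asarray(time, dtype=float)
    hw = np.asarray(harmonic_weights, dtype=float)
    ew = np.asarray(envelope_weights, dtype=float)

    # Harmonic structure: the n-th partial sits at log(f0) + log(n).
    n = np.arange(1, len(hw) + 1)
    freq_centers = np.log(f0_hz) + np.log(n)
    freq_part = (hw[:, None]
                 * gaussian(log_freq[None, :], freq_centers[:, None], freq_std)
                 ).sum(axis=0)

    # Power envelope: kernels spaced by kernel_spacing, starting at the onset.
    # Assumption: the kernel width equals the kernel spacing.
    y = np.arange(len(ew))
    time_centers = onset + y * kernel_spacing
    time_part = (ew[:, None]
                 * gaussian(time[None, :], time_centers[:, None], kernel_spacing)
                 ).sum(axis=0)

    return total_power * np.outer(freq_part, time_part)

# Example with made-up parameter values: a 220 Hz tone starting at 0.5 s.
freqs_hz = np.linspace(100.0, 2000.0, 400)
times = np.linspace(0.0, 2.0, 200)
S = stream_model(np.log(freqs_hz), times, f0_hz=220.0, onset=0.5,
                 kernel_spacing=0.1,
                 harmonic_weights=[0.5, 0.3, 0.2],
                 envelope_weights=[0.4, 0.3, 0.2, 0.1])
```

In a sketch like this, a multi-pitch signal would be modeled as the sum of several such streams, and the parameters (fundamental frequency, onset, weights) would be the quantities fitted to an observed spectrogram.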
Published in:
Acoustics, Speech, and Signal Processing, 2005. Proceedings. (ICASSP '05). IEEE International Conference on
(Volume: 3)
Date of Conference: 18-23 March 2005