A key building block in music transcription and indexing operations is the decomposition of the music signal into notes. We model a note signal as a periodic signal with (slow) global variation of amplitude (reflecting attack, sustain, decay) and frequency (limited time warping). Voiced speech also admits such a representation. The bandlimited variation of the global amplitude and frequency is expressed through a subsampled representation and parameterization of the corresponding signals. The periodic signal is assumed to arrive at a set of sensors with different amplitudes and delays. Assuming additive white Gaussian noise, a maximum likelihood approach is proposed for the estimation of the model parameters, and the optimization is performed in an iterative (cyclic) fashion that leads to a sequence of simple least-squares problems. Particular attention is paid to the estimation of the basic periodic signal, which can have a non-integer period. Simulation results reveal that the proposed approach allows such signals to be extracted accurately from an underdetermined mixture of several such signals, using iterated successive interference cancellation.
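The cyclic least-squares idea described above can be illustrated with a heavily simplified sketch. It is not the authors' estimator: it assumes a single sensor, an integer period, no frequency warping, and a slowly varying amplitude envelope parameterized by a few knots with linear interpolation. The names `interp_matrix`, `cyclic_ls`, `n_knots`, and `n_iter` are hypothetical and chosen only for illustration. Under white Gaussian noise, maximizing the likelihood amounts to minimizing the squared residual, and each half-step below is an ordinary least-squares problem.

```python
import numpy as np

def interp_matrix(n_samples, n_knots):
    """Matrix that linearly interpolates n_knots (>= 2) values onto n_samples points.

    Assumption: linear interpolation stands in for the bandlimited
    (subsampled) envelope representation mentioned in the abstract.
    """
    pos = np.linspace(0, n_knots - 1, n_samples)
    left = np.minimum(pos.astype(int), n_knots - 2)
    frac = pos - left
    B = np.zeros((n_samples, n_knots))
    rows = np.arange(n_samples)
    B[rows, left] = 1.0 - frac
    B[rows, left + 1] = frac
    return B

def cyclic_ls(y, period, n_knots, n_iter=20):
    """Alternate two least-squares steps for y[n] ~ amp[n] * wave[n mod period].

    Simplifications (not from the source): one sensor, integer period,
    no time warping. Only the product amp * wave is identifiable.
    """
    n = len(y)
    B = interp_matrix(n, n_knots)
    phase = np.arange(n) % period
    amp = np.ones(n)  # start from a flat envelope
    costs = []
    for _ in range(n_iter):
        # Step 1: with the envelope fixed, each waveform sample has a
        # closed-form least-squares solution over the samples sharing its phase.
        num = np.bincount(phase, weights=amp * y, minlength=period)
        den = np.bincount(phase, weights=amp ** 2, minlength=period)
        wave = num / np.maximum(den, 1e-12)
        # Step 2: with the waveform fixed, the envelope knots enter linearly,
        # so they solve a small linear least-squares problem.
        A = wave[phase][:, None] * B
        alpha, *_ = np.linalg.lstsq(A, y, rcond=None)
        amp = B @ alpha
        costs.append(float(np.sum((y - amp * wave[phase]) ** 2)))
    return wave, amp, costs
```

Because each step exactly minimizes the same squared-error cost over one block of parameters, the recorded cost cannot increase from one iteration to the next, which is the property that makes this kind of cyclic scheme attractive. Handling a non-integer period, multiple sensors with delays, and interference cancellation between several sources would require additional machinery not shown here.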
Published in:
Statistical Signal Processing, 2005 IEEE/SP 13th Workshop on
Date of Conference: 17-20 July 2005