Audio processing applications that use short-time signal analysis typically rely on single- or multi-resolution analyses with fixed window durations. However, real-world signal conditions such as polyphony and non-stationarity, which in music content analysis appear as musical accompaniment and pitch modulation, respectively, require different data window lengths for reliable processing. In this paper, we investigate the use of signal sparsity for adapting analysis window lengths. Adaptive-window analysis driven by different measures of sparsity applied to the local spectrum, such as kurtosis and the Gini index, is evaluated and shown to be superior to fixed-window analysis in terms of sinusoid detection and frequency estimation for simulated and real signals. A window main-lobe matching method for sinusoid detection is also shown to be more robust than other methods to signal conditions such as polyphony and frequency modulation.
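As an illustration of the sparsity-driven idea described above, the following is a minimal sketch and not the procedure evaluated in the paper. It assumes a small set of candidate window lengths, a Hann window, and a common zero-padded FFT size so that sparsity scores are comparable across lengths; the function names (`gini_index`, `kurtosis`, `pick_window_length`) and all parameter values are hypothetical. For each candidate length it computes the local magnitude spectrum and keeps the length whose spectrum is sparsest.

```python
import numpy as np


def gini_index(magnitudes):
    """Gini index of a non-negative vector: 0 when flat, close to 1 when sparse."""
    c = np.sort(np.abs(np.asarray(magnitudes, dtype=float)))  # ascending order
    n = c.size
    total = c.sum()
    if total == 0.0:
        return 0.0
    k = np.arange(1, n + 1)
    return 1.0 - 2.0 * np.sum((c / total) * ((n - k + 0.5) / n))


def kurtosis(magnitudes):
    """Fourth standardized moment of the magnitudes; larger means more peaked."""
    c = np.abs(np.asarray(magnitudes, dtype=float))
    d = c - c.mean()
    var = np.mean(d ** 2)
    if var == 0.0:
        return 0.0
    return np.mean(d ** 4) / var ** 2


def pick_window_length(signal, center, lengths=(256, 512, 1024),
                       nfft=4096, measure=gini_index):
    """Return the candidate length whose local spectrum is sparsest.

    The candidate lengths, nfft, and the Hann window are illustrative
    assumptions, not values taken from the paper.
    """
    signal = np.asarray(signal, dtype=float)
    best_length, best_score = None, -np.inf
    for length in lengths:
        start = center - length // 2
        if start < 0 or start + length > signal.size:
            continue  # skip windows that do not fit inside the signal
        frame = signal[start:start + length] * np.hanning(length)
        spectrum = np.abs(np.fft.rfft(frame, n=nfft))  # zero-padded to nfft
        score = measure(spectrum)
        if score > best_score:
            best_length, best_score = length, score
    return best_length
```

For a stationary sinusoid, longer windows concentrate energy in a narrower main lobe, so this sketch tends to select the longest candidate; under strong frequency modulation a shorter window may score higher. Passing `measure=kurtosis` swaps in the other sparsity measure named in the abstract.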
Published in: Audio, Speech, and Language Processing, IEEE Transactions on (Volume: 20, Issue: 1)
Date of Publication: Jan. 2012