Similarity matrices have become an important tool in music audio analysis. However, their quadratic time and space complexity, as well as the intricacy of extracting the desired structural information from them, are often prohibitive for real-world applications. In this paper, we describe an approach for enhancing the structural properties of similarity matrices based on two concepts. First, we introduce a new class of robust and scalable audio features which absorb local temporal variations. Second, we incorporate contextual information into the local similarity measure. The resulting enhancement leads to a significant reduction in matrix size and also eases the structure extraction step. As an example, we sketch the application of our techniques to the problems of audio summarization and audio synchronization, obtaining effective and computationally feasible algorithms.
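The two concepts can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the moving-average smoothing, the cosine similarity, the diagonal averaging, and all function and parameter names (`smooth_and_downsample`, `contextual_similarity`, `window`, `hop`, `length`) are assumptions chosen for illustration. The sketch only shows how coarser features shrink the quadratic matrix and how averaging along diagonals brings temporal context into each similarity value.

```python
import numpy as np


def smooth_and_downsample(features, window, hop):
    """Average each feature dimension over `window` frames, keep every
    `hop`-th frame, and normalize each remaining frame to unit length.

    `features` has shape (num_frames, num_dims). This is a generic
    stand-in for features that absorb local temporal variations.
    """
    kernel = np.ones(window) / window
    smoothed = np.apply_along_axis(
        lambda column: np.convolve(column, kernel, mode="valid"), 0, features
    )
    coarse = smoothed[::hop]
    norms = np.linalg.norm(coarse, axis=1, keepdims=True)
    return coarse / np.maximum(norms, 1e-12)


def similarity_matrix(features):
    """Cosine self-similarity matrix of unit-length feature frames."""
    return features @ features.T


def contextual_similarity(S, length):
    """Average S along diagonals over `length` consecutive frames.

    Entry (i, j) of the result is the mean of S[i + k, j + k] for
    k = 0, ..., length - 1, so each value compares two short
    subsequences instead of two single frames.
    """
    n = S.shape[0] - length + 1
    out = np.zeros((n, n))
    for k in range(length):
        out += S[k:k + n, k:k + n]
    return out / length


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    frames = rng.random((200, 12))      # synthetic stand-in for audio features
    coarse = smooth_and_downsample(frames, window=10, hop=5)
    S = similarity_matrix(coarse)
    S_ctx = contextual_similarity(S, length=4)
    print(frames.shape[0] ** 2, "->", S.size, "matrix entries")
    print("contextual matrix shape:", S_ctx.shape)
```

In this sketch a downsampling factor of `hop` reduces the number of matrix entries by roughly `hop` squared, which is the kind of size reduction the abstract refers to.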
Published in: Acoustics, Speech and Signal Processing, 2006. ICASSP 2006 Proceedings. 2006 IEEE International Conference on (Volume: 5)
Date of Conference: 14-19 May 2006