This paper presents a new method for fast similarity-based search through long unlabeled audio streams to detect and locate audio clips provided by users. The method reduces the feature dimension by building a piecewise linear representation of the sequential feature trajectory extracted from a long audio stream. Two techniques produce this representation: dynamic segmentation of the feature trajectory and a segment-based Karhunen-Loeve (KL) transform. In principle, the proposed search method guarantees the same search results as a search performed without the feature-dimension reduction. Experimental results indicate significant improvements in search speed. For example, the proposed method reduced the total search time to approximately 1/12 that of previous methods and detected queries in approximately 0.3 s from a 200-h audio database.
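To make the idea of a segment-based KL transform concrete, the following is a minimal sketch and not the paper's algorithm. It assumes NumPy, Euclidean distance, and segment boundaries supplied by the caller (the paper chooses them by dynamic segmentation, which is not reproduced here). The function names `fit_segment_bases`, `reduced_distance`, and `search` are hypothetical. The sketch illustrates one common way a reduced-dimension search can return the same results as a full search: because each segment basis is orthonormal, the projected distance never exceeds the full distance, so it can be used as a cheap filter before an exact check.

```python
import numpy as np

def fit_segment_bases(features, boundaries, n_components):
    """Fit one KL (PCA) basis per segment of a (frames x dims) trajectory.

    `boundaries` is an assumed, caller-supplied list of frame indices;
    the paper derives its segments dynamically instead.
    """
    models = []
    for start, end in zip(boundaries[:-1], boundaries[1:]):
        segment = features[start:end]
        mean = segment.mean(axis=0)
        # Rows of vt are orthonormal principal directions of this segment.
        _, _, vt = np.linalg.svd(segment - mean, full_matrices=False)
        models.append((start, end, mean, vt[:n_components]))
    return models

def reduced_distance(query, frame, mean, basis):
    """Euclidean distance after projecting both vectors onto the segment basis."""
    return np.linalg.norm((query - mean) @ basis.T - (frame - mean) @ basis.T)

def search(features, models, query, threshold):
    """Return indices of frames within `threshold` of `query`."""
    hits = []
    for start, end, mean, basis in models:
        for i in range(start, end):
            # Cheap filter: the projected distance is a lower bound on the
            # full distance, so skipping here cannot discard a true match.
            if reduced_distance(query, features[i], mean, basis) > threshold:
                continue
            # Exact check in the original feature space.
            if np.linalg.norm(query - features[i]) <= threshold:
                hits.append(i)
    return hits
```

In a real system the projected features would be computed once and stored, so that the filter step touches only the low-dimensional data; the sketch recomputes them for brevity.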
IEEE Transactions on Audio, Speech, and Language Processing (Volume: 16, Issue: 2)
Date of Publication: Feb. 2008