In this paper, we investigate the problem of automatically detecting and tracking a specified person's singing portions within a music recording that contains multiple simultaneous or non-simultaneous singers. This problem matters for multimedia applications that require transcription and indexing of music data, in response to the increasing demand for content-based information retrieval. The major challenge arises from the fact that the singers' voices are inextricably intertwined with the background accompaniment. To determine whether and when an accompanied voice is present and belongs to the target singer, methods are presented for separating vocal from non-vocal regions, for extracting and modeling singers' vocal characteristics, and for distinguishing vocal regions performed by the target singer from those performed by other simultaneous or non-simultaneous singers.