Skip to Main Content
We propose a music segment detection method for audio signals. Unlike many existing methods, ours specifically focuses on a background-music detection task, that is, detecting music used in background of main sounds. This task is important because music is almost always overlapped by speech or other environmental sounds in visual materials such as TV programs. Our method consists of feature extraction, dimension reduction, and statistical discrimination steps. For each step, we analyzed a set of methods to maximize the detection accuracy. With a simple post processing step, we achieved a framewise error rate as low as 8 % even when the mixed speech was louder than the target music by 10dB.