Skip to Main Content
This paper describes a music information retrieval system that uses humming as the key for retrieval. Humming is an easy way for a user to input a melody. However, there are several problems with humming that degrade the retrieval of information. One problem is the human factor. Sometimes, people do not sing accurately, especially if they are inexperienced or unaccompanied. Another problem arises from signal processing. Therefore, a music information retrieval method should be sufficiently robust to surmount various humming errors and signal processing problems. A retrieval system has to extract the pitch from the user's humming. However, pitch extraction is not perfect. It often captures half or double pitches, which are harmonic frequencies of the true pitch, even if the extraction algorithms take the continuity of the pitch into account. Considering these problems, we propose a system that takes multiple pitch candidates into account. In addition to the frequencies of the pitch candidates, the confidence measures obtained from their powers are taken into consideration as well. We also propose the use of an algorithm with three dimensions that is an extension of the conventional Dynamic Programming (DP)algorithm, so that multiple pitch candidates can be treated. Moreover, in the proposed algorithm, DP paths are changed dynamically to take deltaPitches and IOIratios (inter-onset-interval) of input and reference notes into account in order to treat notes being split or unified. We carried out an evaluation experiment to compare the proposed system with a conventional system . When using three-pitch candidates with conference measure and IOI features, the top-ten retrieval accuracy was 94.1%. Thus, the proposed method gave a better retrieval performance than the conventional system.
Date of Publication: June 2006