Singer Identification Based on Spoken Data in Voice Characterization

2 Author(s)
Tsai, W.-H.; Department of Electronic Engineering & Graduate Institute of Computer and Communication Engineering, National Taipei University of Technology; Lee, H.-C.

Existing singer identification (SID) methods follow the framework of speaker identification (SPID), which requires that singing data be collected beforehand to establish each singer's voice characteristics. This framework, however, is unsuitable for many SID applications, because acquiring solo a cappella recordings from each singer is usually not as feasible as collecting spoken data is in SPID applications. Since a cappella data are difficult to acquire, many studies have tried to improve SID accuracy when only accompanied singing data are available for training, but the improvements are not always satisfactory. Recognizing that spoken data are usually easy to obtain, this work investigates the possibility of characterizing singers' voices using their spoken data instead of their singing data. Unfortunately, our experiments found that spoken data cannot fully replace singing data in singer voice characterization, owing to the significant difference between the singing and speaking voices of most people. We therefore propose two alternative solutions that require only a small amount of singing data. The first solution adapts a speech-derived model so that it also covers singing voice characteristics. The second solution establishes the relationship between speech and singing via a transformation, so that an unknown test singing clip can be converted into its speech counterpart and then identified using speech-derived models; alternatively, the training data can be converted from speech into singing to generate a singer model capable of matching test singing clips. Our experiments, conducted on a 20-singer database, validate the proposed solutions.
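The first proposed solution, adapting a speech-derived voice model toward singing characteristics, can be illustrated with a minimal sketch. The abstract does not specify the model family, so this example assumes a single diagonal-covariance Gaussian per singer (a simplified stand-in for the Gaussian mixture models commonly used in speaker/singer ID) and a MAP-style mean update weighted by the amount of singing data available; all function names and the relevance factor are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def train_model(features):
    # Speech-derived voice model: per-dimension mean and variance of the
    # feature vectors (e.g., MFCC frames). Diagonal covariance only.
    return features.mean(axis=0), features.var(axis=0) + 1e-6

def map_adapt(speech_model, singing_feats, relevance=16.0):
    # MAP-style mean adaptation: shift the speech-derived mean toward the
    # (scarce) singing data. With little singing data, alpha is small and
    # the model stays close to the speech statistics.
    mu, var = speech_model
    n = len(singing_feats)
    alpha = n / (n + relevance)
    mu_adapted = alpha * singing_feats.mean(axis=0) + (1.0 - alpha) * mu
    return mu_adapted, var

def log_likelihood(model, feats):
    # Total log-likelihood of a clip's frames under a diagonal Gaussian.
    mu, var = model
    ll = -0.5 * (np.log(2.0 * np.pi * var) + (feats - mu) ** 2 / var)
    return ll.sum()

def identify(models, test_feats):
    # Pick the singer whose adapted model best explains the test clip.
    return max(models, key=lambda s: log_likelihood(models[s], test_feats))
```

A real system would operate on cepstral features extracted from audio and adapt full mixture models; the sketch only shows how a small amount of singing data can pull a speech-derived model toward the singing voice before scoring test clips.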

Published in:

IEEE Transactions on Audio, Speech, and Language Processing (Volume: 20, Issue: 8)
Biometrics Compendium, IEEE