This paper presents a statistical method for speaker feature extraction and voice conversion within the sinusoidal + noise (S+N) modeling framework. Through fundamental research on the speaker characteristics embedded in the parameter sets of the S+N model, we found that the vector sets of statistical eigenvoice (SEV) and weighted statistical eigenvoice (wSEV), which are basis vectors of a GMM representation, have two significant properties: they are approximately speaker-dependent and language-independent. Using the SEV and wSEV feature vectors, we propose a new algorithm for context-free voice conversion. Subjective tests suggest that the SEV-based method achieves convincing results while maintaining high synthesis quality compared with traditional LPC approaches.