In our voice conversion model, we represent the joint probabilistic acoustic space of the source and target speakers with a mixture of probabilistic principal component analyzers (PPCAs). This gives the user of the voice conversion system a finer-grained set of options than traditional Gaussian mixture model (GMM) based conversion. Objective experiments demonstrate that reducing the dimension of the PPCA directly affects objective performance but lowers both time and memory complexity. Subjective tests suggest that the incremental removal of information does not perceptibly affect the listener. Thus, the end user has more freedom to choose the trade-off between conversion quality and computational cost.
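To illustrate the memory side of this trade-off, the following minimal sketch counts the free covariance parameters of one mixture component. It assumes the standard PPCA formulation, in which each component covariance has the form W Wᵀ + σ²I with a d × q loading matrix W. The function names and the joint feature dimension d = 48 are hypothetical choices for illustration, not values taken from the paper.

```python
def full_cov_params(d: int) -> int:
    """Free parameters in one full symmetric d x d covariance matrix."""
    return d * (d + 1) // 2


def ppca_cov_params(d: int, q: int) -> int:
    """Free parameters in one PPCA covariance W W^T + sigma^2 I.

    W has shape d x q and sigma^2 is a single scalar. The q(q-1)/2 term
    is subtracted because W is only identifiable up to a rotation.
    """
    if not 0 <= q < d:
        raise ValueError("q must satisfy 0 <= q < d")
    return d * q + 1 - q * (q - 1) // 2


if __name__ == "__main__":
    d = 48  # hypothetical joint source+target feature dimension
    print("full covariance:", full_cov_params(d))
    for q in (4, 8, 16, d - 1):
        print(f"PPCA with q={q}:", ppca_cov_params(d, q))
```

Under these assumptions, the count grows with the latent dimension q and reaches the full-covariance count at q = d - 1, so q acts as a dial between a very compact model and an unrestricted Gaussian component.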