This article explores the potential of the harmonics plus noise model (HNM) of speech for building a high-quality vocoder applicable in statistical frameworks, particularly in modern speech synthesizers. It explains the alternatives considered during the design of the HNM-based vocoder, reports the corresponding objective and subjective experiments, and describes the implementation in detail. Three aspects of the analysis have been investigated: refinement of the pitch estimation using quasi-harmonic analysis, study and comparison of several spectral envelope analysis procedures, and strategies to analyze and model the maximum voiced frequency. The performance of the resulting vocoder is shown to be similar to that of state-of-the-art vocoders in synthesis tasks.
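
To make the roles of the three analyzed quantities concrete, the following is a minimal sketch of how pitch, spectral envelope, and maximum voiced frequency could combine in a generic harmonics plus noise synthesis of one stationary frame. It is not the vocoder described in the article: the function name `synthesize_hnm_frame`, the `envelope` callable, the zero-phase harmonics, and the default sampling rate and frame length are all assumptions chosen for illustration.

```python
import numpy as np

def synthesize_hnm_frame(f0, mvf, envelope, fs=16000, n=400, seed=0):
    """Illustrative HNM frame: harmonics up to mvf, shaped noise above it.

    f0       : pitch in Hz (0 means unvoiced)
    mvf      : maximum voiced frequency in Hz
    envelope : hypothetical callable mapping frequency (Hz) to amplitude
    """
    t = np.arange(n) / fs

    # Harmonic part: sinusoids at multiples of f0, up to the maximum voiced
    # frequency, with amplitudes sampled from the envelope (zero phase assumed).
    n_harm = int(mvf // f0) if f0 > 0 else 0
    harmonic = np.zeros(n)
    for k in range(1, n_harm + 1):
        harmonic += envelope(k * f0) * np.cos(2 * np.pi * k * f0 * t)

    # Noise part: white noise shaped by the envelope in the frequency domain
    # and kept only above the maximum voiced frequency.
    rng = np.random.default_rng(seed)
    spectrum = np.fft.rfft(rng.standard_normal(n))
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    gain = np.where(freqs > mvf, envelope(freqs), 0.0)
    noise = np.fft.irfft(spectrum * gain, n=n)

    return harmonic, noise

# Toy envelope (an assumption): amplitude decays smoothly with frequency.
toy_envelope = lambda f: np.exp(-np.asarray(f, dtype=float) / 2000.0)

harmonic, noise = synthesize_hnm_frame(200.0, 4000.0, toy_envelope)
frame = harmonic + noise
```

In this sketch the maximum voiced frequency acts as a hard boundary between the two components; a practical vocoder would also handle time-varying parameters, harmonic phases, and overlap-add between frames.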