This paper proposes a new method to calculate the cepstral coefficients for an HMM-based synthesizer. It consists of a direct maximization of the log-likelihood function of a Gaussian mixture model using a gradient ascent algorithm. The method makes it possible to integrate the global variance factor efficiently with a Gaussian mixture acoustic model. Perceptual experiments confirmed that these two factors each produce significant improvements in speech quality, and that the improvements are independent of each other. With the proposed method, the benefits of both factors can be obtained together. This paper also proposes a 2-class model for the global variance that discriminates between consonants and vowels. This 2-class global variance model produces more stable cepstral coefficients than the single-class model.
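As a rough illustration of the core idea only, the sketch below runs gradient ascent on the log-likelihood of a diagonal-covariance Gaussian mixture with respect to a single observation vector. It is a toy under stated assumptions, not the paper's procedure: it leaves out the HMM state sequence, the dynamic features, and the global variance term, and every name in it (`gmm_log_likelihood`, `gmm_gradient`, `gradient_ascent`, the step size, the iteration count) is hypothetical. The gradient used is the standard one for a mixture, $\nabla_x \log p(x) = \sum_k r_k(x)\,(\mu_k - x)/\sigma_k^2$, where $r_k(x)$ is the posterior weight of component $k$.

```python
import math


def _component_logs(x, weights, means, variances):
    """Log of (weight * diagonal Gaussian density) for each component."""
    logs = []
    for w, mean, var in zip(weights, means, variances):
        log_density = sum(
            -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
            for xi, m, v in zip(x, mean, var)
        )
        logs.append(math.log(w) + log_density)
    return logs


def gmm_log_likelihood(x, weights, means, variances):
    """Log-likelihood of x under the mixture, computed with log-sum-exp."""
    logs = _component_logs(x, weights, means, variances)
    top = max(logs)
    return top + math.log(sum(math.exp(v - top) for v in logs))


def gmm_gradient(x, weights, means, variances):
    """Gradient of the mixture log-likelihood with respect to x."""
    logs = _component_logs(x, weights, means, variances)
    top = max(logs)
    unnorm = [math.exp(v - top) for v in logs]
    total = sum(unnorm)
    grad = [0.0] * len(x)
    for u, mean, var in zip(unnorm, means, variances):
        resp = u / total  # posterior weight of this component at x
        for d in range(len(x)):
            grad[d] += resp * (mean[d] - x[d]) / var[d]
    return grad


def gradient_ascent(x0, weights, means, variances, step=0.1, iters=200):
    """Fixed-step gradient ascent (assumed settings, chosen for the toy)."""
    x = list(x0)
    for _ in range(iters):
        g = gmm_gradient(x, weights, means, variances)
        x = [xi + step * gi for xi, gi in zip(x, g)]
    return x


# Toy mixture: two well-separated components in two dimensions.
weights = [0.5, 0.5]
means = [[0.0, 0.0], [4.0, 4.0]]
variances = [[1.0, 1.0], [1.0, 1.0]]

start = [3.0, 3.5]
result = gradient_ascent(start, weights, means, variances)
```

Starting near the second component, the iterate climbs toward the local maximum close to that component's mean. Gradient ascent only finds a local maximum, so in this toy the starting point decides which mode is reached.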