Skip to Main Content
Harmonic sinusoidal representations of speech have proven to be useful in many speech processing tasks. This work focuses on the phase spectra of the harmonics and provides a methodology to analyze and subsequently to model the statistics of the harmonic phases. To do so, we propose the use of a wrapped Gaussian mixture model (WGMM), a model suitable for random variables that belong to circular spaces, and provide an expectation-maximization algorithm for training. The WGMM is then used to construct a phase quantizer. The quantizer is employed in a prototype variable rate narrow-band VoIP sinusoidal codec that is equivalent to iLBC in terms of PESQ-MOS, at ~13 kbps.