By Topic

Low-complexity source coding using Gaussian mixture models, lattice vector quantization, and recursive coding with application to speech spectrum quantization

Sign In

Cookies must be enabled to login.After enabling cookies , please use refresh or reload or ctrl+f5 on the browser for the login options.

Formats Non-Member Member
$33 $13
Learn how you can qualify for the best price for this item!
Become an IEEE Member or Subscribe to
IEEE Xplore for exclusive pricing!
close button

puzzle piece

IEEE membership options for an individual and IEEE Xplore subscriptions for an organization offer the most affordable access to essential journal articles, conference papers, standards, eBooks, and eLearning courses.

Learn more about:

IEEE membership

IEEE Xplore subscriptions

3 Author(s)

In this paper, we use the Gaussian mixture model (GMM) based multidimensional companding quantization framework to develop two important quantization schemes. In the first scheme, the scalar quantization in the companding framework is replaced by more efficient lattice vector quantization. Low-complexity lattice pruning and quantization schemes are provided for the E8 Gossett lattice. At moderate to high bit rates, the proposed scheme recovers much of the space-filling loss due to the product vector quantizers (PVQ) employed in earlier work, and thereby, provides improved performance with a marginal increase in complexity. In the second scheme, we generalize the compression framework to accommodate recursive coding. In this approach, the joint probability density function (PDF) of the parameter vectors of successive source frames is modeled using a GMM. The conditional density of the parameter vector of the current source frame based on the quantized values of the parameter vector of the previous source frames is used to generate a new codebook for every current source frame. We demonstrate the efficacy of the proposed schemes in the application of speech spectrum quantization. The proposed scheme is shown to provide superior performance with moderate increase in complexity when compared with conventional one-step linear prediction based compression schemes for both narrow-band and wide-band speech.

Published in:

IEEE Transactions on Audio, Speech, and Language Processing  (Volume:14 ,  Issue: 2 )