Cart (Loading....) | Create Account
Close category search window

A parallel implementation of Viterbi training for acoustic models using graphics processing units

Sign In

Cookies must be enabled to login.After enabling cookies , please use refresh or reload or ctrl+f5 on the browser for the login options.

Formats Non-Member Member
$31 $13
Learn how you can qualify for the best price for this item!
Become an IEEE Member or Subscribe to
IEEE Xplore for exclusive pricing!
close button

puzzle piece

IEEE membership options for an individual and IEEE Xplore subscriptions for an organization offer the most affordable access to essential journal articles, conference papers, standards, eBooks, and eLearning courses.

Learn more about:

IEEE membership

IEEE Xplore subscriptions

3 Author(s)
Buthpitiya, S. ; Dept. of Electr. & Comput. Eng., Carnegie Mellon Univ., Pittsburgh, PA, USA ; Lane, I. ; Chong, J.

Robust and accurate speech recognition systems can only be realized with adequately trained acoustic models. For common languages, state-of-the-art systems are trained on many thousands of hours of speech data and even with large clusters of machines the entire training process can take many weeks. To overcome this development bottleneck, we propose a parallel implementation of Viterbi training optimized for training Hidden-Markov-Model (HMM)-based acoustic models using highly parallel graphics processing units (GPUs). In this paper, we introduce Viterbi training, illustrate its application concurrency characteristics, data working set sizes, and describe the optimizations required for effective throughput on GPU processors. We demonstrate that the acoustic model training process is well-suited for GPUs. Using a single NVIDIA GTX580 GPU our proposed approach is shown to be 94.8× faster than a sequential CPU implementation, enabling a moderately sized acoustic model to be trained on 1000 hours of speech data in under 7 hours. Moreover, we show that our implementation on a two-GPU system can perform 3.3× faster than a standard parallel reference implementation on a high-end 32-core Xeon server at 1/15th the cost. Our GPU-based training platform empowers research groups to rapidly evaluate new ideas and build accurate and robust acoustic models on very large training corpora at nominal cost.

Published in:

Innovative Parallel Computing (InPar), 2012

Date of Conference:

13-14 May 2012

Need Help?

IEEE Advancing Technology for Humanity About IEEE Xplore | Contact | Help | Terms of Use | Nondiscrimination Policy | Site Map | Privacy & Opting Out of Cookies

A not-for-profit organization, IEEE is the world's largest professional association for the advancement of technology.
© Copyright 2014 IEEE - All rights reserved. Use of this web site signifies your agreement to the terms and conditions.