Trends and advances in speech recognition

7 Author(s)
Picheny, M. (IBM Research Division, Thomas J. Watson Research Center, Yorktown Heights, NY, USA); Nahamoo, D.; Goel, V.; Kingsbury, B.

One of the earliest successful applications of machine-learning techniques to pattern recognition was the application of information-theoretic principles to speech recognition. Previous approaches relied heavily on expert input through the painstaking analysis of data to relate speech signals to the word sequences that produced them. Such methodologies were completely displaced by casting the speech recognition problem in a probabilistic framework, that is, by modeling the joint probability distribution of speech signals and word sequences. Since the beginning of the 21st century, the amount of data and computation available to train and build models has increased exponentially, and the emergence of new machine-learning algorithms and methodologies has opened new vistas in approaching complex pattern recognition problems. This progress is enabled by a new set of machine-learning techniques referred to as graphical models, which have computationally tractable training algorithms. Closely related are neural-network modeling techniques, and there has been a resurgence of interest in the application of neural-network concepts, such as deep networks, to speech recognition. The explosion of data has driven the development of new ways to capture the key features in massive amounts of data using efficient methods that deploy exemplar-based sparse representations. Lastly, all of these different approaches can be tied together in a principled fashion using another variation of graphical models: an exponential model framework. This paper describes the current state of the art in speech recognition systems and highlights the developments that are expected to produce major breakthroughs in our ability to automatically recognize speech using computers.

Note: The Institute of Electrical and Electronics Engineers, Incorporated is distributing this Article with permission of the International Business Machines Corporation (IBM) who is the exclusive owner. The recipient of this Article may not assign, sublicense, lease, rent or otherwise transfer, reproduce, prepare derivative works, publicly display or perform, or distribute the Article.  

Published in:

IBM Journal of Research and Development (Volume: 55, Issue: 5)