Statistical Machine Translation for Speech: A Perspective on Structures, Learning, and Decoding

1 Author(s)
Bowen Zhou ; IBM T. J. Watson Res. Center, New York, NY, USA

In this paper, we survey and analyze state-of-the-art statistical machine translation (SMT) techniques for speech translation (ST). We review key learning problems and investigate essential model structures in SMT, taking a unified perspective that reveals both connections and contrasts between automatic speech recognition (ASR) and SMT. We show that phrase-based SMT can be viewed as a sequence of finite-state transducer (FST) operations, similar in spirit to ASR. We further inspect the synchronous context-free grammar (SCFG)-based formalism, which includes hierarchical phrase-based models and many linguistically motivated syntax-based models. Decoding for ASR, FST-based, and SCFG-based translation is also presented from a unified perspective, as different realizations of the generic Viterbi algorithm on graphs or hypergraphs. These consolidated perspectives help catalyze tighter integration for improved ST, and we discuss joint decoding and modeling toward coupling ASR and SMT.
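The claim that graph and hypergraph decoding are realizations of one generic Viterbi algorithm can be illustrated with a minimal sketch. It is not taken from the paper: the names `Hyperedge` and `viterbi`, the log-domain max-sum scoring, and the toy scores are all assumptions chosen for illustration. An ordinary lattice edge is treated as a hyperedge with exactly one tail node, so the same routine covers both cases.

```python
import math
from collections import namedtuple

# Hypothetical structure: a hyperedge joins a tuple of tail nodes to one head.
# A plain graph edge is the special case with a single tail.
Hyperedge = namedtuple("Hyperedge", ["head", "tails", "logprob"])

def viterbi(nodes, edges, axioms):
    """Return the best log-score and back-pointer for each reachable node.

    nodes  : node ids in topological order (tails before heads)
    edges  : list of Hyperedge
    axioms : dict mapping source/leaf nodes to their initial log-scores
    """
    incoming = {}
    for e in edges:
        incoming.setdefault(e.head, []).append(e)

    best = dict(axioms)   # best log-score found so far per node
    back = {}             # hyperedge that produced the best score
    for v in nodes:
        for e in incoming.get(v, []):
            if any(t not in best for t in e.tails):
                continue  # some tail is unreachable, skip this hyperedge
            score = e.logprob + sum(best[t] for t in e.tails)
            if score > best.get(v, -math.inf):
                best[v] = score
                back[v] = e
    return best, back

# Graph case (lattice-style, as in ASR or FST decoding): one tail per edge.
graph_edges = [
    Hyperedge("a", ("s",), -1.0),
    Hyperedge("b", ("s",), -2.0),
    Hyperedge("t", ("a",), -3.0),
    Hyperedge("t", ("b",), -1.0),
]
g_best, g_back = viterbi(["s", "a", "b", "t"], graph_edges, {"s": 0.0})

# Hypergraph case (SCFG-style): a rule may combine several sub-derivations.
hyper_edges = [
    Hyperedge("X", ("x1",), -1.0),
    Hyperedge("Y", ("x2",), -2.0),
    Hyperedge("S", ("X", "Y"), -0.5),
    Hyperedge("S", ("x1", "x2"), -4.0),
]
h_best, h_back = viterbi(
    ["x1", "x2", "X", "Y", "S"], hyper_edges, {"x1": 0.0, "x2": 0.0}
)
```

The only difference between the two calls is the number of tails per edge. The update rule, taking the maximum over incoming edges of the edge score plus the summed tail scores, is the same in both.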

Published in:

Proceedings of the IEEE (Volume: 101, Issue: 5)