Abstract:
We describe the Arabic broadcast transcription system fielded by IBM in the GALE Phase 4 machine translation evaluation. Key advances over our Phase 3.5 system include improvements to context-dependent modeling in vowelized Arabic acoustic models; the use of neural-network features provided by the International Computer Science Institute; Model M language models; a neural network language model that uses syntactic and morphological features; and improvements to our system combination strategy. These advances were instrumental in achieving a word error rate of 8.9% on the Phase 4 evaluation set, and an absolute improvement of 1.6% word error rate over our 2008 system on the unsequestered Phase 3.5 evaluation data.
Published in: 2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Date of Conference: 22-27 May 2011
Date Added to IEEE Xplore: 11 July 2011
- IEEE Keywords
- Index Terms
- Transcription System
- Speech Transcription
- Neural Network
- Morphological Features
- Neural Model
- Use Of Features
- Rate Set
- Language Model
- Machine Translation
- Syntactic Features
- Acoustic Model
- Neural Network Features
- Word Error Rate
- Neural Language Models
- Training Data
- Decision Tree
- Frame Rate
- Linear Interpolation
- Multilayer Perceptron
- Exponential Model
- Parse Tree
- Audio Transcripts
- Subset Of The Training Data
- Small Improvement
- Similar Improvements
- Hourly Data