System combination is a technique that has been shown to yield significant gains in speech recognition and machine translation. Most combination schemes align the outputs of different systems to produce lattices (or confusion networks), from which a composite hypothesis is chosen, possibly with the help of a large language model. The benefit of this approach is twofold: (i) whenever many systems agree on a set of words, the combination output contains these words with high confidence; and (ii) whenever the systems disagree, the language model resolves the ambiguity based on the (probably correct) agreed-upon context. System combination for machine translation is more challenging because the translations can differ in word order: the alignment has to incorporate computationally expensive movements of word blocks. In this paper, we show how translation outputs can be combined efficiently, extending the incremental alignment procedure of A.-V.I. Rosti et al. (2008). We compare different system combination design choices on an Arabic speech translation task.
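To make the agreement effect in (i) concrete, the following is a minimal sketch of voting over a confusion network. It is not the incremental alignment procedure of the paper: it assumes the hypotheses have already been aligned into slots of equal length, weights every system equally, and uses no language model. The names `EPS`, `confusion_network`, and `consensus`, as well as the example sentences, are hypothetical and chosen only for illustration.

```python
from collections import Counter

EPS = ""  # hypothetical marker for an empty (epsilon) arc in a slot


def confusion_network(aligned):
    """Build one Counter of word votes per slot from pre-aligned hypotheses."""
    if len({len(h) for h in aligned}) != 1:
        raise ValueError("aligned hypotheses must all have the same length")
    # zip(*aligned) groups the i-th word of every hypothesis into one slot.
    return [Counter(slot) for slot in zip(*aligned)]


def consensus(network):
    """Pick the most-voted word in each slot, then drop slots won by epsilon."""
    best = [slot.most_common(1)[0][0] for slot in network]
    return [word for word in best if word != EPS]


# Three hypothetical system outputs, already aligned slot by slot.
aligned = [
    ["the", "cat", "sat", EPS],
    ["the", "cat", "sits", "down"],
    ["a", "cat", "sat", "down"],
]

network = confusion_network(aligned)
result = consensus(network)
print(" ".join(result))
```

In a slot where all systems agree ("cat"), the vote is unanimous; in a slot where they disagree ("sat" versus "sits"), this sketch falls back on a simple majority, which is the decision a language model would refine in a full system.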