Cross-language portability of a spoken language understanding (SLU) system concerns the possibility of reusing, with moderate effort, knowledge and data acquired for one language in a new language. The approach proposed in this paper is motivated by the availability of the fairly large MEDIA corpus, which is carefully transcribed in French and semantically annotated in terms of constituents. A method is proposed in which a portion of the training set is manually translated and used to train a machine translation (MT) system, which then translates the remaining data. As the source-language corpus is annotated in terms of concept tags, a solution is presented for automatically transferring these tags to the translated corpus. Experimental results are presented on translation accuracy, expressed as BLEU score, as a function of the size of the training corpus. It is shown that the process leads to comparable concept error rates in the two languages, making the proposed approach suitable for SLU portability across languages.
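The abstract does not state how the concept error rate is computed. As a minimal sketch, assuming the common definition (substitutions, deletions and insertions from a Levenshtein alignment of the hypothesized concept sequence against the reference, divided by the number of reference concepts), it could be calculated as below. The function name and the concept tag names are hypothetical and chosen only for illustration.

```python
def concept_error_rate(reference, hypothesis):
    """Return (substitutions + deletions + insertions) / len(reference).

    Assumed definition: edit distance between two sequences of concept
    tags, normalized by the reference length. Not taken from the paper.
    """
    n, m = len(reference), len(hypothesis)
    if n == 0:
        raise ValueError("reference must contain at least one concept")

    # prev[j] = edit distance between reference[:i-1] and hypothesis[:j]
    prev = list(range(m + 1))
    for i in range(1, n + 1):
        curr = [i] + [0] * m
        for j in range(1, m + 1):
            cost = 0 if reference[i - 1] == hypothesis[j - 1] else 1
            curr[j] = min(
                prev[j] + 1,         # deletion of a reference concept
                curr[j - 1] + 1,     # insertion of a spurious concept
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[m] / n


# Hypothetical concept tags, for illustration only.
ref = ["command", "hotel-name", "date"]
hyp = ["command", "date"]
rate = concept_error_rate(ref, hyp)  # one deletion over three reference concepts
```

Under this assumed definition, the rate can exceed 1.0 when the hypothesis contains many inserted concepts, since the denominator counts reference concepts only.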
Date of Conference: 14-19 March 2010