Abstract:
This paper describes a morphological analysis of the Kazakh language for Kazakh-English statistical machine translation that works by modifying Kazakh compound words, and explores the effect of using the modified input on translation quality with a large number of training sentences. The word alignment problem becomes more serious when translating from a morphologically rich language such as Kazakh into a morphologically simple one such as English, because word forms that occur in many different morphological variants cause data sparseness. We present our investigations of unsupervised Kazakh morphological segmentation over a newspaper corpus and compare unsupervised segmentation against rule-based language processing tools. Our experimental results show that the proposed method can improve word alignment and translation quality.
Published in: 2014 IEEE 8th International Conference on Application of Information and Communication Technologies (AICT)
Date of Conference: 15-17 October 2014
Date Added to IEEE Xplore: 09 February 2015