Abstract:
Machine translation systems have made significant progress in recent years, but the quality and availability of translation between less common languages remain limited. This paper presents the results of using NLLB, a modern multilingual machine translation model, to develop a high-quality Russian-to-Tatar translator. NLLB supports more than 200 languages and can serve as a basis for translation between many language pairs. We chose it because it is pre-trained on a large number of Turkic languages, including Tatar. We analyze existing systems and approaches used to train Russian-Tatar machine translation models, and show how various low-resource machine translation methods affect the Russian-Tatar pair. Experimental results show that the developed model achieves higher translation quality than openly available translators, which opens up new opportunities for integrating the Tatar language into the global communication network.
Published in: 2024 IEEE 3rd International Conference on Problems of Informatics, Electronics and Radio Engineering (PIERE)
Date of Conference: 15-17 November 2024
Date Added to IEEE Xplore: 25 December 2024