
Improved English to Hindi MT using Long Short-Term Memory


Abstract:

Language translation remains a field in which machines cannot yet compete with humans. Statistical Machine Translation (SMT) is one of the traditional approaches to the Machine Translation (MT) problem. It is best suited to grammatically organized language pairs with comparable syntax, and it requires a large amount of training data. In recent years, Neural Machine Translation (NMT) has emerged as a promising alternative for the same task. This study examines various NMT systems for the Indian language Hindi. Earlier NMT approaches struggle with longer sequences and fail to capture the relative importance of words. The proposed solution uses a Long Short-Term Memory (LSTM) mechanism, which deals with these problems efficiently. Eight NMT architectures are evaluated on English-to-Hindi translation, and the results are compared with those of more conventional MT methods. NMT needs only a small parallel corpus for training and handles tens of thousands of training examples with ease.
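The abstract credits LSTM with handling longer sequences. A minimal NumPy sketch of a single LSTM cell step (illustrative only, not the authors' implementation; all weight names and sizes here are toy assumptions) shows the gating that lets the additive cell state carry information across long sequences:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, W, U, b):
    """One LSTM time step.

    W: (4h, d) input weights, U: (4h, h) recurrent weights, b: (4h,) biases,
    stacked in the order [input, forget, output, candidate].
    """
    z = W @ x + U @ h_prev + b
    h = h_prev.shape[0]
    i = sigmoid(z[:h])          # input gate: how much new content to admit
    f = sigmoid(z[h:2 * h])     # forget gate: how much old state to keep
    o = sigmoid(z[2 * h:3 * h]) # output gate: how much state to expose
    g = np.tanh(z[3 * h:])      # candidate cell update
    c = f * c_prev + i * g      # additive cell-state path eases long-range flow
    h_new = o * np.tanh(c)      # hidden state passed to the next step/layer
    return h_new, c

# Run a toy 5-step sequence through one cell (random weights, for shape only).
rng = np.random.default_rng(0)
d, hdim = 3, 4
W = rng.normal(size=(4 * hdim, d)) * 0.1
U = rng.normal(size=(4 * hdim, hdim)) * 0.1
b = np.zeros(4 * hdim)
h, c = np.zeros(hdim), np.zeros(hdim)
for t in range(5):
    h, c = lstm_step(rng.normal(size=d), h, c, W, U, b)
print(h.shape)  # (4,)
```

Because the cell state is updated additively (`f * c_prev + i * g`) rather than repeatedly squashed through a nonlinearity, gradients decay far more slowly than in a plain RNN, which is the property the paper relies on for long input sentences.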
Date of Conference: 25-27 May 2022
Date Added to IEEE Xplore: 08 June 2022
Conference Location: Madurai, India

I. Introduction

Scientists were first set the challenge of language translation decades ago, and work on it has persisted for more than forty years. The current state of MT is the result of years of collaboration between linguists and computer scientists. MT initially relied on dictionary-matching approaches, which rule-based systems later replaced. In recent years, SMT has been widely adopted in most Machine Translation Systems (MTS). Words, phrases, and sentences are the fundamental translation units in these systems [1], [2], [3]. Most classic translation systems employ Bayesian inference to predict the likelihood that a pair of phrases are translations of each other: one phrase is in the source language, the other in the target language. Because the frequency of individual words is exceedingly low, it is extremely difficult to pair them and predict the correct translation. Increasing the data set size was one of the most viable strategies for raising the estimated probability of a given pair of phrases [4]. As standard MTS are limited by their dependence on huge datasets, alternative MT techniques are needed.
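The word-pairing idea above can be made concrete with a toy relative-frequency estimate of translation probabilities from co-occurrence counts in an aligned corpus (the corpus and word pairs below are illustrative assumptions, not data from the paper; real SMT systems use far richer models such as IBM alignment models):

```python
from collections import Counter

# Tiny illustrative word-aligned parallel corpus (English -> Hindi).
parallel = [
    ("water", "paani"),
    ("water", "paani"),
    ("water", "jal"),
    ("book", "kitaab"),
]

pair_counts = Counter(parallel)                 # count(src, tgt)
src_counts = Counter(src for src, _ in parallel)  # count(src)

def p_target_given_source(tgt, src):
    """Maximum-likelihood estimate P(tgt | src) = count(src, tgt) / count(src)."""
    return pair_counts[(src, tgt)] / src_counts[src]

print(round(p_target_given_source("paani", "water"), 3))  # 0.667
```

The sparsity problem described above is visible even here: a source word seen only once yields a degenerate probability of 1.0 for its single observed translation, which is why SMT accuracy improves as the parallel corpus grows.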

