Abstract:
Although cross-lingual knowledge is widely used to learn word embeddings for various NLP tasks, the literature contains no comprehensive analysis of the available methodologies for the Manipuri-English language pair. Manipuri is a low-resource language spoken in the northeastern states of India. This study provides an extensive evaluation of two popular unsupervised approaches to inducing cross-lingual word embeddings, namely MUSE and Vecmap, on the bilingual dictionary induction task for this language pair. We find that Vecmap consistently outperforms MUSE. We also propose a novel model that enhances the embeddings by exploiting a temporally aligned comparable corpus. Experimental results indicate that the proposed model outperforms its counterparts.
Date of Conference: 11-13 December 2021
Date Added to IEEE Xplore: 19 January 2022