Manipuri-English Cross-lingual Word Embeddings using a Temporally Aligned Comparable Corpus | IEEE Conference Publication | IEEE Xplore

Abstract:

Although cross-lingual knowledge is widely used to learn word embeddings for various NLP tasks, the literature contains no comprehensive analysis of the available methodologies for the Manipuri-English language pair. Manipuri is a low-resource language spoken in the northeastern states of India. This study provides an extensive evaluation of two popular unsupervised approaches to inducing cross-lingual word embeddings, namely MUSE and VecMap, on the bilingual dictionary induction task for this language pair. We find that VecMap consistently outperforms MUSE. We also propose a novel model that enhances the embeddings by exploiting a temporally aligned comparable corpus. Experimental results show that the proposed model outperforms its counterparts.
Date of Conference: 11-13 December 2021
Date Added to IEEE Xplore: 19 January 2022
Conference Location: Singapore, Singapore
