Geez Part of Speech Tagging Using Deep Learning Approaches | IEEE Conference Publication | IEEE Xplore


Abstract:

Geez is an ancient language used as a liturgical language in Ethiopia and Eritrea, and it has also gained attention in academic circles in Ethiopia and abroad. Despite its historical significance, computational resources for natural language processing in Geez are lacking. To address this gap, this study develops a deep learning-based Geez part-of-speech (POS) tagger. POS tagging is the process of labeling each word in a text with its grammatical category. A manually annotated dataset of 4981 sentences, containing 30K words of which 11K are unique, is collected and used for training and evaluation. The dataset is preprocessed with tokenization, sequencing, and sequence padding. Two experiments are conducted using LSTM, BiLSTM, GRU, and BiGRU deep learning models. The results show that the BiLSTM model performs best, with an accuracy of 94.5% on the 70-15-15 split and 95.01% on the 80-10-10 split. These findings suggest that deep learning models can identify parts of speech in Geez text and can therefore support the development of natural language processing tools and resources for low-resource languages. Future studies may explore more sophisticated architectures and techniques to further improve accuracy on complex and diverse datasets.
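
The preprocessing and splitting steps named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code: the sample sentences and tags are placeholders rather than data from the paper, the input is assumed to be already tokenized into (word, tag) pairs, and the reserved indices, right-padding, and helper names (`build_index`, `encode_and_pad`, `split`) are assumptions made for illustration.

```python
import random

PAD = 0  # index reserved for padding positions (assumption)
OOV = 1  # index reserved for items missing from the index (assumption)


def build_index(sequences):
    """Sequencing, step 1: give each distinct item an integer id after PAD and OOV."""
    index = {}
    for seq in sequences:
        for item in seq:
            if item not in index:
                index[item] = len(index) + 2
    return index


def encode_and_pad(sequences, index, max_len):
    """Sequencing, step 2, plus padding: map items to ids, then right-pad or truncate."""
    out = []
    for seq in sequences:
        ids = [index.get(item, OOV) for item in seq][:max_len]
        out.append(ids + [PAD] * (max_len - len(ids)))
    return out


def split(items, train_pct=70, val_pct=15, seed=0):
    """Shuffle, then cut into train/validation/test; the remainder is the test set."""
    items = list(items)
    random.Random(seed).shuffle(items)
    n_train = len(items) * train_pct // 100
    n_val = len(items) * val_pct // 100
    return items[:n_train], items[n_train:n_train + n_val], items[n_train + n_val:]


# Placeholder tagged sentences (hypothetical, not from the paper's dataset).
tagged = [
    [("w1", "NOUN"), ("w2", "VERB")],
    [("w3", "ADJ"), ("w1", "NOUN"), ("w4", "VERB")],
    [("w2", "VERB")],
]
words = [[w for w, _ in sent] for sent in tagged]
tags = [[t for _, t in sent] for sent in tagged]

word_index = build_index(words)
tag_index = build_index(tags)
max_len = max(len(s) for s in words)

X = encode_and_pad(words, word_index, max_len)  # padded word-id sequences
y = encode_and_pad(tags, tag_index, max_len)    # padded tag-id sequences
```

Calling `split(pairs, 70, 15)` or `split(pairs, 80, 10)` on a list of (X, y) pairs would give partitions in the 70-15-15 and 80-10-10 proportions reported above. The padded integer sequences are the kind of input a sequence model such as a BiLSTM typically consumes through an embedding layer.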
Date of Conference: 26-28 October 2023
Date Added to IEEE Xplore: 06 November 2023
Conference Location: Bahir Dar, Ethiopia