BERT-Based Sentiment Analysis for Low-Resourced Languages: A Case Study of Urdu Language


The Overview of USA-BERT.

Abstract:

Sentiment analysis is important in research because it provides valuable insights into public opinion. However, most sentiment analysis studies focus on English, leaving a research gap for low-resourced or regional languages such as Persian, Pashto, and Urdu. Moreover, computational linguists face the challenge of developing lexical resources for these languages. In light of this, this paper presents USA-BERT, a deep learning-based approach for Urdu text sentiment analysis that leverages Bidirectional Encoder Representations from Transformers (BERT), and introduces the Urdu Dataset for Sentiment Analysis-23 (UDSA-23). USA-BERT first preprocesses the Urdu reviews using BERT-Tokenizer. Second, it creates BERT embeddings for each Urdu review. Third, given the BERT embeddings, it fine-tunes a deep learning classifier (BERT). Finally, it applies the Pareto principle to two datasets, the state-of-the-art UCSA-21 and UDSA-23, to assess USA-BERT. The assessment results show that USA-BERT significantly outperforms existing methods, improving accuracy and F-measure by up to 26.09% and 25.87%, respectively.
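
The evaluation step can be illustrated with a small sketch. It assumes that the Pareto principle refers to an 80/20 train/test split, which the abstract does not state explicitly, and it computes accuracy and F-measure for a binary labeling. The function names (`pareto_split`, `accuracy`, `f_measure`) and the toy data are hypothetical and are not taken from the paper; the tokenization, embedding, and fine-tuning steps are omitted.

```python
import random


def pareto_split(samples, train_fraction=0.8, seed=0):
    """Shuffle and split samples into train/test parts.

    Assumption: the 'Pareto principle' is read as an 80/20 split.
    """
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the gold labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def f_measure(y_true, y_pred, positive=1):
    """F1 score (harmonic mean of precision and recall) for one class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Toy (text, label) pairs standing in for labeled Urdu reviews.
    data = [(f"review {i}", i % 2) for i in range(10)]
    train, test = pareto_split(data)
    print(len(train), len(test))

    # Hypothetical gold labels and predictions for a held-out set.
    gold = [1, 0, 1, 1]
    pred = [1, 0, 0, 1]
    print(accuracy(gold, pred), f_measure(gold, pred))
```

In a fuller pipeline, the predictions would come from the fine-tuned classifier applied to the held-out 20% of each dataset; the fixed seed only keeps the split reproducible across runs.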
Published in: IEEE Access ( Volume: 11)
Page(s): 110245 - 110259
Date of Publication: 04 October 2023
Electronic ISSN: 2169-3536