Semi-supervised approach for Persian word sense disambiguation | IEEE Conference Publication | IEEE Xplore


Abstract:

Word-sense disambiguation is one of the key problems in natural language processing. The main goal of a language is to convey a specific concept to its audience, and that concept is derived from the meanings of the words used. A system must therefore be able to identify the role and meaning of words in order to identify the concepts in a text correctly. The task becomes harder when words take different meanings depending on their surrounding words. Because a range of practical applications has been developed for Persian, it is now vital to find a solution for word-sense disambiguation in that language. The lack of training data is the biggest challenge for Persian word-sense disambiguation. To address this problem, this research employs a machine learning approach with minimal supervision. The applied method disambiguates word senses by considering defined features of the target words and applying a collaborative learning method. A corpus extracted from news published by news agencies is used as the reference corpus. Evaluated on the available corpus for three selected ambiguous words, the implemented method was able to correctly identify the meaning in 5368 documents with 88% recall, 95% precision and 93% accuracy.
Date of Conference: 26-27 October 2017
Date Added to IEEE Xplore: 07 December 2017
Conference Location: Mashhad, Iran

I. Introduction

Word-sense disambiguation means determining the correct meaning of a word that appears ambiguous at first sight because it seems to belong to several semantic classes. In general, any word with more than one meaning can be treated as a target word whose sense is to be disambiguated. Humans can easily identify the right meaning of a word from the concept conveyed by the sentence, but a computer needs more information than the input sentence alone in order to identify the semantic class of an ambiguous word. By applying disambiguation methods, it is possible to extract the concepts of a sentence and pass them to the computer so that it can decide on the semantic class of the ambiguous word. Word-sense disambiguation has received considerable attention in widely used languages such as English, and a great deal of research has been carried out on disambiguating English words.
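
The idea of learning senses from surrounding words with minimal supervision can be illustrated with a small sketch. The code below is not the collaborative learning method used in this paper; it is a simplified self-training loop in the spirit of seed-based bootstrapping. The function name `bootstrap`, the seed words, and the toy contexts are all hypothetical, and an English ambiguous word ("bank") stands in for a Persian one. Each context is the list of words surrounding one occurrence of the target word.

```python
from collections import Counter

# Hypothetical seed cues: a few hand-picked context words per sense.
SEEDS = {"finance": {"money", "loan"}, "river": {"water", "shore"}}

# Hypothetical toy contexts: words around each occurrence of "bank".
CONTEXTS = [
    ["money", "deposit", "account"],
    ["water", "fishing", "boat"],
    ["deposit", "interest"],
    ["boat", "fishing", "trip"],
    ["weather", "report"],
]

def bootstrap(contexts, seeds, rounds=3, min_margin=1):
    """Label contexts by cue-word overlap, then grow the cue sets and repeat.

    Assumes at least two senses. A context is labeled only when the best
    sense beats the runner-up by at least `min_margin` matching cue words.
    """
    cues = {sense: set(words) for sense, words in seeds.items()}
    labels = {}
    for _ in range(rounds):
        changed = False
        for i, ctx in enumerate(contexts):
            if i in labels:
                continue
            scores = {s: len(cues[s] & set(ctx)) for s in cues}
            ranked = sorted(scores.items(), key=lambda kv: -kv[1])
            (best, best_n), (_, second_n) = ranked[0], ranked[1]
            if best_n - second_n >= min_margin:
                labels[i] = best
                changed = True
        # Count words seen in the contexts labeled so far, per sense.
        counts = {s: Counter() for s in cues}
        for i, s in labels.items():
            counts[s].update(set(contexts[i]))
        # A word becomes a new cue if it occurs with exactly one sense.
        for s in cues:
            for w in counts[s]:
                if all(counts[o][w] == 0 for o in cues if o != s):
                    cues[s].add(w)
        if not changed:
            break
    return labels, cues

labels, cues = bootstrap(CONTEXTS, SEEDS)
print(labels)
```

In this toy run, the first two contexts are labeled directly from the seeds, the third and fourth are labeled in the next round through the newly learned cues ("deposit", "boat", "fishing"), and the last context stays unlabeled because it shares no words with either sense. A real system would need stop-word filtering and a confidence measure more robust than a raw overlap count.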

