A Study of Parsing Process on Natural Language Processing in Bahasa Indonesia | IEEE Conference Publication | IEEE Xplore

Abstract:

Research on Natural Language Processing (NLP) for Indonesian is still limited, and few results from existing studies are available for reuse in further research. In an NLP pipeline, the initial step is parsing a sentence according to the grammar of its language, which helps in understanding the meaning of the sentence. This research aims to produce a simulation of an Indonesian parser by adapting the process used in the Collins algorithm. The three main stages are: 1) preprocessing, which generates the corpus and events files; 2) lexical analysis, which converts the corpus into tokens; and 3) syntax analysis, which builds the parse tree. Syntax analysis requires the events file: the probability of each grammar rule is calculated by counting its occurrence frequency in the events file, and these probabilities are used to determine the best sentence tree. The parser was evaluated on 30 simple sentences and was able to generate a corpus file, an events file, parse trees, and probability calculations. Nevertheless, some sentences could not be parsed entirely correctly because of the limitations of the Indonesian treebank file. Future work includes developing complete and valid treebank and lexicon files.
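
The frequency-counting step in stage 3 can be illustrated with a minimal sketch. It is not the authors' implementation and not the full Collins model, which is lexicalized and head-driven; it assumes a simplified, unlexicalized setting in which each event is a hypothetical (parent, children) rule occurrence, rule probabilities are relative frequencies, and a tree is scored as the product of its rule probabilities. The event data, the labels, and the names `rule_probabilities`, `tree_probability`, and `best_tree` are all invented for illustration.

```python
from collections import Counter
from math import prod

# Hypothetical events: one (parent, children) pair per rule occurrence,
# standing in for the contents of an events file.
EVENTS = [
    ("S", ("NP", "VP")),
    ("S", ("NP", "VP")),
    ("NP", ("N",)),
    ("NP", ("N",)),
    ("NP", ("N",)),
    ("NP", ("N", "Adj")),
    ("VP", ("V", "NP")),
    ("VP", ("V",)),
]


def rule_probabilities(events):
    """Estimate P(children | parent) as count(rule) / count(parent)."""
    rule_counts = Counter(events)
    parent_counts = Counter(parent for parent, _ in events)
    return {
        rule: count / parent_counts[rule[0]]
        for rule, count in rule_counts.items()
    }


def tree_probability(rules, probs):
    """Score a tree, given as a list of rules, by multiplying rule
    probabilities. A rule never seen in the events gets probability 0."""
    return prod(probs.get(rule, 0.0) for rule in rules)


def best_tree(candidates, probs):
    """Return the candidate tree with the highest probability."""
    return max(candidates, key=lambda rules: tree_probability(rules, probs))


# Two hypothetical candidate trees, each listed as the rules it uses.
TREE_A = [("S", ("NP", "VP")), ("NP", ("N",)), ("VP", ("V", "NP")), ("NP", ("N",))]
TREE_B = [("S", ("NP", "VP")), ("NP", ("N", "Adj")), ("VP", ("V",))]

if __name__ == "__main__":
    probs = rule_probabilities(EVENTS)
    for name, tree in (("A", TREE_A), ("B", TREE_B)):
        print(name, tree_probability(tree, probs))
    print("best:", best_tree([TREE_A, TREE_B], probs))
```

Because unseen rules score zero in this sketch, a sentence whose structure is missing from the events cannot receive a usable tree, which is one way to picture the effect of a limited treebank noted in the abstract.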
Date of Conference: 03-05 December 2013
Date Added to IEEE Xplore: 06 March 2014
Electronic ISBN: 978-0-7695-5096-1
Conference Location: Sydney, NSW, Australia
