A Novel Algorithm for Normalizing Noisy Arabic Text

Author:
Al-Shammari, E.T.; Kuwait Univ., Kuwait

This paper introduces an algorithm for normalizing noisy text that focuses exclusively on the Arabic language. Although Arabic text processing has been discussed extensively, no approach so far has focused on noisy Arabic text. The paper also introduces a new similarity measure for stemming noisy Arabic documents. Such a measure is needed because the common stemming rules cannot be applied to noisy text, which does not conform to known grammatical rules and contains various spelling mistakes. The proposed normalization algorithm therefore groups words automatically after applying the similarity measure. To validate the algorithm, the new normalization technique is evaluated using the under-stemming error reduction technique introduced by Paice.
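
The abstract does not define the similarity measure, so the following is only a minimal sketch of the general idea of grouping noisy word variants by similarity rather than by grammatical stemming rules. It substitutes a standard measure, the ratio from Python's `difflib.SequenceMatcher`, for the paper's own measure. The function name `group_similar`, the greedy grouping strategy, and the threshold of 0.75 are all assumptions made for illustration, not details taken from the paper.

```python
from difflib import SequenceMatcher


def group_similar(words, threshold=0.75):
    """Greedily group words whose similarity to a group's first member
    meets the threshold.

    NOTE: SequenceMatcher.ratio() is a stand-in similarity measure and
    the threshold is an arbitrary illustrative value; neither comes
    from the paper.
    """
    groups = []  # each group is a list; its first word is the representative
    for word in words:
        for group in groups:
            # Compare against the representative only, not every member.
            if SequenceMatcher(None, group[0], word).ratio() >= threshold:
                group.append(word)
                break
        else:
            # No existing group was similar enough, so start a new one.
            groups.append([word])
    return groups


if __name__ == "__main__":
    # Two words, each followed later by a misspelled variant.
    noisy = ["كتاب", "مدرسة", "كتااب", "مدرسه"]
    for members in group_similar(noisy):
        print(members)
```

Each misspelled variant ends up in the same group as its closest form without any morphological rule being applied, which is the property the abstract argues for. A real system would need a measure tuned to Arabic orthography and a more careful choice of representative for each group.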

Published in:

Computer Science and Information Engineering, 2009 WRI World Congress on (Volume 4)

Date of Conference:

March 31 to April 2, 2009