A comparative study between methods of Arabic baseline detection

2 Author(s)
Atallah AL-Shatnawi (Department of System Science and Management, Faculty of Information Science and Technology, University Kebangsaan Malaysia, Selangor, Malaysia); Khairuddin Omar

Preprocessing is the most important stage in an Arabic OCR system; it directly affects the reliability and efficiency of the segmentation and feature extraction stages. Arabic is written cursively, and each of its characters has between two and four shapes. An Arabic word typically consists of two or more characters connected along an imaginary line called the baseline. Detecting the baseline is one of the main tasks in preprocessing for an Arabic OCR system. The baseline can be used for both skew normalization and character segmentation. This paper lists and clarifies the challenges facing Arabic baseline detection methods and provides a brief comparison of those methods. The comparison is based on the nature of Arabic writing, the diacritics (such as dots and zigzags), the word slope, and the subwords found.
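
The abstract does not say which detection methods the paper compares. As a rough illustration of the baseline idea only, the sketch below uses a horizontal projection profile, a commonly used approach that is assumed here and not taken from the paper: it counts ink pixels in each row of a binary word image and takes the densest row as the baseline. The function name `detect_baseline` and the toy image are hypothetical.

```python
def detect_baseline(image):
    """Estimate the baseline row of a binary word image.

    `image` is a list of rows; each pixel is 1 (ink) or 0 (background).
    This is an illustrative sketch, not the method of any specific paper.
    """
    if not image:
        raise ValueError("image must contain at least one row")
    # Horizontal projection profile: number of ink pixels in each row.
    profile = [sum(row) for row in image]
    # The row with the highest ink count is taken as the baseline estimate
    # (the first such row wins if several rows tie).
    return max(range(len(profile)), key=profile.__getitem__)


# Toy 6x8 binary image: the connecting stroke lies on row 4.
word = [
    [0, 0, 0, 1, 0, 0, 0, 0],
    [0, 1, 0, 1, 0, 0, 1, 0],
    [0, 1, 0, 1, 0, 0, 1, 0],
    [0, 1, 0, 1, 0, 1, 1, 0],
    [1, 1, 1, 1, 1, 1, 1, 1],
    [0, 0, 1, 0, 0, 0, 0, 0],
]
baseline_row = detect_baseline(word)
```

A sketch like this assumes an unskewed word image. A sloped word or dense diacritics can move the peak of the profile away from the true baseline, which is plausibly why slope and diacritics appear among the comparison criteria in the abstract.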

Published in:

2009 International Conference on Electrical Engineering and Informatics (Volume: 01)

Date of Conference:

5-7 Aug. 2009