Text extraction from gray scale document images using edge information

2 Author(s)
Q. Yuan; Dept. of Comput. Sci., Nat. Univ. of Singapore, Singapore; C. L. Tan

In this paper we present a method that uses edge information to extract textual blocks from gray scale document images. It aims to detect textual regions in heavily noise-infected newspaper images and separate them from graphical regions. The algorithm traces the feature points in different entities and then groups the edge points of textual regions. Using line approximation and layout categorization, it can successfully retrieve directionally placed text blocks. Finally, feature-based connected component merging is introduced to gather homogeneous textual regions together within the scope of their bounding rectangles. By handling line segments instead of individual pixels, we obtain correct page decomposition with efficient computation and reduced memory use. The proposed method has been tested on a large group of newspaper images with multiple page layouts, and promising results confirm its effectiveness.
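
The final merging step can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say which features drive the merge, so the code below assumes, purely for illustration, that two components belong together when their bounding rectangles lie within a small gap of each other and have similar heights. The names merge_boxes, boxes_close, and similar_height and the parameters gap and ratio are hypothetical.

```python
from typing import List, Tuple

# A bounding rectangle as (x0, y0, x1, y1) with inclusive corners.
Box = Tuple[int, int, int, int]


def boxes_close(a: Box, b: Box, gap: int) -> bool:
    """True if the rectangles overlap or lie within `gap` pixels of each other."""
    return (a[0] - gap <= b[2] and b[0] - gap <= a[2] and
            a[1] - gap <= b[3] and b[1] - gap <= a[3])


def similar_height(a: Box, b: Box, ratio: float) -> bool:
    """Assumed homogeneity feature: the shorter box is at least `ratio` of the taller."""
    ha = a[3] - a[1] + 1
    hb = b[3] - b[1] + 1
    return min(ha, hb) / max(ha, hb) >= ratio


def union(a: Box, b: Box) -> Box:
    """Smallest rectangle that contains both boxes."""
    return (min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3]))


def merge_boxes(boxes: List[Box], gap: int = 5, ratio: float = 0.6) -> List[Box]:
    """Repeatedly merge nearby boxes of similar height until nothing changes."""
    merged = list(boxes)
    changed = True
    while changed:
        changed = False
        for i in range(len(merged)):
            for j in range(i + 1, len(merged)):
                if (boxes_close(merged[i], merged[j], gap)
                        and similar_height(merged[i], merged[j], ratio)):
                    merged[i] = union(merged[i], merged[j])
                    del merged[j]
                    changed = True
                    break
            if changed:
                break  # restart the scan, since the list was modified
    return merged
```

Called on three character-sized boxes along one line plus a distant picture-sized box, merge_boxes should fuse the three small boxes into one text block and leave the large box alone. The quadratic rescan is chosen for clarity, not speed.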

Published in:

Document Analysis and Recognition, 2001. Proceedings. Sixth International Conference on

Date of Conference: