An approach to extracting the target text line from a document image captured by a pen scanner

2 Author(s)
Zhen-Long Bai ; Comput. Sci. & Inf. Syst. Dept., Hongkong Univ., Hong Kong, China ; Qiang Huo

In this paper, we present a new approach to extracting the target text line from a document image captured by a pen scanner. Given the binary image, a set of candidate text lines is first formed by nearest-neighbor grouping of connected components (CCs). These candidates are then refined by merging text lines and adding CCs that were missed. The most likely target text line is identified using a score function based on geometric features and is fed to an OCR engine for character recognition. If the recognition result is confident enough, the target text line is accepted. Otherwise, all the remaining text lines are fed to the OCR engine to determine whether an alternative target text line exists or whether the whole image should be rejected. The effectiveness of this approach is confirmed by experiments on a test database of 117 document images captured by C-Pen and ScanEye pen scanners.
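The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the bounding-box representation, the names `group_into_lines` and `select_target_line`, the thresholds `max_gap`, `min_overlap`, and `min_confidence`, and the `score` and `ocr` callables are all assumptions introduced here for illustration. The abstract does not specify the geometric features or the OCR engine, so both are passed in as hypothetical functions. The merging and missed-CC refinement step is omitted.

```python
from typing import Callable, List, Optional, Sequence, Tuple

# Assumed representation: (x0, y0, x1, y1) bounding box of one connected component.
Box = Tuple[int, int, int, int]


def vertical_overlap(a: Box, b: Box) -> float:
    """Vertical overlap of two boxes as a fraction of the shorter box's height."""
    overlap = min(a[3], b[3]) - max(a[1], b[1])
    shorter = min(a[3] - a[1], b[3] - b[1])
    return overlap / shorter if shorter > 0 else 0.0


def group_into_lines(boxes: Sequence[Box], max_gap: int = 10,
                     min_overlap: float = 0.5) -> List[List[Box]]:
    """Link each CC to its nearest right-hand neighbor on the same row.

    A simplified stand-in for nearest-neighbor grouping: two CCs are linked
    when the horizontal gap is at most max_gap and they overlap vertically.
    """
    order = sorted(boxes, key=lambda b: b[0])
    parent = list(range(len(order)))

    def find(i: int) -> int:
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, a in enumerate(order):
        for j in range(i + 1, len(order)):
            b = order[j]
            if b[0] - a[2] > max_gap:
                break  # boxes are sorted by x0, so later gaps are only larger
            if vertical_overlap(a, b) >= min_overlap:
                parent[find(i)] = find(j)
                break  # first match is the nearest qualifying neighbor

    lines: dict = {}
    for i, b in enumerate(order):
        lines.setdefault(find(i), []).append(b)
    return list(lines.values())


def select_target_line(lines: Sequence[List[Box]],
                       score: Callable[[List[Box]], float],
                       ocr: Callable[[List[Box]], Tuple[str, float]],
                       min_confidence: float = 0.8) -> Optional[str]:
    """Try the best-scored line first, then the rest; None means 'reject image'."""
    for line in sorted(lines, key=score, reverse=True):
        text, confidence = ocr(line)  # hypothetical OCR call: (text, confidence)
        if confidence >= min_confidence:
            return text
    return None
```

In this sketch the fallback tries the remaining lines in descending score order and accepts the first confident result, which is one possible reading of the abstract; the paper may verify the alternatives differently.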

Published in:

Proceedings of the Seventh International Conference on Document Analysis and Recognition, 2003

Date of Conference:

3-6 Aug. 2003