Text block recognition from TIFF images

2 Author(s)
Lovegrove, W. ; Nottingham Univ., UK ; Elliman, D.

The reproduction of a scanned document should include not only the optical character recognition of text, but also the structure of that text on the page and the appearance of the text itself (i.e. font recognition). This paper presents an algorithm that recognises the structure of the text in a page image. The method is based upon the “Docstrum plot” algorithm by L. O'Gorman (1993). O'Gorman's algorithm has been modified, and the modified version gives very good results, particularly at identifying paragraphs and lines. The algorithm implementation can, to a limited degree, describe the logical relationship of the text elements of the original page. The limitations of the algorithm are due to the lack of information available without OCR and font technology incorporated into the algorithm implementation. The algorithm implementation has a graphical interface which portrays the state of the algorithm during the process of decomposition.
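
For readers unfamiliar with the Docstrum idea the abstract builds on, the following is a minimal sketch of its core step only: pairing each connected-component centroid with its k nearest neighbours and reading text orientation and character spacing off the resulting distance and angle values. It is not the authors' modified algorithm, whose details the abstract does not give. The function names (docstrum_pairs, dominant_angle, spacing_estimate), the default k, the bin width and the angular tolerance are all illustrative assumptions, and the centroids are assumed to have been extracted from the page image already.

```python
import math
import statistics


def docstrum_pairs(centroids, k=5):
    """Return (distance, angle) for each centroid and its k nearest neighbours.

    Angles are in degrees, folded into [-90, 90) so that a neighbour to the
    left and a neighbour to the right both count as the same direction.
    Brute-force search; fine for a sketch, slow for a full page.
    """
    pairs = []
    for i, (xi, yi) in enumerate(centroids):
        neighbours = sorted(
            (math.hypot(xj - xi, yj - yi), j)
            for j, (xj, yj) in enumerate(centroids)
            if j != i
        )[:k]
        for dist, j in neighbours:
            xj, yj = centroids[j]
            angle = math.degrees(math.atan2(yj - yi, xj - xi))
            angle = (angle + 90.0) % 180.0 - 90.0  # fold into [-90, 90)
            pairs.append((dist, angle))
    return pairs


def dominant_angle(pairs, bin_width=5.0):
    """Estimate text-line orientation as the peak of the angle histogram."""
    n_bins = int(180.0 / bin_width)
    counts = {}
    for _, angle in pairs:
        b = math.floor((angle + bin_width / 2.0) / bin_width)
        if b >= n_bins // 2:  # the +90 bin is the same direction as -90
            b -= n_bins
        counts[b] = counts.get(b, 0) + 1
    best = max(counts, key=counts.get)
    return best * bin_width


def spacing_estimate(pairs, angle, tol=10.0):
    """Median neighbour distance among pairs within tol degrees of angle."""
    selected = []
    for dist, a in pairs:
        diff = abs(a - angle) % 180.0
        diff = min(diff, 180.0 - diff)  # angular difference with wrap-around
        if diff <= tol:
            selected.append(dist)
    return statistics.median(selected)
```

Calling docstrum_pairs on the centroids, then dominant_angle on the result, gives a rough skew estimate; spacing_estimate at that angle approximates the within-line character spacing, and the same call at the perpendicular angle approximates the spacing between lines. Grouping components into lines and paragraphs, which is where the paper reports its improvements, is outside the scope of this sketch.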

Published in:

Document Image Processing and Multimedia Environments, IEE Colloquium on

Date of Conference:

2 Nov 1995