Text string extraction within mixed-mode documents

2 Author(s)
Hones, F.; German Res. Center for Artificial Intelligence, Kaiserslautern, Germany; Lichter, J.

Digitized images of printed documents typically consist of a mixture of text, graphics, and image elements. For proper processing and efficient representation, these elements have to be separated. For most applications it is sufficient to distinguish text from non-text, because text carries most of the information. The authors describe the implementation and performance of a robust algorithm for text string extraction that is completely independent of text orientation and can handle text in various font styles and sizes. Text objects may be nested in non-text areas, and inverse printing can also be analyzed. No recognition of individual characters is performed; the classification is based only on rough image features.
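
The abstract does not say which rough image features the classification uses, so the following is only an illustrative sketch of the general idea and not the authors' algorithm. It assumes a binary image given as a list of rows (1 = ink), groups ink pixels into 8-connected components, and labels a component as a text candidate from two assumed features: the larger bounding-box side and the fill density. The function names and the thresholds `max_extent` and `min_density` are hypothetical.

```python
from collections import deque


def connected_components(image):
    """Group 8-connected foreground pixels (value 1) into lists of (row, col)."""
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]
    components = []
    for r in range(rows):
        for c in range(cols):
            if image[r][c] != 1 or seen[r][c]:
                continue
            # Breadth-first flood fill starting from an unvisited ink pixel.
            queue = deque([(r, c)])
            seen[r][c] = True
            pixels = []
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and image[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            components.append(pixels)
    return components


def rough_features(pixels):
    """Assumed features: larger bounding-box side and ink density in the box."""
    ys = [y for y, _ in pixels]
    xs = [x for _, x in pixels]
    height = max(ys) - min(ys) + 1
    width = max(xs) - min(xs) + 1
    return {"extent": max(height, width),
            "density": len(pixels) / (height * width)}


def looks_like_text(features, max_extent=4, min_density=0.3):
    """Hypothetical rule: small, reasonably dense components are text candidates."""
    return (features["extent"] <= max_extent
            and features["density"] >= min_density)
```

Because the size feature takes the larger bounding-box side, the rule gives the same answer when a component is rotated by 90 degrees. It says nothing about arbitrary orientations, nested text, or inverse printing, all of which the paper's method is reported to handle.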

Published in:

Proceedings of the Second International Conference on Document Analysis and Recognition, 1993

Date of Conference:

20-22 Oct 1993