Using stochastic syntactic analysis for extracting a logical structure from a document image


2 Author(s)
Tateisi, Y. ; Res. Lab., IBM Japan Ltd., Kanagawa, Japan ; Itoh, N.

A method of stochastic syntactic analysis is applied to extracting the logical structure of a printed document from its physical layout and from keywords that indicate logical components. The document is parsed as a sentence consisting of text lines and graphic objects, according to a stochastic regular grammar with attributes. Because the analysis is stochastic, the parser can retain possible results in order of their probability; thus, when ambiguity occurs, it selects an optimal result more appropriately than deterministic systems do. A markup system applying the method was constructed, and it correctly marked up 87% of the logical components of manuals and 82% of those of technical papers. The rate improved to 89% when the second candidates were considered, showing the advantage of the authors' approach over the deterministic approach.
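The idea of parsing a page as a sentence and keeping the results ranked by probability can be illustrated with a small sketch. This is not the authors' system: the grammar, its probabilities, the token names (`big`, `bold`, `plain`, `graphic`), the label names, and the function `parse_ranked` are all hypothetical, and the attributes mentioned in the abstract are left out. The sketch assumes each rule has the regular form `A -> token B` and keeps the `k` most probable derivations per nonterminal at every step, so the second candidate stays available when a line is ambiguous.

```python
import heapq
from typing import Dict, List, Tuple

# Hypothetical stochastic regular grammar. Each rule A -> token B is stored
# under A as (token, B, probability); B is also used as the logical label
# assigned to the token. Probabilities under each nonterminal sum to 1.
Rule = Tuple[str, str, float]
GRAMMAR: Dict[str, List[Rule]] = {
    "START": [("big", "TITLE", 1.0)],
    "TITLE": [("bold", "HEADING", 0.6), ("plain", "PARAGRAPH", 0.4)],
    "HEADING": [("plain", "PARAGRAPH", 0.9), ("bold", "HEADING", 0.1)],
    "PARAGRAPH": [
        ("plain", "PARAGRAPH", 0.5),
        ("bold", "HEADING", 0.3),
        ("graphic", "FIGURE", 0.2),
    ],
    # Ambiguous on purpose: a plain line after a figure may be a caption
    # or the start of a new paragraph.
    "FIGURE": [("plain", "CAPTION", 0.7), ("plain", "PARAGRAPH", 0.3)],
    "CAPTION": [("plain", "PARAGRAPH", 0.7), ("bold", "HEADING", 0.3)],
}

Hypothesis = Tuple[float, List[str]]  # (probability, labels assigned so far)


def parse_ranked(
    tokens: List[str],
    grammar: Dict[str, List[Rule]],
    k: int = 2,
    start: str = "START",
) -> List[Hypothesis]:
    """Return up to k labelings of `tokens`, most probable first."""
    # For every reachable nonterminal, keep its k best derivations so far.
    hyps: Dict[str, List[Hypothesis]] = {start: [(1.0, [])]}
    for tok in tokens:
        extended: Dict[str, List[Hypothesis]] = {}
        for state, candidates in hyps.items():
            for terminal, target, p in grammar.get(state, []):
                if terminal != tok:
                    continue
                for prob, labels in candidates:
                    extended.setdefault(target, []).append(
                        (prob * p, labels + [target])
                    )
        # Prune each nonterminal to its k most probable derivations.
        hyps = {
            s: heapq.nlargest(k, c, key=lambda h: h[0])
            for s, c in extended.items()
        }
    merged = [h for c in hyps.values() for h in c]
    return heapq.nlargest(k, merged, key=lambda h: h[0])


# A page described as one token per text line or graphic object.
page = ["big", "bold", "plain", "graphic", "plain", "plain"]
for probability, labels in parse_ranked(page, GRAMMAR, k=2):
    print(f"{probability:.5f}", labels)
```

In this toy grammar the first candidate labels the line after the figure as a caption and the second labels it as a paragraph. A deterministic parser would commit to one reading, whereas the ranked list leaves the alternative available for later inspection.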

Published in:

Proceedings of the 12th IAPR International Conference on Pattern Recognition, 1994. Vol. 2 - Conference B: Computer Vision & Image Processing (Volume: 2)

Date of Conference:

9-13 Oct 1994