
Logical structure analysis of book document images using contents information

Authors:
ChunChen Lin (Sch. of Sci. & Eng., Waseda Univ., Tokyo, Japan); Niwa, Y.; Narita, S.

Document image structure analysis has been studied extensively, with particular emphasis on media conversion and layout analysis. Converting a library's collection of books into hypertext documents requires logical structure extraction in addition to document layout analysis. The table of contents of a book generally gives a concise and faithful representation of the logical structure of the entire book, so the logical structure of a book can be analyzed efficiently by making full use of its contents pages. This paper proposes a new approach to document logical structure analysis that converts document images and contents information into an electronic document. First, the contents pages of a book are analyzed to acquire the overall logical structure of the document. This information is then used to acquire the logical structure of all the pages of the book by analyzing consecutive pages of a portion of the book. Test results demonstrate very high discrimination rates: up to 97.6% for the headline structure, 99.4% for the text structure, 97.8% for the page-number structure, and almost 100% for the head-foot structure.
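
The general idea of using contents pages as a guide to a book's logical structure can be illustrated with a minimal sketch. This is not the authors' method, which works on document images; the sketch assumes the contents page has already been converted to text lines, and the line format (section number, title, optional dot leaders, page number), the `TocEntry` type, and the functions `parse_toc` and `entry_for_page` are all hypothetical names introduced here for illustration.

```python
import re
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class TocEntry:
    number: str  # section number as printed, e.g. "1.2"
    title: str   # heading text
    page: int    # page on which the section starts
    level: int   # nesting depth: "1" -> 1, "1.2" -> 2


# Assumed (hypothetical) line format: "<number> <title> <dot leaders> <page>"
TOC_LINE = re.compile(r"^\s*(\d+(?:\.\d+)*)\.?\s+(.*?)[\s.]*(\d+)\s*$")


def parse_toc(lines: List[str]) -> List[TocEntry]:
    """Turn text lines of a contents page into structured entries.

    Lines that do not match the assumed format are skipped.
    """
    entries = []
    for line in lines:
        match = TOC_LINE.match(line)
        if match is None:
            continue
        number, title, page = match.groups()
        level = number.count(".") + 1
        entries.append(TocEntry(number, title.strip(), int(page), level))
    return entries


def entry_for_page(entries: List[TocEntry], page: int) -> Optional[TocEntry]:
    """Return the last entry starting at or before `page`.

    Assumes `entries` are listed in page order, as on a contents page.
    """
    current = None
    for entry in entries:
        if entry.page > page:
            break
        current = entry
    return current


toc_lines = [
    "Contents",
    "1 Introduction ........ 1",
    "1.1 Background .... 3",
    "2 Methods ..... 10",
]
entries = parse_toc(toc_lines)
section = entry_for_page(entries, 5)
```

Here `entries` holds the heading hierarchy recovered from the contents lines, and `entry_for_page` maps a body page back to the section it falls in, which is the kind of prior knowledge a page-level analysis could use when looking for headlines.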

Published in:

Proceedings of the Fourth International Conference on Document Analysis and Recognition, 1997 (Volume 2)

Date of Conference:

18-20 Aug 1997