A generic system for form dropout
Bin Yu
Jain, A.K.
Dept. of Comput. Sci., Michigan State Univ., East Lansing, MI
This paper appears in: Pattern Analysis and Machine Intelligence, IEEE Transactions on
Publication Date: Nov 1996
Volume: 18
Issue: 11
On page(s): 1127-1134
ISSN: 0162-8828
References Cited: 11
CODEN: ITPIDJ
INSPEC Accession Number: 5446365
Digital Object Identifier: 10.1109/34.544084
Current Version Published: 2002-08-06
Abstract
Recent advances in intelligent character recognition are enabling
us to address many challenging problems in document image analysis. One
of them is intelligent form analysis. This paper describes a generic
system for form dropout when the filled-in characters or symbols are
either touching or crossing the form frames. We propose a method to
separate these characters from form frames whose locations are unknown.
Since some of the character strokes are either touching or crossing the
form frames, we need to address the following three issues: 1)
localization of form frames; 2) separation of characters and form
frames; and 3) reconstruction of broken strokes introduced during
separation. The form frame is automatically located by finding long
straight lines based on the block adjacency graph. Form frame separation
and character reconstruction are implemented by means of this graph. The
proposed system includes form structure learning and form dropout.
First, a form structure-based template is automatically generated from a
blank form; the template records the form frames, preprinted data areas, and
skew angle. With this form template, our system can then extract both
handwritten and machine-typed filled-in data. Experimental results on
three different types of forms show the performance of our system.
Further, the proposed method is robust to noise and skew
introduced during scanning.
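
The three issues listed in the abstract (locating frames, separating characters from frames, and repairing strokes broken by the separation) can be illustrated with a small sketch. This is not the paper's block adjacency graph method. It is a simplified stand-in that assumes a binary image with no skew, considers horizontal frame lines only, and treats any horizontal run of ink at least `min_len` pixels long as part of a frame. The names `horizontal_runs`, `find_frame_runs`, `drop_frames`, and `min_len` are hypothetical and chosen for illustration.

```python
from typing import List, Tuple

Image = List[List[int]]  # 1 = ink, 0 = background (assumed representation)


def horizontal_runs(row: List[int]) -> List[Tuple[int, int]]:
    """Return (start, end) pairs, end exclusive, of consecutive ink pixels."""
    runs = []
    start = None
    for x, value in enumerate(row):
        if value and start is None:
            start = x
        elif not value and start is not None:
            runs.append((start, x))
            start = None
    if start is not None:
        runs.append((start, len(row)))
    return runs


def find_frame_runs(img: Image, min_len: int) -> List[Tuple[int, int, int]]:
    """Step 1 (localization): long runs are taken as frame-line candidates."""
    return [
        (y, start, end)
        for y, row in enumerate(img)
        for start, end in horizontal_runs(row)
        if end - start >= min_len
    ]


def drop_frames(img: Image, min_len: int) -> Image:
    """Steps 2 and 3: erase frame pixels, but keep those a stroke crosses."""
    out = [row[:] for row in img]
    frame_pixels = {
        (y, x)
        for y, start, end in find_frame_runs(img, min_len)
        for x in range(start, end)
    }
    height = len(img)
    for y, x in frame_pixels:
        # Skip over the full thickness of the frame line in this column.
        up = y - 1
        while up >= 0 and (up, x) in frame_pixels:
            up -= 1
        down = y + 1
        while down < height and (down, x) in frame_pixels:
            down += 1
        ink_above = up >= 0 and img[up][x] == 1
        ink_below = down < height and img[down][x] == 1
        # Ink on both sides suggests a character stroke crossing the frame,
        # so the pixel is kept; otherwise it is dropped as pure frame.
        if not (ink_above and ink_below):
            out[y][x] = 0
    return out


if __name__ == "__main__":
    # A horizontal frame line (row 2) crossed by a vertical stroke (column 3).
    form = [
        [0, 0, 0, 1, 0, 0, 0],
        [0, 0, 0, 1, 0, 0, 0],
        [1, 1, 1, 1, 1, 1, 1],
        [0, 0, 0, 1, 0, 0, 0],
        [0, 0, 0, 1, 0, 0, 0],
    ]
    for row in drop_frames(form, min_len=5):
        print("".join(str(v) for v in row))
```

In this sketch a stroke that only touches the frame from one side loses the pixels that overlap the line, which is one reason a real system needs an explicit reconstruction step such as the one the abstract describes.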