This paper describes a document entry system called the Document Recognition System (DRS), which facilitates the conversion of printed documents into electronic form. DRS was developed on a personal computer (PC) with an adapter card that recognizes more than 3000 Kanji characters. It provides a flexible framework for object-oriented management of data and processing modules. The framework allows the user to change the combination of processing modules and to select either pipelined (parallel) or sequential processing. DRS includes processing modules for layout analysis functions, such as blob detection, block segmentation, and model matching, and for character recognition functions, such as Kanji character recognition, Japanese postprocessing, postprocessing by a user, and error correction through a user interface. The character recognition functions on the card and the other processing-related recognition functions on the PC work cooperatively in the proposed framework. Within the basic framework, we have customized DRS for practical applications. Examples of successful applications (entry into a text database, creation of an electronic catalog, entry of family registration data, and entry of tag data in a manufacturing process) provide evidence of the processing accuracy and robustness of the framework.
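As an illustration only, the idea of recombining processing modules and choosing between sequential and pipelined execution can be sketched as follows. This is a minimal sketch under stated assumptions, not the DRS implementation: the function names (`run_sequential`, `run_pipelined`), the stand-in modules, and the use of strings as page records are all hypothetical, and threads with queues stand in for whatever mechanism the PC and adapter card actually use.

```python
import queue
import threading
from typing import Callable, List

# Hypothetical module type: takes a page record and returns an updated one.
Module = Callable[[str], str]

_DONE = object()  # sentinel marking the end of the page stream


def run_sequential(modules: List[Module], pages: List[str]) -> List[str]:
    """Run every module on one page before moving on to the next page."""
    results = []
    for page in pages:
        for module in modules:
            page = module(page)
        results.append(page)
    return results


def run_pipelined(modules: List[Module], pages: List[str]) -> List[str]:
    """Run each module in its own thread; stages are connected by FIFO queues."""
    queues = [queue.Queue() for _ in range(len(modules) + 1)]

    def worker(module: Module, inbox: queue.Queue, outbox: queue.Queue) -> None:
        while True:
            item = inbox.get()
            if item is _DONE:
                outbox.put(_DONE)  # pass the end marker downstream
                return
            outbox.put(module(item))

    threads = [
        threading.Thread(target=worker, args=(m, queues[i], queues[i + 1]))
        for i, m in enumerate(modules)
    ]
    for t in threads:
        t.start()
    for page in pages:
        queues[0].put(page)
    queues[0].put(_DONE)

    results = []
    while True:
        item = queues[-1].get()
        if item is _DONE:
            break
        results.append(item)
    for t in threads:
        t.join()
    return results


# Stand-in modules (assumptions); real modules would do image analysis and OCR.
def layout_analysis(page: str) -> str:
    return page + "|layout"


def character_recognition(page: str) -> str:
    return page + "|ocr"


def postprocessing(page: str) -> str:
    return page + "|post"


if __name__ == "__main__":
    chain = [layout_analysis, character_recognition, postprocessing]
    print(run_sequential(chain, ["page1", "page2"]))
    print(run_pipelined(chain, ["page1", "page2"]))
```

Because each stage is a single thread reading from a FIFO queue, the pipelined run returns pages in the same order as the sequential run. Changing the combination of modules amounts to passing a different list.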
Note: The Institute of Electrical and Electronics Engineers, Incorporated is distributing this Article with permission of the International Business Machines Corporation (IBM) who is the exclusive owner. The recipient of this Article may not assign, sublicense, lease, rent or otherwise transfer, reproduce, prepare derivative works, publicly display or perform, or distribute the Article.