
DAS '08: The Eighth IAPR International Workshop on Document Analysis Systems, 2008

Date: 16-19 Sept. 2008

  • [Front cover]

    Page(s): C1
PDF (280 KB)
    Freely Available from IEEE
  • [Title page i]

    Page(s): i
PDF (28 KB)
    Freely Available from IEEE
  • [Title page iii]

    Page(s): iii
PDF (64 KB)
    Freely Available from IEEE
  • [Copyright notice]

    Page(s): iv
PDF (44 KB)
    Freely Available from IEEE
  • Table of contents

    Page(s): v - xi
PDF (122 KB)
    Freely Available from IEEE
  • Foreword

    Page(s): xii - xiii
PDF (94 KB)
    Freely Available from IEEE
  • Conference organization

    Page(s): xiv
PDF (121 KB)
    Freely Available from IEEE
  • List of reviewers

    Page(s): xv - xvi
PDF (113 KB)
    Freely Available from IEEE
  • Sponsors

    Page(s): xvii
PDF (75 KB)
    Freely Available from IEEE
  • Extraction of Text Objects in Video Documents: Recent Progress

    Page(s): 5 - 17
PDF (449 KB)

Text extraction in video documents, an important research field of content-based information indexing and retrieval, has been developing rapidly since the 1990s. This has led to much progress in text extraction, performance evaluation, and related applications. By reviewing the approaches proposed during the past five years, this paper surveys the progress made in this area and discusses promising directions for future research.

  • A Hilbert Warping Algorithm for Recognizing Characters from Moving Camera

    Page(s): 21 - 27
PDF (488 KB)

We present a method for recognizing characters from image sequences captured by a moving camera. In the proposed method, the sequence of captured images is compared with sequences of reference character patterns using analytic signals. Since the captured image sequence can be nonlinearly warped along the time axis by the movement of a hand-held camera, phase synchronization of two analytic signals is used to align the two image sequences. The Hilbert transform converts each image sequence into an analytic signal whose phase is assumed to be increasing. Experimental results showed the usefulness of the proposed phase-based alignment algorithm.

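The phase-synchronization idea in the abstract above can be sketched in a few lines. The sketch below uses 1-D signals in place of image-sequence features and an FFT-based Hilbert transform; the signal lengths, the warp function, and the nearest-phase matching rule are illustrative assumptions, not the paper's actual algorithm.

```python
import numpy as np

def analytic_signal(x):
    """Analytic signal via the FFT (the discrete Hilbert-transform trick)."""
    n = len(x)
    spectrum = np.fft.fft(x)
    h = np.zeros(n)
    h[0] = 1.0
    if n % 2 == 0:
        h[n // 2] = 1.0
        h[1:n // 2] = 2.0
    else:
        h[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(spectrum * h)

def phase_align(ref, obs):
    """For each reference frame, pick the observed frame whose unwrapped
    instantaneous phase is closest: a correspondence along the time axis."""
    ph_ref = np.unwrap(np.angle(analytic_signal(ref)))
    ph_obs = np.unwrap(np.angle(analytic_signal(obs)))
    return np.array([int(np.argmin(np.abs(ph_obs - p))) for p in ph_ref])

# Reference sequence and a nonlinearly time-warped observation of it,
# mimicking frames from an unsteady hand-held camera.
t = np.linspace(0.0, 1.0, 200)
ref = np.sin(2 * np.pi * 5 * t)
obs = np.sin(2 * np.pi * 5 * t ** 1.5)   # nonlinear warp of the time axis

idx = phase_align(ref, obs)   # idx[k]: frame of obs matching ref frame k
```

Matching unwrapped phases recovers the nonlinear warp (up to edge effects) without any explicit search over time warps.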
  • Writer Verification of Arabic Handwriting

    Page(s): 28 - 34
PDF (951 KB)

Expanding on an earlier study that objectively validated the hypothesis that handwriting is individualistic, we extend the study to handwriting in the Arabic script. Handwriting samples were obtained from twelve native speakers of Arabic. Differences in handwriting were analyzed using computer algorithms that extract features from scanned images of handwriting, yielding attributes characteristic of the handwriting, e.g., line separation, slant, and character shapes. These attributes, a subset of those used by forensic document examiners (FDEs), were used to quantitatively establish individuality with machine learning approaches. Using global attributes of handwriting, the ability to determine the writer with a high degree of confidence was established. The work is a step towards providing scientific support for admitting handwriting evidence in court.

  • A Robust System to Detect and Localize Texts in Natural Scene Images

    Page(s): 35 - 42
PDF (848 KB)

In this paper, we present a robust system that accurately detects and localizes text in natural scene images. For text detection, a region-based method using multiple features and a cascaded AdaBoost classifier is adopted. For text localization, a window-grouping method integrating text-line competition analysis generates text lines. Within each text line, local binarization extracts candidate connected components (CCs), and non-text CCs are filtered out by a Markov random field (MRF) model, allowing text lines to be localized accurately. Experiments on the public benchmark ICDAR 2003 Robust Reading and Text Locating Dataset show that our system is comparable to the best existing methods in both accuracy and speed.

  • An image based watermark string detection system for document security checking

    Page(s): 43 - 50
PDF (8 KB)

Document security is an important topic in information management. In this paper, an image-based watermark-string detection system is proposed to detect documents that include printed keyword strings as a background watermark, so that the disclosure of sensitive documents can be monitored automatically. Since the documents are represented in image format, the watermark string is detected by a parts-based object recognition strategy. The two key contributions of this paper are cross-validation-based image registration and Maximum Clique (MC) based recognition of object parts. Experiments on PPT and WORD document pages with 5 different watermark keywords show the excellent performance of our system.

  • Feature Extraction for Document Image Segmentation by pLSA Model

    Page(s): 53 - 60
PDF (2939 KB)

In this paper, we propose a method for document image segmentation based on the pLSA (probabilistic latent semantic analysis) model. The pLSA model was originally developed for topic discovery in text analysis using a "bag-of-words" document representation; it becomes applicable to image analysis through a "bag-of-visual-words" image representation. The performance of the method depends on the visual vocabulary generated by feature extraction from the document image. We compare several feature extraction and description methods and examine their relation to segmentation performance. Through the experiments, we show that accurate content-based document segmentation is possible with the pLSA-based method.

  • Grouping Text Lines in Online Handwritten Japanese Documents by Combining Temporal and Spatial Information

    Page(s): 61 - 68
PDF (613 KB)

We present an effective approach for grouping text lines in online handwritten Japanese documents by combining temporal and spatial information. Initially, strokes are grouped into text-line strings according to off-stroke distances. Each text-line string is segmented into text lines by dynamic programming (DP) optimizing a cost function trained by the minimum classification error (MCE) method. Over-segmented text lines are then merged using a support vector machine (SVM) classifier that makes merge/non-merge decisions, and finally, a spatial merge module corrects segmentation errors caused by delayed strokes. In experiments on the TUAT Kondate database, the proposed approach achieves an Entity Detection Metric (EDM) rate of 0.8816 and an Edit-Distance Rate (EDR) of 0.1234, demonstrating its superiority.

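The DP segmentation step described above can be illustrated on a toy sequence. The cost function here (within-segment spread of vertical stroke positions plus a fixed per-segment penalty) is a stand-in for the paper's MCE-trained cost; the data and penalty value are invented for illustration.

```python
import numpy as np

def segment_cost(ys, i, j):
    """Cost of grouping strokes i..j-1 into one text line: the spread of
    their vertical positions (a stand-in for the MCE-trained cost)."""
    seg = ys[i:j]
    return float(np.var(seg)) * len(seg)

def dp_segment(ys, penalty):
    """Minimum-cost segmentation of a stroke sequence by dynamic programming."""
    n = len(ys)
    best = [0.0] + [float("inf")] * n   # best[j]: cost of segmenting ys[:j]
    back = [0] * (n + 1)
    for j in range(1, n + 1):
        for i in range(j):
            c = best[i] + segment_cost(ys, i, j) + penalty
            if c < best[j]:
                best[j], back[j] = c, i
    cuts, j = [], n                      # recover segment boundaries
    while j > 0:
        cuts.append(j)
        j = back[j]
    return sorted(cuts)

# Vertical stroke positions from three well-separated text lines.
ys = np.array([10, 11, 9, 10, 40, 41, 39, 70, 71, 69, 70], float)
cuts = dp_segment(ys, penalty=5.0)   # → [4, 7, 11]
```

The per-segment penalty plays the role of a model-complexity term: without it, the DP would happily cut every stroke into its own "line".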
  • Accurate Alignment of Double-Sided Manuscripts for Bleed-Through Removal

    Page(s): 69 - 75
PDF (2536 KB)

Double-sided manuscripts are often degraded by bleed-through interference. Such degradation must be corrected to facilitate human perception and machine recognition. Most approaches to bleed-through removal rely on perfect alignment between the recto and verso images of a document. This paper presents a two-stage hierarchical alignment technique that can efficiently and accurately align the two sides of a document. Our approach first coarsely aligns the two images using a pair of anchors extracted from the recto and verso images, respectively. The coarsely aligned images are then precisely aligned using block matching and radial basis function (RBF) based interpolation. To evaluate the proposed alignment technique, we build a classification and recovery system to remove bleed-through interference and restore historical manuscripts; the accuracy of our alignment approach is then assessed via the accuracy of bleed-through correction.

  • Super-Resolution of Text Images Using Edge-Directed Tangent Field

    Page(s): 76 - 83
PDF (1269 KB)

This paper presents an edge-directed super-resolution algorithm for document images that requires no training set. The technique creates an image with smooth foreground and background regions while allowing sharp discontinuities across edges and smoothness along them. Our method preserves sharp corners in text images by using the local edge direction, computed by evaluating the gradient field and taking its tangent. Super-resolution of document images is characterized by bimodality, smoothness along edges, and subsampling consistency. These characteristics are enforced in a Markov random field (MRF) framework by defining an appropriate energy function, and the super-resolution image is generated by iteratively reducing this energy. Subsampling the super-resolution image returns the original low-resolution one, confirming the consistency of the method. Experimental results on a variety of input images demonstrate the effectiveness of our method for document image super-resolution.

  • Attention-Based Document Classifier Learning

    Page(s): 87 - 94
PDF (1271 KB)

We describe an approach for creating precise personalized document classifiers based on the user's attention. The general idea is to observe which parts of a document the user was interested in just before reaching a classification decision. Given this manual classification decision and the document parts it was based on, we can learn precise classifiers. To observe the user's point of attention, we use an unobtrusive eye-tracking device and apply an algorithm for reading-behavior detection. On this basis, we can extract terms characterizing the text parts of interest to the user and employ them to describe the class the user assigned the document to. With classifiers learned in this way, new documents can be classified automatically using passage-based retrieval techniques. We demonstrate the strong benefit of incorporating the user's visual attention through a case study that evaluates an attention-based term extraction method.

  • Categorization of On-Line Handwritten Documents

    Page(s): 95 - 102
PDF (588 KB)

With the growth of online handwriting technologies, facilities for managing handwritten documents, such as retrieval of documents by topic, are required. These documents can contain, for instance, graphics, equations, or text. This work reports experiments on categorization of online handwritten documents based on their textual contents. We assume that handwritten text blocks have been extracted from the documents and, as a first step of the proposed system, process them with an existing handwriting recognition engine. We analyse the effect of the word recognition rate on categorization performance and compare it with the performance obtained on the same texts available as ground truth. Two categorization algorithms (kNN and SVM) are compared in this work. The handwritten texts are a subset of the Reuters-21578 corpus collected from more than 1500 writers. Results show no significant loss in categorization performance when the word error rate remains below 22%.

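The kNN side of the comparison above can be sketched with term-frequency vectors and cosine similarity; the toy training texts, Reuters-style labels, and value of k below are illustrative, not the paper's actual setup.

```python
import math
from collections import Counter

def tf_vector(text):
    """Bag-of-words term-frequency vector."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def knn_classify(doc, labeled_docs, k=3):
    """Majority vote among the k training documents most similar to `doc`."""
    v = tf_vector(doc)
    ranked = sorted(labeled_docs,
                    key=lambda item: cosine(v, tf_vector(item[0])),
                    reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

# Toy stand-ins for recognized handwritten texts with Reuters-style topics.
train = [
    ("wheat corn harvest grain prices", "grain"),
    ("grain exports wheat shipment", "grain"),
    ("crude oil barrel opec prices", "oil"),
    ("oil output opec barrel cut", "oil"),
]
pred = knn_classify("opec raises crude oil output", train, k=3)   # → "oil"
```

Word-level recognition errors simply perturb the term-frequency vectors, which is why moderate word error rates degrade this kind of classifier only gradually.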
  • Combining Multiple Methods for Book Indexing

    Page(s): 103 - 110
PDF (469 KB)

In this paper we are interested in the problem of book splitting or, more generally, of indexing the logical parts of a document. This involves determining the boundaries of these parts as well as their labels. We report on the combined use of generic methods published in previous papers and discuss the effect of combining several methods, also from a quality-assurance perspective. Our experiments are grounded in real case studies of technical documents, such as books of specifications.

  • Automated OCR Ground Truth Generation

    Page(s): 111 - 117
PDF (1830 KB)

Most optical character recognition (OCR) systems need to be trained and tested on the symbols to be recognized, so ground-truth data is needed. This data consists of character images together with their ASCII codes. Among the approaches for generating ground truth from real-world data, one promising technique is to use the electronic version of the scanned documents: using an alignment method, the character bounding boxes extracted from the electronic document are matched to the scanned image. Current alignment methods are not robust to different similarity transforms, and they need calibration to deal with the non-linear local distortions introduced by the printing/scanning process. In this paper we present a significant improvement over existing methods that skips the calibration step and achieves more accurate alignment under all similarity transforms. Our method finds a robust, pixel-accurate, scanner-independent alignment of the scanned image with the electronic document, allowing the extraction of accurate ground-truth character information. The accuracy of the alignment is demonstrated using documents from the UW3 dataset: the mean distance between the estimated and ground-truth character bounding-box positions is less than one pixel.

  • Digital Renaissance: Making Archives, Sharing Wisdoms and Creating Values

    Page(s): 121 - 132
PDF (60509 KB)

This paper highlights DIS (digital image system) technology and its application projects. The "Digital Ambassadorship" project between Italy and Japan brought the "Mind of Leonardo" exhibition to Japan for the first time and realized the third Italy-Japan real-time symposium for "Primavera Italiana 2007 in Japan", with great success. These projects point to new directions for the technology and to the possibility of a paradigm shift that revitalizes and redefines traditional archives as a modern "digital renaissance".

  • State: A Multimodal Assisted Text-Transcription System for Ancient Documents

    Page(s): 135 - 142
PDF (2106 KB)

We present State, a complete assisted-transcription system for ancient documents. The system consists of two applications: a pen-based interactive application that assists humans in transcribing ancient documents, and a recognition engine that offers automatic transcriptions via a web service. The interaction model and the recognition algorithm employed in the current version of State are presented. Preliminary experiments show the productivity gains obtained with the system when transcribing a document, as well as the error rate of the current recognition engine.

  • Authorship Identification of Ukiyoe by Using Rakkan Image

    Page(s): 143 - 150
PDF (826 KB)

This paper describes a method for identifying the authorship of Ukiyoe prints by using the Rakkan (signature and seal) images found in the prints. A weighted direction index histogram method is used to create the feature vector for Rakkan character analysis, and pseudo-Mahalanobis distances are used to measure the distance between dictionary templates and test data. Rakkan images are binarized by recursively applying the Otsu method, which achieved good character segmentation while eliminating the influence of stains and smears. We used 100 Ukiyoe prints drawn by 10 artists as both templates and test samples, and ran the identification experiment with the leave-one-out method. The results indicate that this method can potentially be used to identify the authorship of Ukiyoe.

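The recursive Otsu binarization mentioned above can be sketched as follows: compute a global Otsu threshold, then re-apply Otsu within the darker class to separate ink from mid-gray stains. The synthetic three-level patch, the recursion depth, and the stopping rule are assumptions for illustration, not the paper's exact procedure.

```python
import numpy as np

def otsu_threshold(pixels):
    """Otsu's method: the threshold maximizing between-class variance."""
    hist = np.bincount(pixels.ravel(), minlength=256).astype(float)
    total = hist.sum()
    sum_all = float(np.dot(np.arange(256), hist))
    w0 = sum0 = 0.0
    best_t, best_var = 0, -1.0
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0 or w0 == total:
            continue
        m0 = sum0 / w0                         # mean of class <= t
        m1 = (sum_all - sum0) / (total - w0)   # mean of class > t
        var_between = w0 * (total - w0) * (m0 - m1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t

def recursive_otsu(img, depth=2):
    """Threshold globally, then re-apply Otsu to the darker class so that
    mid-gray stains are peeled away from the true ink."""
    t = otsu_threshold(img)
    for _ in range(depth - 1):
        dark = img[img <= t]
        if dark.size < 2:
            break
        t2 = otsu_threshold(dark)
        if t2 >= t or t2 == 0:
            break
        t = t2
    return img <= t   # True = ink

# Synthetic patch: dark ink (~20), mid-gray stain (~120), paper (~220).
rng = np.random.default_rng(0)
img = np.concatenate([
    rng.integers(10, 30, 300),    # ink
    rng.integers(110, 130, 300),  # stain
    rng.integers(210, 230, 400),  # paper
])
mask = recursive_otsu(img, depth=2)   # keeps only the ink pixels
```

A single global Otsu pass on this patch would lump stain and ink together; the second pass inside the dark class is what removes the stain, matching the motivation given in the abstract.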