Pattern Analysis and Machine Intelligence, IEEE Transactions on

Issue 7 • Date July 1998

  • Rotation invariant texture features and their use in automatic script identification

    Page(s): 751 - 756

    This paper concerns the extraction of rotation invariant texture features and the use of such features in script identification from document images. Rotation invariant texture features are computed based on an extension of the popular multi-channel Gabor filtering technique, and their effectiveness is tested with 300 randomly rotated samples of 15 Brodatz textures. These features are then used in an attempt to solve a practical but hitherto mostly overlooked problem in document image processing: the identification of the script of a machine-printed document. Automatic script and language recognition is an essential front-end process for the efficient and correct use of OCR and language translation products in a multilingual environment. Six languages (Chinese, English, Greek, Russian, Persian, and Malayalam) are chosen to demonstrate the potential of such a texture-based approach in script identification.

  • Comments on "Geodesic saliency of watershed contours and hierarchical segmentation" [with reply]

    Page(s): 762 - 766

    In a paper on morphological image segmentation, Najman and Schmitt (1996) introduce the powerful concept of edge dynamics. In this communication, we show that the method that they propose to compute the edge dynamics gives erroneous results for certain spatial configurations, and we propose a new algorithm which always yields correct edge dynamics. The reply presents in detail the watershed algorithm, which was sketched in the original article and criticized in the comment. First, the formal definition of the flooding list, the key data structure of the algorithm, is given. Then, the construction of this flooding list and of the watershed are described and proved.

  • INFORMys: a flexible invoice-like form-reader system

    Page(s): 730 - 745

    We describe a flexible form-reader system capable of extracting textual information from accounting documents, such as invoices and bills from service companies. In this kind of document, the extraction of some information fields cannot take place without having detected the corresponding instruction fields, which are only constrained to range in given domains. We propose modeling the document's layout by means of attributed relational graphs, which turn out to be very effective for form registration, as well as for performing a focused search for instruction fields. This search is carried out by means of a hybrid model, where proper algorithms, based on morphological operations and connected components, are integrated with connectionist models. Experimental results are given in order to assess the actual performance of the system.

  • On the optimization criteria used in two-view motion analysis

    Page(s): 717 - 729

    The three best-known criteria in two-view motion analysis are based, respectively, on the distances between points and their corresponding epipolar lines, on the gradient-weighted epipolar errors, and on the distances between points and the re-projections of their reconstructed points. The last one has a better statistical interpretation, but is significantly slower than the first two. The author shows that, given a reasonable initial guess of the epipolar geometry, the last two criteria are equivalent when the epipoles are at infinity, and differ from each other only a little even when the epipoles are in the image, as shown experimentally. The first two criteria are equivalent only when the epipoles are at infinity and when the observed object/scene has the same scale in the two images. This suggests that the second criterion is sufficient in practice because of its computational efficiency. Experiments with several thousand computer simulations and four sets of real data confirm the analysis.

  • The vector-gradient Hough transform

    Page(s): 746 - 750

    The paper presents a new transform, called the vector-gradient Hough transform, for identifying elongated shapes in gray-scale images. This goal is achieved not only by collecting information on the edges of the objects, but also by reconstructing their transversal profile of luminosity. The main features of the new approach are related to its vector space formulation and the associated capability of exploiting all the vector information of the luminosity gradient.

  • Local scale control for edge detection and blur estimation

    Page(s): 699 - 716

    We show that knowledge of sensor properties and operator norms can be exploited to define a unique, locally computable minimum reliable scale for local estimation at each point in the image. This method for local scale control is applied to the problem of detecting and localizing edges in images with shallow depth of field and shadows. We show that edges spanning a broad range of blur scales and contrasts can be recovered accurately by a single system with no input parameters other than the second moment of the sensor noise. A natural dividend of this approach is a measure of the thickness of contours which can be used to estimate focal and penumbral blur. Local scale control is shown to be important for the estimation of blur in complex images, where the potential for interference between nearby edges of very different blur scale requires that estimates be made at the minimum reliable scale.

  • A list-processing approach to compute Voronoi diagrams and the Euclidean distance transform

    Page(s): 757 - 761

    We propose an efficient Voronoi transform algorithm for constructing Voronoi diagrams using segment lists of rows. A significant feature of the algorithm is that it takes segments rather than pixels as the basic units to represent and propagate the nearest neighbor information. The segment lists are dynamically updated as they are scanned. A distance map can then be easily computed from the segment list representation of the Voronoi diagram. Experimental results have demonstrated its high efficiency. Extension of the algorithm to higher dimensions is also discussed.

  • Junctions: detection, classification, and reconstruction

    Page(s): 687 - 698

    Junctions are important features for image analysis and form a critical aspect of image understanding tasks such as object recognition. We present a unified approach to detecting, classifying, and reconstructing junctions in images. Our main contribution is a modeling of the junction which is complex enough to handle all these issues and yet simple enough to admit an effective dynamic programming solution. We use a template deformation framework along with a gradient criterion to detect radial partitions of the template. We use the minimum description length principle to obtain the optimal number of partitions that best describes the junction. The Kona detector presented by Parida et al. (1997) is an implementation of this model. We demonstrate the stability and robustness of the detector by analyzing its behavior in the presence of noise, using synthetic/controlled apparatus. We also present a qualitative study of its behavior on real images.

  • An analytic-to-holistic approach for face recognition based on a single frontal view

    Page(s): 673 - 686

    We propose an analytic-to-holistic approach which can identify faces at different perspective variations. The database for the test consists of 40 frontal-view faces. The first step is to locate 15 feature points on a face. A head model is proposed, and the rotation of the face can be estimated using geometrical measurements. The positions of the feature points are adjusted so that their corresponding positions for the frontal view are approximated. These feature points are then compared with the feature points of the faces in a database using a similarity transform. In the second step, we set up windows for the eyes, nose, and mouth. These feature windows are compared with those in the database by correlation. Results show that this approach can achieve a similar level of performance from different viewing directions of a face. Under different perspective variations, the overall recognition rates are over 84 percent and 96 percent for the first and the first three likely matched faces, respectively.


Aims & Scope

The IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI) is published monthly. Its editorial board strives to present the most important research results in areas within TPAMI's scope.

Meet Our Editors

Editor-in-Chief
David A. Forsyth
University of Illinois