Pattern Analysis and Machine Intelligence, IEEE Transactions on

Issue 1 • Date Jan 2002

  • Piecewise linear skeletonization using principal curves

    Publication Year: 2002 , Page(s): 59 - 74
    Cited by:  Papers (62)  |  Patents (1)

    Proposes an algorithm to find piecewise linear skeletons of handwritten characters by using principal curves. The development of the method was inspired by the apparent similarity between the definition of principal curves (smooth curves which pass through the "middle" of a cloud of points) and medial axes (smooth curves that run equidistantly from the contours of a character image). The central fitting-and-smoothing step of the algorithm is an extension of the polygonal line algorithm, which approximates principal curves of data sets by piecewise linear curves. The polygonal line algorithm is extended to find principal graphs and complemented with two steps specific to the task of skeletonization: an initialization method to capture the approximate topology of the character, and a collection of restructuring operations to improve the structural quality of the skeleton produced by the initialization method. An advantage of our approach over existing methods is that we optimize the skeleton graph by minimizing an intuitive and explicit objective function that captures the two competing criteria of smoothing the skeleton and fitting it closely to the pixels of the character image. We tested the algorithm on isolated handwritten digits and images of continuous handwriting. The results indicated that the proposed algorithm can find a smooth medial axis in the great majority of a wide variety of character templates and that it substantially improves the pixel-wise skeleton obtained by traditional thinning methods.

  • Integrated position estimation using aerial image sequences

    Publication Year: 2002 , Page(s): 1 - 18
    Cited by:  Papers (35)  |  Patents (2)

    Presents an integrated system for navigation parameter estimation using sequential aerial images, where the navigation parameters represent the positional and velocity information of an aircraft for autonomous navigation. The proposed integrated system is composed of two parts: relative position estimation and absolute position estimation. Relative position estimation recursively computes the current position of an aircraft by accumulating relative displacement estimates extracted from two successive aerial images. Simple accumulation of parameter values reduces the reliability of the extracted parameter estimates as the aircraft continues to navigate, resulting in a large positional error. Therefore, absolute position estimation is required to compensate for the positional error generated by the relative position estimation. Absolute position estimation algorithms using image matching and digital elevation model (DEM) matching are presented. In the image matching, a robust-oriented Hausdorff measure (ROHM) is employed, whereas in the DEM matching, an algorithm using multiple image pairs is used. Experiments with four real aerial image sequences show the effectiveness of the proposed integrated position estimation algorithm.

  • Object tracking with Bayesian estimation of dynamic layer representations

    Publication Year: 2002 , Page(s): 75 - 89
    Cited by:  Papers (96)  |  Patents (2)

    Decomposing video frames into coherent 2D motion layers is a powerful method for representing videos. Such a representation provides an intermediate description that enables applications such as object tracking, video summarization and visualization, video insertion, and sprite-based video compression. Previous work on motion layer analysis has largely concentrated on two-frame or multi-frame batch formulations. The temporal coherency of motion layers and the domain constraints on shapes have not been exploited. This paper introduces a complete dynamic motion layer representation in which spatial and temporal constraints on shape, motion and layer appearance are modeled and estimated in a maximum a posteriori (MAP) framework using the generalized expectation-maximization (EM) algorithm. In order to limit the computational complexity of tracking arbitrarily shaped layer ownership, we propose a shape prior that parameterizes the representation of shape and prevents motion layers from evolving into arbitrary shapes. In this work, a Gaussian shape prior is chosen to specifically develop a near-real-time tracker for vehicle tracking in aerial videos. However, the general idea of using a parametric shape representation as part of the state of a tracker is a powerful one that can be extended to other domains as well. Based on the dynamic layer representation, an iterative algorithm is developed for continuous object tracking over time. The proposed method has been successfully applied in an airborne vehicle tracking system. Its performance is compared with that of a correlation-based tracker and a motion change-based tracker to demonstrate the advantages of the new method. Examples of tracking when the backgrounds are cluttered and the vehicles undergo various rigid motions and complex interactions such as passing, turning, and stop-and-go demonstrate the strength of the complete dynamic layer representation.

  • Multihierarchical graph search

    Publication Year: 2002 , Page(s): 103 - 113
    Cited by:  Papers (14)

    The use of hierarchical graph searching for finding paths in graphs is well known in the literature, providing better results than plain graph searching, with respect to computational costs, in many cases. This paper offers a step forward by including multiple hierarchies in a graph-based model. Such a multi-hierarchical model has the following advantages: First, a multiple hierarchy permits us to choose the best hierarchy to solve each search problem; second, when several search problems have to be solved, a multiple hierarchy provides the possibility of solving some of them simultaneously; and third, solutions to the search problems can be expressed in any of the hierarchies of the multiple hierarchy, which allows us to represent the information in the most suitable way for each specific purpose. In general, multiple hierarchies have proven to be a more adaptable model than single-hierarchy or non-hierarchical models. This paper formalizes the multi-hierarchical model, describes the techniques that have been designed for taking advantage of multiple hierarchies in a hierarchical path search, and presents some experiments and results on the performance of these techniques.

  • Extraction and optimization of B-spline PBD templates for recognition of connected handwritten digit strings

    Publication Year: 2002 , Page(s): 132 - 139
    Cited by:  Papers (6)

    The recognition of connected handwritten digit strings is a challenging task due mainly to two problems: poor character segmentation and unreliable isolated character recognition. The authors first present a rational B-spline representation of digit templates based on Pixel-to-Boundary Distance (PBD) maps. We then present a neural network approach to extract B-spline PBD templates and an evolutionary algorithm to optimize these templates. In total, 1000 templates (100 templates for each of 10 classes) were extracted from and optimized on 10426 training samples from the NIST Special Database 3. By using these templates, a nearest neighbor classifier can successfully reject 90.7 percent of nondigit patterns while achieving a 96.4 percent correct classification of isolated test digits. When our classifier is applied to the recognition of 4958 connected handwritten digit strings (4555 2-digit, 355 3-digit, and 48 4-digit strings) from the NIST Special Database 3 with a dynamic programming approach, it has a correct classification rate of 82.4 percent with a rejection rate as low as 0.85 percent. Our classifier compares favorably in terms of correct classification rate and robustness with other classifiers that are tested.

  • The radiometry of multiple images

    Publication Year: 2002 , Page(s): 19 - 33
    Cited by:  Papers (8)  |  Patents (1)

    We introduce a methodology for radiometric reconstruction (i.e., the simultaneous recovery of multiple illuminants and surface albedos from multiple views), assuming that the geometry of the scene and of the cameras is known. We formulate a linear theory of multiple illuminants and show its similarity to the theory of geometric recovery of multiple views. Linear and nonlinear implementations are proposed, simulation results are discussed and, finally, results on real images are presented.

  • Detecting faces in images: a survey

    Publication Year: 2002 , Page(s): 34 - 58
    Cited by:  Papers (970)  |  Patents (214)

    Images containing faces are essential to intelligent vision-based human-computer interaction, and research efforts in face processing include face recognition, face tracking, pose estimation and expression recognition. However, many reported methods assume that the faces in an image or an image sequence have been identified and localized. To build fully automated systems that analyze the information contained in face images, robust and efficient face detection algorithms are required. Given a single image, the goal of face detection is to identify all image regions which contain a face, regardless of its 3D position, orientation and lighting conditions. Such a problem is challenging because faces are non-rigid and have a high degree of variability in size, shape, color and texture. Numerous techniques have been developed to detect faces in a single image, and the purpose of this paper is to categorize and evaluate these algorithms. We also discuss relevant issues such as data collection, evaluation metrics and benchmarking. After analyzing these algorithms and identifying their limitations, we conclude with several promising directions for future research.

  • A note on Park and Chin's algorithm [structuring element decomposition]

    Publication Year: 2002 , Page(s): 139 - 144
    Cited by:  Papers (2)

    A finite subset of Z² is called a structuring element. A decomposition of a structuring element A is a sequence of subsets of the elementary square (i.e., the 3×3 square centered at the origin) such that the Minkowski addition of them is equal to A. H. Park and R.T. Chin (see ibid., vol.17, no.1, p.2-15, 1995) developed an algorithm for finding the optimal decomposition of simply connected structuring elements (i.e., 8-connected structuring elements that contain no holes), imposing the restriction that all subsets in this decomposition are also simply connected. The authors show that there exist infinite families of simply connected structuring elements that have decompositions but are not decomposable according to Park and Chin's definition.
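
The definition of a decomposition above can be illustrated with a minimal sketch. It assumes structuring elements are stored as Python sets of integer `(x, y)` pairs; the helper names `minkowski_sum`, `compose`, and `square` are hypothetical and are not taken from the paper. The example only checks the textbook fact that a 5×5 square is the Minkowski sum of two elementary 3×3 squares; it does not implement Park and Chin's algorithm.

```python
from itertools import product


def minkowski_sum(a, b):
    """Minkowski addition of two finite subsets of Z^2 (sets of (x, y) pairs)."""
    return {(ax + bx, ay + by) for (ax, ay) in a for (bx, by) in b}


def compose(factors):
    """Minkowski sum of a sequence of sets, starting from the origin alone."""
    result = {(0, 0)}
    for factor in factors:
        result = minkowski_sum(result, factor)
    return result


def square(radius):
    """Square of side 2 * radius + 1 centered at the origin."""
    coords = range(-radius, radius + 1)
    return set(product(coords, coords))


# The elementary square is the 3x3 square centered at the origin.
elementary = square(1)

# Two copies of the elementary square form a decomposition of the 5x5 square.
five_by_five = compose([elementary, elementary])
```

A candidate decomposition of any structuring element `A` can be verified the same way, by comparing `compose(factors)` with `A` and checking that every factor is a subset of `elementary`.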

  • Reconstruction of three-dimensional objects through matching of their parts

    Publication Year: 2002 , Page(s): 114 - 124
    Cited by:  Papers (29)  |  Patents (1)

    The problem of re-assembling an object from its parts or fragments has never been addressed with a unified computational approach, which depends on the pure geometric form of the parts and not on application-specific features. We propose a method for the automatic reconstruction of a model based on the geometry of its parts, which may be computer-generated models or range-scanned models. The matching process can benefit from any other external constraint imposed by the specific application.

  • Optical flow in log-mapped image plane - a new approach

    Publication Year: 2002 , Page(s): 125 - 131
    Cited by:  Papers (9)

    Foveating vision sensors are important in both machine and biological vision. The term space-variant or foveating vision refers to sensor architectures based on smooth variation of resolution across the visual field, like that of the human visual system. Traditional image processing techniques do not hold when applied directly to such an image representation since the translation symmetry and the neighborhood structure in the spatial domain are broken by the space-variant properties of the sensor. Unfortunately, there has been little systematic development of image processing tools that are explicitly designed for foveated vision. The author proposes a novel approach to compute the optical flow directly on log-mapped images. We propose the use of a generalized dynamic image model (GDIM) based method for computing the optical flow as opposed to the brightness constancy model (BCM) based method. We introduce a new notion of "variable window" and use the space-variant form of gradient operator while computing the spatio-temporal gradient in log-mapped images for better accuracy and to ensure that the local neighborhood is preserved. We emphasize that the proposed method must be numerically accurate, provide a consistent interpretation, and be capable of computing the peripheral motion. Experimental results on both synthetic and real images are presented to show the efficacy of the proposed method.
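
As a rough illustration of why a log-mapped image plane breaks translation symmetry, the sketch below uses the plain log-polar coordinate change (log r, θ). This specific mapping and the function names `to_log_polar` and `from_log_polar` are assumptions made for illustration; the abstract does not state which log mapping the paper uses, and the sketch is not the paper's optical flow method.

```python
import math


def to_log_polar(x, y):
    """Map a Cartesian point to (log r, theta); undefined at the origin."""
    r = math.hypot(x, y)
    if r == 0.0:
        raise ValueError("the log mapping is undefined at the origin")
    return math.log(r), math.atan2(y, x)


def from_log_polar(xi, eta):
    """Inverse mapping from (log r, theta) back to Cartesian coordinates."""
    r = math.exp(xi)
    return r * math.cos(eta), r * math.sin(eta)


# Scaling a point about the origin becomes a pure shift along the log-r axis.
xi1, eta1 = to_log_polar(3.0, 4.0)
xi2, eta2 = to_log_polar(6.0, 8.0)
shift = xi2 - xi1  # equals log(2) up to rounding; eta is unchanged
```

Under this mapping a uniform translation in the Cartesian plane is no longer a uniform shift, which is one way to see why operators designed for Cartesian images need a space-variant form.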

  • ICP registration using invariant features

    Publication Year: 2002 , Page(s): 90 - 102
    Cited by:  Papers (135)  |  Patents (4)

    Investigates the use of Euclidean invariant features in a generalization of iterative closest point (ICP) registration of range images. Pointwise correspondences are chosen as the closest point with respect to a weighted linear combination of positional and feature distances. It is shown that, under ideal noise-free conditions, correspondences formed using this distance function are correct more often than correspondences formed using the positional distance alone. In addition, monotonic convergence to at least a local minimum is shown to hold for this method. When noise is present, a method that automatically sets the optimal relative contribution of features and positions is described. This method trades off the error in feature values due to noise against the error in positions due to misalignment. Experimental results suggest that using invariant features decreases the probability of being trapped in a local minimum and may be an effective solution for difficult range image registration problems where the scene is very small compared to the model.
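
The correspondence rule described above can be sketched as follows. This is a minimal illustration under stated assumptions: the cost is taken to be squared positional distance plus a weight `alpha` times squared feature distance, the search is brute force, and the names `sq_dist` and `closest_with_features` are hypothetical. The exact distance function and the automatic choice of the weight in the paper may differ.

```python
def sq_dist(u, v):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def closest_with_features(query_pos, query_feat, model, alpha):
    """Index of the model point that minimizes the combined cost.

    model is a list of (position, feature) pairs. alpha is the assumed
    weight on the feature term; alpha = 0 reduces to position-only matching.
    """
    best_index, best_cost = None, float("inf")
    for index, (pos, feat) in enumerate(model):
        cost = sq_dist(query_pos, pos) + alpha * sq_dist(query_feat, feat)
        if cost < best_cost:
            best_index, best_cost = index, cost
    return best_index


# Two model points that are close in position but differ in feature value.
model = [((0.0, 0.0, 0.0), (1.0,)), ((0.1, 0.0, 0.0), (5.0,))]
query_pos, query_feat = (0.08, 0.0, 0.0), (1.0,)

by_position = closest_with_features(query_pos, query_feat, model, alpha=0.0)
by_feature = closest_with_features(query_pos, query_feat, model, alpha=1.0)
```

With `alpha=0.0` the query is matched to the positionally nearest point, while a nonzero weight lets the matching feature value override a small positional difference, which is the effect the abstract attributes to invariant features.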

Aims & Scope

The IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI) is published monthly. Its editorial board strives to present the most important research results in areas within TPAMI's scope.

Meet Our Editors

Editor-in-Chief
David A. Forsyth
University of Illinois