
Pattern Analysis and Machine Intelligence, IEEE Transactions on

Issue 5 • Date May 2009

  • [Front cover]

    Page(s): c1
    Freely Available from IEEE
  • [Inside front cover]

    Page(s): c2
    Freely Available from IEEE
  • Approximate Matching of Digital Point Sets Using a Novel Angular Tree

    Page(s): 769 - 782

    Matching and analysis of patterns or shapes in the digital plane are of utmost importance in various problems of computer vision and pattern recognition. A digital point set is such a pattern that corresponds to an object in the digital plane. Although there exist several data structures that can be employed for Approximate Point Set Pattern Matching (APSPM) in the real domain, they require substantial modification to support algorithms in the digital domain. To bridge this gap, a novel data structure called "angular tree" is proposed, targeting an efficient and error-controllable circular range query in the digital plane. The farthest pair of points may be used as the starting correspondence between the pattern set and the background set. Several classical discrete structures and methodologies of computational geometry, as well as some topological features of circles/discs in digital geometry, have been used in tandem, for successful realization of the proposed APSPM algorithm in the digital plane. The APSPM algorithm based on the angular tree has been implemented and tested on various point sets and the reported results demonstrate the efficiency and versatility of the new data structure for supporting APSPM algorithms.

  • Branch-and-Bound Methods for Euclidean Registration Problems

    Page(s): 783 - 794

    In this paper, we propose a practical and efficient method for finding the globally optimal solution to the problem of determining the pose of an object. We present a framework that allows us to use point-to-point, point-to-line, and point-to-plane correspondences for solving various types of pose and registration problems involving Euclidean (or similarity) transformations. Traditional methods such as the iterative closest point algorithm or bundle adjustment methods for camera pose may get trapped in local minima due to the nonconvexity of the corresponding optimization problem. Our approach of solving the mathematical optimization problems guarantees global optimality. The optimization scheme is based on ideas from global optimization theory, in particular convex underestimators in combination with branch-and-bound methods. We provide a provably optimal algorithm and demonstrate good performance on both synthetic and real data. We also give examples of where traditional methods fail due to the local minima problem.

  • Distribution-Based Dimensionality Reduction Applied to Articulated Motion Recognition

    Page(s): 795 - 810

    Some articulated motion representations rely on frame-wise abstractions of the statistical distribution of low-level features such as orientation, color, or relational distributions. As configuration among parts changes with articulated motion, the distribution changes, tracing a trajectory in the latent space of distributions, which we call the configuration space. These trajectories can then be used for recognition using standard techniques such as dynamic time warping. The core theory in this paper concerns embedding the frame-wise distributions, which can be looked upon as probability functions, into a low-dimensional space so that we can estimate various meaningful probabilistic distances such as the Chernoff, Bhattacharyya, Matusita, Kullback-Leibler (KL) or symmetric-KL distances based on dot products between points in this space. Apart from computational advantages, this representation also affords speed-normalized matching of motion signatures. Speed-normalized representations can be formed by interpolating the configuration trajectories along their arc lengths, without using any knowledge of the temporal scale variations between the sequences. We experiment with five different probabilistic distance measures and show the usefulness of the representation in three different contexts: sign recognition (with a large number of possible classes), gesture recognition (with person variations), and classification of human-human interaction sequences (with segmentation problems). We find that it is important to use the right distance measure for each situation. The low-dimensional embedding makes matching two to three times faster, while achieving recognition accuracies that are close to those obtained without using a low-dimensional embedding. We also empirically establish the robustness of the representation with respect to low-level parameters, embedding parameters, and temporal-scale parameters.

  • Image Transformations and Blurring

    Page(s): 811 - 823

    Since cameras blur the incoming light during measurement, different images of the same surface do not contain the same information about that surface. Thus, in general, corresponding points in multiple views of a scene have different image intensities. While multiple-view geometry constrains the locations of corresponding points, it does not give relationships between the signals at corresponding locations. This paper offers an elementary treatment of these relationships. We first develop the notion of "ideal" and "real" images, corresponding to, respectively, the raw incoming light and the measured signal. This framework separates the filtering and geometric aspects of imaging. We then consider how to synthesize one view of a surface from another; if the transformation between the two views is affine, it emerges that this is possible if and only if the singular values of the affine matrix are positive. Next, we consider how to combine the information in several views of a surface into a single output image. By developing a new tool called "frequency segmentation," we show how this can be done despite not knowing the blurring kernel.

  • Make3D: Learning 3D Scene Structure from a Single Still Image

    Page(s): 824 - 840

    We consider the problem of estimating detailed 3D structure from a single still image of an unstructured environment. Our goal is to create 3D models that are both quantitatively accurate as well as visually pleasing. For each small homogeneous patch in the image, we use a Markov random field (MRF) to infer a set of "plane parameters" that capture both the 3D location and 3D orientation of the patch. The MRF, trained via supervised learning, models both image depth cues as well as the relationships between different parts of the image. Other than assuming that the environment is made up of a number of small planes, our model makes no explicit assumptions about the structure of the scene; this enables the algorithm to capture much more detailed 3D structure than does prior art and also give a much richer experience in the 3D flythroughs created using image-based rendering, even for scenes with significant nonvertical structure. Using this approach, we have created qualitatively correct 3D models for 64.9 percent of 588 images downloaded from the Internet. We have also extended our model to produce large-scale 3D models from a few images.

  • Low-Rank Matrix Fitting Based on Subspace Perturbation Analysis with Applications to Structure from Motion

    Page(s): 841 - 854
    Multimedia

    The task of finding a low-rank (r) matrix that best fits an original data matrix of higher rank is a recurring problem in science and engineering. The problem becomes especially difficult when the original data matrix has some missing entries and contains an unknown additive noise term in the remaining elements. The former problem can be solved by concatenating a set of r-column matrices that share a common single r-dimensional solution space. Unfortunately, the number of possible submatrices is generally very large and, hence, the results obtained with one set of r-column matrices will generally be different from that captured by a different set. Ideally, we would like to find that solution that is least affected by noise. This requires that we determine which of the r-column matrices (i.e., which of the original feature points) are less influenced by the unknown noise term. This paper presents a criterion to successfully carry out such a selection. Our key result is to formally prove that the more distinct the r vectors of the r-column matrices are, the less they are swayed by noise. This key result is then combined with the use of a noise model to derive an upper bound for the effect that noise and occlusions have on each of the r-column matrices. It is shown how this criterion can be effectively used to recover the noise-free matrix of rank r. Finally, we derive the affine and projective structure-from-motion (SFM) algorithms using the proposed criterion. Extensive validation on synthetic and real data sets shows the superiority of the proposed approach over the state of the art.

  • A Novel Connectionist System for Unconstrained Handwriting Recognition

    Page(s): 855 - 868

    Recognizing lines of unconstrained handwritten text is a challenging task. The difficulty of segmenting cursive or overlapping characters, combined with the need to exploit surrounding context, has led to low recognition rates for even the best current recognizers. Most recent progress in the field has been made either through improved preprocessing or through advances in language modeling. Relatively little work has been done on the basic recognition algorithms. Indeed, most systems rely on the same hidden Markov models that have been used for decades in speech and handwriting recognition, despite their well-known shortcomings. This paper proposes an alternative approach based on a novel type of recurrent neural network, specifically designed for sequence labeling tasks where the data is hard to segment and contains long-range bidirectional interdependencies. In experiments on two large unconstrained handwriting databases, our approach achieves word recognition accuracies of 79.7 percent on online data and 74.1 percent on offline data, significantly outperforming a state-of-the-art HMM-based system. In addition, we demonstrate the network's robustness to lexicon size, measure the individual influence of its hidden layers, and analyze its use of context. Last, we provide an in-depth discussion of the differences between the network and HMMs, suggesting reasons for the network's superior performance.

  • NV-Tree: An Efficient Disk-Based Index for Approximate Search in Very Large High-Dimensional Collections

    Page(s): 869 - 883

    Over the last two decades, much research effort has been spent on nearest neighbor search in high-dimensional data sets. Most of the approaches published thus far have, however, only been tested on rather small collections. When large collections have been considered, high-performance environments have been used, in particular systems with a large main memory. Accessing data on disk has largely been avoided because disk operations are considered to be too slow. It has been shown, however, that using large amounts of memory is generally not an economic choice. Therefore, we propose the NV-tree, which is a very efficient disk-based data structure that can give good approximate answers to nearest neighbor queries with a single disk operation, even for very large collections of high-dimensional data. Using a single NV-tree, the returned results have high recall but contain a number of false positives. By combining two or three NV-trees, most of those false positives can be avoided while retaining the high recall. Finally, we compare the NV-tree to locality sensitive hashing, a popular method for ε-distance search. We show that they return results of similar quality, but the NV-tree uses many fewer disk reads.

  • Robust Estimation of Albedo for Illumination-Invariant Matching and Shape Recovery

    Page(s): 884 - 899

    We present a nonstationary stochastic filtering framework for the task of albedo estimation from a single image. There are several approaches in the literature for albedo estimation, but few include the errors in estimates of surface normals and light source direction to improve the albedo estimate. The proposed approach effectively utilizes the error statistics of surface normals and illumination direction for robust estimation of albedo, for images illuminated by single and multiple light sources. The albedo estimate obtained is subsequently used to generate albedo-free normalized images for recovering the shape of an object. Traditional shape-from-shading (SFS) approaches often assume constant/piecewise constant albedo and known light source direction to recover the underlying shape. Using the estimated albedo, the general problem of estimating the shape of an object with varying albedo map and unknown illumination source is reduced to one that can be handled by traditional SFS approaches. Experimental results are provided to show the effectiveness of the approach and its application to illumination-invariant matching and shape recovery. The estimated albedo maps are compared with the ground truth. The maps are used as illumination-invariant signatures for the task of face recognition across illumination variations. The recognition results obtained compare well with the current state-of-the-art approaches. Impressive shape recovery results are obtained using images downloaded from the Web with little control over imaging conditions. The recovered shapes are also used to synthesize novel views under novel illumination conditions.

  • Transitions of the 3D Medial Axis under a One-Parameter Family of Deformations

    Page(s): 900 - 918

    The instabilities of the medial axis of a shape under deformations have long been recognized as a major obstacle to its use in recognition and other applications. These instabilities, or transitions, occur when the structure of the medial axis graph changes abruptly under deformations of shape. The recent classification of these transitions in 2D for the medial axis and for the shock graph was a key factor in the development of an object recognition system where the classified instabilities were utilized to represent deformation paths. The classification of generic transitions of the 3D medial axis could likewise potentially lead to a similar representation in 3D. In this paper, these transitions are classified by examining the order of contact of spheres with the surface, leading to an enumeration of possible transitions which are then examined on a case-by-case basis. Some cases are ruled out as never occurring in any family of deformations, while others are shown to be nongeneric in a one-parameter family of deformations. Finally, the remaining cases are shown to be viable by developing a specific example for each. Our work is inspired by that of Bogaevsky, who obtained the transitions as part of an investigation of viscosity solutions of Hamilton-Jacobi equations. Our contribution is to give a more down-to-earth approach, bringing this work to the attention of the computer vision community, and to provide explicit constructions for the various transitions using simple surfaces. We believe that the classification of these transitions is vital to the successful regularization of the medial axis in its use in real applications.

  • Visual Tracking by Continuous Density Propagation in Sequential Bayesian Filtering Framework

    Page(s): 919 - 930

    Particle filtering is frequently used for visual tracking problems since it provides a general framework for estimating and propagating probability density functions for nonlinear and non-Gaussian dynamic systems. However, this algorithm is based on a Monte Carlo approach and the cost of sampling and measurement is a problematic issue, especially for high-dimensional problems. We describe an alternative to the classical particle filter in which the underlying density function has an analytic representation for better approximation and effective propagation. The techniques of density interpolation and density approximation are introduced to represent the likelihood and the posterior densities with Gaussian mixtures, where all relevant parameters are automatically determined. The proposed analytic approach is shown to perform more efficiently in sampling in high-dimensional space. We apply the algorithm to real-time tracking problems and demonstrate its performance on real video sequences as well as synthetic examples.

  • Asymmetric Principal Component and Discriminant Analyses for Pattern Classification

    Page(s): 931 - 937

    This paper studies the roles of principal component and discriminant analyses in pattern classification and explores their problems with asymmetric classes and/or unbalanced training data. An asymmetric principal component analysis (APCA) is proposed to remove the unreliable dimensions more effectively than the conventional PCA. Targeted at the two-class problem, an asymmetric discriminant analysis in the APCA subspace is proposed to regularize the eigenvalue that is, in general, a biased estimate of the variance in the corresponding dimension. These efforts facilitate a reliable and discriminative feature extraction for asymmetric classes and/or unbalanced training data. The proposed approach is validated in the experiments by comparing it with the related methods. It consistently achieves the highest classification accuracy among all tested methods in the experiments.

  • Estimating 3D Positions and Velocities of Projectiles from Monocular Views

    Page(s): 938 - 944

    In this paper, we consider the problem of localizing a projectile in 3D based on its apparent motion in a stationary monocular view. A thorough theoretical analysis is developed, from which we establish the minimum conditions for the existence of a unique solution. The theoretical results obtained have important implications for applications involving projectile motion. A robust, nonlinear optimization-based formulation is proposed, and the use of a local optimization method is justified by detailed examination of the local convexity structure of the cost function. The potential of this approach is validated by experimental results.

  • Skeletal Shape Abstraction from Examples

    Page(s): 944 - 952

    Learning a class prototype from a set of exemplars is an important challenge facing researchers in object categorization. Although the problem is receiving growing interest, most approaches assume a one-to-one correspondence among local features, restricting their ability to learn true abstractions of a shape. In this paper, we present a new technique for learning an abstract shape prototype from a set of exemplars whose features are in many-to-many correspondence. Focusing on the domain of 2D shape, we represent a silhouette as a medial axis graph whose nodes correspond to "parts" defined by medial branches and whose edges connect adjacent parts. Given a pair of medial axis graphs, we establish a many-to-many correspondence between their nodes to find correspondences among articulating parts. Based on these correspondences, we recover the abstracted medial axis graph along with the positional and radial attributes associated with its nodes. We evaluate the abstracted prototypes in the context of a recognition task.

  • Simultaneous Localized Feature Selection and Model Detection for Gaussian Mixtures

    Page(s): 953 - 960

    In this paper, we propose a novel approach to simultaneous localized feature selection and model detection for unsupervised learning. In our approach, local feature saliency, together with other parameters of Gaussian mixtures, is estimated by Bayesian variational learning. Experiments performed on both synthetic and real-world data sets demonstrate that our approach is superior to both global feature selection and subspace clustering methods.

  • TPAMI Information for authors

    Page(s): c3
    Freely Available from IEEE
  • [Back cover]

    Page(s): c4
    Freely Available from IEEE

Aims & Scope

The IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI) is published monthly. Its editorial board strives to present the most important research results in areas within TPAMI's scope.

Meet Our Editors

Editor-in-Chief
David A. Forsyth
University of Illinois