Pattern Analysis and Machine Intelligence, IEEE Transactions on

Issue 12 • Date Dec. 2009

  • [Front cover]

    Page(s): c1
    Freely Available from IEEE
  • [Inside front cover]

    Page(s): c2
    Freely Available from IEEE
  • Guest Editors' Introduction to the Special Section on Award Winning Papers from the IEEE CS Conference on Computer Vision and Pattern Recognition (CVPR)

    Page(s): 2113 - 2114
    Freely Available from IEEE
  • Global Stereo Reconstruction under Second-Order Smoothness Priors

    Page(s): 2115 - 2128

    Second-order priors on the smoothness of 3D surfaces are a better model of typical scenes than first-order priors. However, stereo reconstruction using global inference algorithms, such as graph cuts, has not been able to incorporate second-order priors because the triple cliques needed to express them yield intractable (nonsubmodular) optimization problems. This paper shows that inference with triple cliques can be effectively performed. Our optimization strategy is a development of recent extensions to alpha-expansion, based on the "QPBO" algorithm. The strategy is to repeatedly merge proposal depth maps using a novel extension of QPBO. Proposal depth maps can come from any source, for example, frontoparallel planes as in alpha-expansion, or indeed any existing stereo algorithm, with arbitrary parameter settings.

  • Efficient Subwindow Search: A Branch and Bound Framework for Object Localization

    Page(s): 2129 - 2142

    Most successful object recognition systems rely on binary classification, deciding only if an object is present or not, but not providing information on the actual object location. To estimate the object's location, one can take a sliding window approach, but this strongly increases the computational cost because the classifier or similarity function has to be evaluated over a large set of candidate subwindows. In this paper, we propose a simple yet powerful branch and bound scheme that allows efficient maximization of a large class of quality functions over all possible subimages. It converges to a globally optimal solution typically in linear or even sublinear time, in contrast to the quadratic scaling of exhaustive or sliding window search. We show how our method is applicable to different object detection and image retrieval scenarios. The achieved speedup allows the use of classifiers for localization that formerly were considered too slow for this task, such as SVMs with a spatial pyramid kernel or nearest-neighbor classifiers based on the χ² distance. We demonstrate state-of-the-art localization performance of the resulting systems on the UIUC Cars data set, the PASCAL VOC 2006 data set, and in the PASCAL VOC 2007 competition.

  • Fast Similarity Search for Learned Metrics

    Page(s): 2143 - 2157

    We introduce a method that enables scalable similarity search for learned metrics. Given pairwise similarity and dissimilarity constraints between some examples, we learn a Mahalanobis distance function that captures the examples' underlying relationships well. To allow sublinear time similarity search under the learned metric, we show how to encode the learned metric parameterization into randomized locality-sensitive hash functions. We further formulate an indirect solution that enables metric learning and hashing for vector spaces whose high dimensionality makes it infeasible to learn an explicit transformation over the feature dimensions. We demonstrate the approach applied to a variety of image data sets, as well as a systems data set. The learned metrics improve accuracy relative to commonly used metric baselines, while our hashing construction enables efficient indexing with learned distances and very large databases.

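    As a rough illustration of how a learned Mahalanobis metric can be folded into locality-sensitive hashing, the sketch below applies random-hyperplane hashing after a linear map G, where the metric matrix is assumed to factor as A = GᵀG. The abstract does not spell out the construction, so the factorization, the function names (matvec, make_hyperplanes, hash_code), and the example values are all assumptions made for illustration, not the authors' implementation.

```python
import random

def matvec(M, x):
    """Multiply matrix M (list of rows) by vector x."""
    return [sum(m_ij * x_j for m_ij, x_j in zip(row, x)) for row in M]

def make_hyperplanes(dim, n_bits, seed=0):
    """Draw n_bits random Gaussian directions of length dim (hypothetical helper)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]

def hash_code(x, G, hyperplanes):
    """One bit per hyperplane r: 1 if r . (G x) >= 0, else 0.

    Hashing G x instead of x makes bit collisions reflect similarity under
    the metric A = G^T G (an assumed parameterization).
    """
    gx = matvec(G, x)
    return tuple(
        1 if sum(r_i * g_i for r_i, g_i in zip(r, gx)) >= 0 else 0
        for r in hyperplanes
    )

# Example values, chosen arbitrarily for illustration.
G = [[2.0, 0.0], [0.5, 1.0]]
planes = make_hyperplanes(dim=2, n_bits=8)
code_a = hash_code([1.0, 0.2], G, planes)
code_b = hash_code([1.1, 0.25], G, planes)
hamming = sum(a != b for a, b in zip(code_a, code_b))
print(code_a, code_b, hamming)
```

    Points whose codes agree in many bits are candidate neighbors; a real index would group codes into hash tables rather than compare them pairwise.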
  • Epitomic Location Recognition

    Page(s): 2158 - 2167

    This paper presents a novel method for location recognition, which exploits an epitomic representation to achieve both high efficiency and good generalization. A generative model based on epitomic image analysis captures the appearance and geometric structure of an environment while allowing for variations due to motion, occlusions, and non-Lambertian effects. The ability to model translation and scale invariance together with the fusion of diverse visual features yields enhanced generalization with economical training. Experiments on both existing and new labeled image databases result in recognition accuracy superior to state of the art with real-time computational performance.

  • Leave-One-Out-Training and Leave-One-Out-Testing Hidden Markov Models for a Handwritten Numeral Recognizer: The Implications of a Single Classifier and Multiple Classifications

    Page(s): 2168 - 2178

    Hidden Markov models (HMMs) have been shown to be useful in handwritten pattern recognition. However, owing to their fundamental structure, they have little resistance to unexpected noise among observation sequences. In other words, unexpected noise in a sequence might "break" the normal transmission of states for this sequence, making it unrecognizable to trained models. To resolve this problem, we propose a leave-one-out-training strategy, which will make the models more robust. We also propose a leave-one-out-testing method, which will compensate for some of the negative effects of this noise. The latter is actually an example of a system with a single classifier and multiple classifications. Compared with the 98.00 percent accuracy of the benchmark HMMs, the new system achieves a 98.88 percent accuracy rate on handwritten digits.

  • Monocular Pedestrian Detection: Survey and Experiments

    Page(s): 2179 - 2195

    Pedestrian detection is a rapidly evolving area in computer vision with key applications in intelligent vehicles, surveillance, and advanced robotics. The objective of this paper is to provide an overview of the current state of the art from both methodological and experimental perspectives. The first part of the paper consists of a survey. We cover the main components of a pedestrian detection system and the underlying models. The second (and larger) part of the paper contains a corresponding experimental study. We consider a diverse set of state-of-the-art systems: wavelet-based AdaBoost cascade, HOG/linSVM, NN/LRF, and combined shape-texture detection. Experiments are performed on an extensive data set captured onboard a vehicle driving through an urban environment. The data set includes many thousands of training samples as well as a 27-minute test sequence involving more than 20,000 images with annotated pedestrian locations. We consider a generic evaluation setting and one specific to pedestrian detection onboard a vehicle. Results indicate a clear advantage of HOG/linSVM at higher image resolutions and lower processing speeds, and a superiority of the wavelet-based AdaBoost cascade approach at lower image resolutions and (near) real-time processing speeds. The data set (8.5 GB) is made public for benchmarking purposes.

  • Multiple-Target Tracking by Spatiotemporal Monte Carlo Markov Chain Data Association

    Page(s): 2196 - 2210

    We propose a framework for tracking multiple targets, where the input is a set of candidate regions in each frame, as obtained from a state-of-the-art background segmentation module, and the goal is to recover trajectories of targets over time. Due to occlusions by targets and static objects, as well as noisy segmentation and false alarms, one foreground region may not correspond to one target faithfully. Therefore, the one-to-one assumption used in most data association algorithms is not always satisfied. Our method overcomes the one-to-one assumption by formulating the visual tracking problem in terms of finding the best spatial and temporal association of observations, which maximizes the consistency of both motion and appearance of trajectories. To avoid enumerating all possible solutions, we take a data-driven Markov Chain Monte Carlo (DD-MCMC) approach to sample the solution space efficiently. The sampling is driven by an informed proposal scheme controlled by a joint probability model combining motion and appearance. Comparative experiments with quantitative evaluations are provided.

  • Ordinal Measures for Iris Recognition

    Page(s): 2211 - 2226

    Images of a human iris contain rich texture information useful for identity authentication. A key and still open issue in iris recognition is how best to represent such textural information using a compact set of features (iris features). In this paper, we propose using ordinal measures for iris feature representation with the objective of characterizing qualitative relationships between iris regions rather than precise measurements of iris image structures. Such a representation may lose some image-specific information, but it achieves a good trade-off between distinctiveness and robustness. We show that ordinal measures are intrinsic features of iris patterns and largely invariant to illumination changes. Moreover, compactness and low computational complexity of ordinal measures enable highly efficient iris recognition. Ordinal measures are a general concept useful for image analysis and many variants can be derived for ordinal feature extraction. In this paper, we develop multilobe differential filters to compute ordinal measures with flexible intralobe and interlobe parameters such as location, scale, orientation, and distance. Experimental results on three public iris image databases demonstrate the effectiveness of the proposed ordinal feature models.

  • A Path Following Algorithm for the Graph Matching Problem

    Page(s): 2227 - 2242

    We propose a convex-concave programming approach for the labeled weighted graph matching problem. The convex-concave programming formulation is obtained by rewriting the weighted graph matching problem as a least-squares problem on the set of permutation matrices and relaxing it to two different optimization problems: a quadratic convex and a quadratic concave optimization problem on the set of doubly stochastic matrices. The concave relaxation has the same global minimum as the initial graph matching problem, but the search for its global minimum is also a hard combinatorial problem. We, therefore, construct an approximation of the concave problem solution by following a solution path of a convex-concave problem obtained by linear interpolation of the convex and concave formulations, starting from the convex relaxation. This method makes it easy to integrate the information on graph label similarities into the optimization problem and, therefore, to perform labeled weighted graph matching. The algorithm is compared with some of the best performing graph matching methods on four data sets: simulated graphs, QAPLib, retina vessel images, and handwritten Chinese characters. In all cases, the results are competitive with the state of the art.

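    The formulation described in this abstract can be written out compactly. The notation below is assumed for illustration and is not taken from the listing: A_G and A_H denote the adjacency matrices of the two graphs, P a permutation matrix, and F_0 and F_1 the convex and concave relaxations defined on the doubly stochastic matrices. The first line is the least-squares form of weighted graph matching (the two norms agree because P is orthogonal); the second is the linear interpolation whose solution path is followed from λ = 0 to λ = 1.

```latex
% Assumed notation: \mathcal{P} = set of permutation matrices,
% \mathcal{D} = set of doubly stochastic matrices.
\begin{align}
  \min_{P \in \mathcal{P}} \; \lVert A_G - P A_H P^{\top} \rVert_F^2
    &= \min_{P \in \mathcal{P}} \; \lVert A_G P - P A_H \rVert_F^2 , \\
  F_{\lambda}(P) &= (1 - \lambda)\, F_0(P) + \lambda\, F_1(P),
    \qquad P \in \mathcal{D},\; \lambda \in [0, 1] .
\end{align}
```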
  • Single-Image Vignetting Correction

    Page(s): 2243 - 2256

    In this paper, we propose a method for robustly determining the vignetting function given only a single image. Our method is designed to handle both textured and untextured regions in order to maximize the use of available information. To extract vignetting information from an image, we present adaptations of segmentation techniques that locate image regions with reliable data for vignetting estimation. Within each image region, our method capitalizes on the frequency characteristics and physical properties of vignetting to distinguish it from other sources of intensity variation. Rejection of outlier pixels is applied to improve the robustness of vignetting estimation. Comprehensive experiments demonstrate the effectiveness of this technique on a broad range of images with both simulated and natural vignetting effects. Causes of failures using the proposed algorithm are also analyzed.

  • Variational Curve Skeletons Using Gradient Vector Flow

    Page(s): 2257 - 2274

    Representing a 3D shape by a set of 1D curves that are locally symmetric with respect to its boundary (i.e., curve skeletons) is of importance in several machine intelligence tasks. This paper presents a fast, automatic, and robust variational framework for computing continuous, subvoxel accurate curve skeletons from volumetric objects. A reference point inside the object is considered a point source that transmits two wave fronts of different energies. The first front (beta-front) converts the object into a graph, from which the object's salient topological nodes are determined. Curve skeletons are tracked from these nodes along the cost field constructed by the second front (alpha-front) until the point source is reached. The accuracy and robustness of the proposed work are validated against competing techniques as well as a database of 3D objects. Unlike other state-of-the-art techniques, the proposed framework is highly robust because it avoids locating and classifying skeletal junction nodes, employs a new energy that does not form medial surfaces, and finally extracts curve skeletons that correspond to the most prominent parts of the shape and hence are less sensitive to noise.

  • Information Loss of the Mahalanobis Distance in High Dimensions: Application to Feature Selection

    Page(s): 2275 - 2281

    When an infinite training set is used, the Mahalanobis distance between a pattern measurement vector of dimensionality D and the center of the class it belongs to is distributed as χ² with D degrees of freedom. However, the distribution of the Mahalanobis distance becomes either Fisher or Beta depending on whether cross validation or resubstitution is used for parameter estimation in finite training sets. The total variation between χ² and Fisher, as well as between χ² and Beta, allows us to measure the information loss in high dimensions. The information loss is then exploited to set a lower limit for the correct classification rate achieved by the Bayes classifier that is used in subset feature selection.

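    The infinite-training-set case in this abstract can be checked numerically. The sketch below draws Gaussian samples with known mean and variances, computes the squared Mahalanobis distance to the class center, and compares the sample mean and variance with the values D and 2D expected for a chi-squared variable with D degrees of freedom. The diagonal covariance, the use of the squared distance, and the helper names are simplifying assumptions for illustration; the finite-sample Fisher and Beta cases are not simulated here.

```python
import random

def squared_mahalanobis_diag(x, mean, var):
    """Squared Mahalanobis distance for a diagonal covariance (assumed for simplicity)."""
    return sum((xi - mi) ** 2 / vi for xi, mi, vi in zip(x, mean, var))

def simulate(dim, n_samples, seed=0):
    """Sample Gaussian vectors and return their squared distances to the true center."""
    rng = random.Random(seed)
    mean = [rng.uniform(-1.0, 1.0) for _ in range(dim)]
    var = [rng.uniform(0.5, 2.0) for _ in range(dim)]
    dists = []
    for _ in range(n_samples):
        x = [rng.gauss(m, v ** 0.5) for m, v in zip(mean, var)]
        # True parameters are used, mimicking an infinite training set.
        dists.append(squared_mahalanobis_diag(x, mean, var))
    return dists

D = 10
d2 = simulate(D, 20000)
sample_mean = sum(d2) / len(d2)
sample_var = sum((d - sample_mean) ** 2 for d in d2) / (len(d2) - 1)
# A chi-squared variable with D degrees of freedom has mean D and variance 2D.
print(sample_mean, sample_var)
```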
  • Optimal Reconstruction of Approximate Planar Surfaces Using Photometric Stereo

    Page(s): 2282 - 2289

    Photometric stereo can be used to obtain a fast and noncontact surface reconstruction of Lambertian surfaces. Despite several published works concerning the uncertainties and optimal light configurations of photometric stereo, no solutions for optimal surface reconstruction from noisy real images have been proposed. In this paper, optimal surface reconstruction methods for approximate planar textured surfaces using photometric stereo are derived, given that the statistics of imaging errors are measurable. Simulated and real surfaces are experimentally studied, and the results validate that the proposed approaches improve the surface reconstruction, especially for the high-frequency height variations.

  • TurboPixels: Fast Superpixels Using Geometric Flows

    Page(s): 2290 - 2297

    We describe a geometric-flow-based algorithm for computing a dense oversegmentation of an image, often referred to as superpixels. It produces segments that, on one hand, respect local image boundaries, while, on the other hand, limiting undersegmentation through a compactness constraint. It is very fast, with complexity that is approximately linear in image size, and can be applied to megapixel sized images with high superpixel densities in a matter of minutes. We show qualitative demonstrations of high-quality results on several complex images. The Berkeley database is used to quantitatively compare its performance to a number of oversegmentation algorithms, showing that it yields less undersegmentation than algorithms that lack a compactness constraint while offering a significant speedup over N-cuts, which does enforce compactness.

  • Using Stereo Matching with General Epipolar Geometry for 2D Face Recognition across Pose

    Page(s): 2298 - 2304

    Face recognition across pose is a problem of fundamental importance in computer vision. We propose to address this problem by using stereo matching to judge the similarity of two 2D images of faces seen from different poses. Stereo matching allows for arbitrary, physically valid, continuous correspondences. We show that the stereo matching cost provides a very robust measure of similarity of faces that is insensitive to pose variations. To enable this, we show that, for conditions common in face recognition, the epipolar geometry of face images can be computed using either four or three feature points. We also provide a straightforward adaptation of a stereo matching algorithm to compute the similarity between faces. The proposed approach has been tested on the CMU PIE data set and demonstrates superior performance compared to existing methods in the presence of pose variation. It also shows robustness to lighting variation.

  • TPAMI Information for authors

    Page(s): c3
    Freely Available from IEEE
  • [Back cover]

    Page(s): c4
    Freely Available from IEEE

Aims & Scope

The IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI) is published monthly. Its editorial board strives to present the most important research results in areas within TPAMI's scope.

Meet Our Editors

Editor-in-Chief
David A. Forsyth
University of Illinois