
IEEE Transactions on Pattern Analysis and Machine Intelligence

Issue 2 • February 2009

  • [Front cover]

    Publication Year: 2009 , Page(s): c1
  • [Inside front cover]

    Publication Year: 2009 , Page(s): c2
  • Offline Loop Investigation for Handwriting Analysis

    Publication Year: 2009 , Page(s): 193 - 209
    Cited by:  Papers (4)

    Resolution of different types of loops in handwritten script is a difficult task and an important step in many classic word recognition systems, writer modeling, and signature verification. When processing a handwritten script, a great deal of ambiguity occurs when strokes overlap, merge, or intersect. This paper presents novel loop modeling and contour-based handwriting analysis techniques that improve loop investigation. We show excellent results on various loop resolution scenarios, including axial loop understanding and collapsed loop recovery. We demonstrate our approach for loop investigation on several realistic data sets of static binary images and compare with the ground truth of the genuine online signal.

  • Robust Face Recognition via Sparse Representation

    Publication Year: 2009 , Page(s): 210 - 227
    Cited by:  Papers (1127)  |  Patents (5)

    We consider the problem of automatically recognizing human faces from frontal views with varying expression and illumination, as well as occlusion and disguise. We cast the recognition problem as one of classifying among multiple linear regression models and argue that new theory from sparse signal representation offers the key to addressing this problem. Based on a sparse representation computed by ℓ1-minimization, we propose a general classification algorithm for (image-based) object recognition. This new framework provides new insights into two crucial issues in face recognition: feature extraction and robustness to occlusion. For feature extraction, we show that if sparsity in the recognition problem is properly harnessed, the choice of features is no longer critical. What is critical, however, is whether the number of features is sufficiently large and whether the sparse representation is correctly computed. Unconventional features such as downsampled images and random projections perform just as well as conventional features such as eigenfaces and Laplacianfaces, as long as the dimension of the feature space surpasses a certain threshold predicted by the theory of sparse representation. This framework can handle errors due to occlusion and corruption uniformly by exploiting the fact that these errors are often sparse with respect to the standard (pixel) basis. The theory of sparse representation helps predict how much occlusion the recognition algorithm can handle and how to choose the training images to maximize robustness to occlusion. We conduct extensive experiments on publicly available databases to verify the efficacy of the proposed algorithm and corroborate the above claims.

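    A minimal sketch of the SRC decision rule described in the abstract, assuming a dictionary of ℓ2-normalized training columns and integer class labels; the ℓ1 step is solved as a basis-pursuit linear program with SciPy. This is an illustration under those assumptions, not the authors' implementation.

```python
import numpy as np
from scipy.optimize import linprog

def src_classify(A, labels, y):
    """Sparse representation-based classification (SRC), sketched.

    A      : (d, n) dictionary whose columns are l2-normalized training samples
    labels : (n,) integer class label of each column
    y      : (d,) test sample
    """
    d, n = A.shape
    # Basis pursuit: min ||x||_1 s.t. Ax = y, as an LP with the split x = u - v.
    res = linprog(c=np.ones(2 * n), A_eq=np.hstack([A, -A]), b_eq=y,
                  bounds=(0, None), method="highs")
    x = res.x[:n] - res.x[n:]
    # The paper's occlusion-robust variant augments the dictionary to [A, I],
    # letting a sparse error term absorb corrupted pixels.
    # Decide by the smallest class-wise reconstruction residual.
    classes = np.unique(labels)
    residuals = [np.linalg.norm(y - A[:, labels == c] @ x[labels == c])
                 for c in classes]
    return classes[int(np.argmin(residuals))]
```
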
  • Natural Image Statistics and Low-Complexity Feature Selection

    Publication Year: 2009 , Page(s): 228 - 244
    Cited by:  Papers (12)

    Low-complexity feature selection is analyzed in the context of visual recognition. It is hypothesized that high-order dependences of bandpass features contain little information for discrimination of natural images. This hypothesis is characterized formally by the introduction of the concepts of conjunctive interference and decomposability order of a feature set. Necessary and sufficient conditions for the feasibility of low-complexity feature selection are then derived in terms of these concepts. It is shown that the intrinsic complexity of feature selection is determined by the decomposability order of the feature set and not its dimension. Feature selection algorithms are then derived for all levels of complexity and are shown to be approximated by existing information-theoretic methods, which they consistently outperform. The new algorithms are also used to objectively test the hypothesis of low decomposability order through comparison of classification performance. It is shown that, for image classification, the gain of modeling feature dependencies has strongly diminishing returns: best results are obtained under the assumption of decomposability order 1. This suggests a generic law for bandpass features extracted from natural images: that the effect, on the dependence of any two features, of observing any other feature is constant across image classes.

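    The order-1 regime described above amounts to ignoring inter-feature dependencies when scoring features. A hedged illustration of that regime (not the paper's exact algorithm) is plain marginal mutual-information ranking; the histogram bin count and integer class labels are assumptions.

```python
import numpy as np

def mutual_information(f, y, bins=16):
    """I(F; Y) in nats for one real feature f and integer labels y,
    estimated from a joint histogram."""
    joint, _, _ = np.histogram2d(f, y, bins=(bins, np.unique(y).size))
    p = joint / joint.sum()
    pf, py = p.sum(axis=1, keepdims=True), p.sum(axis=0, keepdims=True)
    nz = p > 0
    return float((p[nz] * np.log(p[nz] / (pf @ py)[nz])).sum())

def select_order1(X, y, k):
    """Order-1 selection: rank features by marginal MI with the class label,
    deliberately ignoring dependencies between features."""
    scores = [mutual_information(X[:, j], y) for j in range(X.shape[1])]
    return np.argsort(scores)[::-1][:k]
```
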
  • An Analysis of Ensemble Pruning Techniques Based on Ordered Aggregation

    Publication Year: 2009 , Page(s): 245 - 259
    Cited by:  Papers (16)

    Several pruning strategies that can be used to reduce the size and increase the accuracy of bagging ensembles are analyzed. These heuristics select subsets of complementary classifiers that, when combined, can perform better than the whole ensemble. The pruning methods investigated are based on modifying the order of aggregation of classifiers in the ensemble. In the original bagging algorithm, the order of aggregation is left unspecified. When this order is random, the generalization error typically decreases as the number of classifiers in the ensemble increases. If an appropriate ordering for the aggregation process is devised, the generalization error reaches a minimum at intermediate numbers of classifiers. This minimum lies below the asymptotic error of bagging. Pruned ensembles are obtained by retaining a fraction of the classifiers in the ordered ensemble. The performance of these pruned ensembles is evaluated in several benchmark classification tasks under different training conditions. The results of this empirical investigation show that ordered aggregation can be used for the efficient generation of pruned ensembles that are competitive, in terms of performance and robustness of classification, with computationally more costly methods that directly select optimal or near-optimal subensembles.

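    A sketch of one ordering heuristic in the family the paper analyzes, reduce-error ordering: classifiers are appended greedily so that the majority vote of the growing subensemble maximizes accuracy on a selection set, then a fraction of the head of the ordering is retained. sklearn-style classifiers with a `.predict` method are an assumption.

```python
import numpy as np

def ordered_aggregation(classifiers, X_sel, y_sel, keep_fraction=0.2):
    """Reduce-error ordering of a parallel ensemble, then head retention."""
    classes = np.unique(y_sel)
    preds = np.array([clf.predict(X_sel) for clf in classifiers])   # (T, n)
    votes = np.zeros((classes.size, len(y_sel)))
    order, remaining = [], list(range(len(classifiers)))
    while remaining:
        best, best_acc, best_votes = None, -1.0, None
        for t in remaining:
            # Vote tally if classifier t were appended next.
            v = votes + (preds[t][None, :] == classes[:, None])
            acc = (classes[v.argmax(axis=0)] == y_sel).mean()
            if acc > best_acc:
                best, best_acc, best_votes = t, acc, v
        order.append(best)
        votes = best_votes
        remaining.remove(best)
    n_keep = max(1, int(keep_fraction * len(classifiers)))
    return [classifiers[t] for t in order[:n_keep]]
```
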
  • Geometric Mean for Subspace Selection

    Publication Year: 2009 , Page(s): 260 - 274
    Cited by:  Papers (77)

    Subspace selection approaches are powerful tools in pattern classification and data visualization. One of the most important subspace approaches is the linear dimensionality reduction step in Fisher's linear discriminant analysis (FLDA), which has been successfully employed in many fields such as biometrics, bioinformatics, and multimedia information management. However, the linear dimensionality reduction step in FLDA has a critical drawback: for a classification task with c classes, if the dimension of the projected subspace is strictly lower than c - 1, the projection to a subspace tends to merge those classes that are close together in the original feature space. If separate classes are sampled from Gaussian distributions, all with identical covariance matrices, then the linear dimensionality reduction step in FLDA maximizes the mean value of the Kullback-Leibler (KL) divergences between different classes. Based on this viewpoint, the geometric mean for subspace selection is studied in this paper. Three criteria are analyzed: 1) maximization of the geometric mean of the KL divergences, 2) maximization of the geometric mean of the normalized KL divergences, and 3) the combination of 1 and 2. Preliminary experimental results based on synthetic data, the UCI Machine Learning Repository, and handwritten digits show that the third criterion is a potential discriminative subspace selection method, which significantly reduces the class separation problem compared with the linear dimensionality reduction step in FLDA and several of its representative extensions.

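    A sketch of criterion 1 as an objective one could evaluate (and, for instance, ascend by gradient methods) for a candidate projection W, under the abstract's homoscedastic Gaussian assumption; the closed form used for each pairwise KL divergence is the standard equal-covariance case. This illustrates the objective only, not the paper's optimization procedure.

```python
import numpy as np

def mean_log_kl(W, means, Sigma):
    """Mean log pairwise KL divergence after projecting with W (d x k);
    maximizing it maximizes the geometric mean of the KL divergences.
    Classes are assumed Gaussian with shared covariance Sigma, for which
    KL(i || j) = 0.5 * d' Sigma^{-1} d, with d the class-mean difference.
    In the projected space this becomes 0.5 * d' W (W' Sigma W)^{-1} W' d."""
    P = W @ np.linalg.inv(W.T @ Sigma @ W) @ W.T
    logs = [np.log(0.5 * (mi - mj) @ P @ (mi - mj))
            for i, mi in enumerate(means) for mj in means[i + 1:]]
    return float(np.mean(logs))
```
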
  • Semisupervised Learning of Hidden Markov Models via a Homotopy Method

    Publication Year: 2009 , Page(s): 275 - 287
    Cited by:  Papers (3)

    Hidden Markov model (HMM) classifier design is considered for the analysis of sequential data, incorporating both labeled and unlabeled data for training; the balance between the use of labeled and unlabeled data is controlled by an allocation parameter λ ∈ [0, 1], where λ = 0 corresponds to purely supervised HMM learning (based only on the labeled data) and λ = 1 corresponds to unsupervised HMM-based clustering (based only on the unlabeled data). The associated estimation problem can typically be reduced to solving a set of fixed-point equations in the form of a "natural-parameter homotopy." This paper applies a homotopy method to track a continuous path of solutions, starting from a local supervised solution (λ = 0) to a local unsupervised solution (λ = 1). The homotopy method is guaranteed to track with probability one from λ = 0 to λ = 1 if the λ = 0 solution is unique; this condition is not satisfied for the HMM, since the maximum likelihood supervised solution (λ = 0) is characterized by many local optima. A modified form of the homotopy map for HMMs assures a track from λ = 0 to λ = 1. Following this track leads to a formulation for selecting λ ∈ (0, 1) for a semisupervised solution, and it also provides a tool for selection from among multiple locally optimal supervised solutions. The results of applying the proposed method to measured and synthetic sequential data verify its robustness and feasibility compared to the conventional EM approach for semisupervised HMM training.

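    The HMM machinery is too long to sketch here, but the λ-blended objective and the warm-started path-following idea can be illustrated on a toy model. The sketch below uses two scalar Gaussian classes with unit variance instead of HMMs, so it shows only the continuation from λ = 0 to λ = 1, not the paper's modified homotopy map.

```python
import numpy as np

def blended_em_path(Xl, yl, Xu, lambdas=np.linspace(0.0, 1.0, 21), iters=50):
    """Follow solutions of the lambda-blended objective from the supervised
    end (lambda = 0) to the unsupervised end (lambda = 1), warm-starting EM
    at each step from the previous solution. Toy model: two scalar Gaussian
    classes with unit variance; labels yl are 0/1."""
    mus = np.array([Xl[yl == c].mean() for c in (0, 1)])   # supervised start
    path = []
    for lam in lambdas:
        for _ in range(iters):
            # E-step on the unlabeled data: class responsibilities.
            d = -0.5 * (Xu[:, None] - mus[None, :]) ** 2
            r = np.exp(d - d.max(axis=1, keepdims=True))
            r /= r.sum(axis=1, keepdims=True)
            # M-step: labeled sums weighted (1 - lam), unlabeled sums lam.
            for c in (0, 1):
                num = (1 - lam) * Xl[yl == c].sum() + lam * (r[:, c] * Xu).sum()
                den = (1 - lam) * (yl == c).sum() + lam * r[:, c].sum()
                mus[c] = num / max(den, 1e-12)
        path.append((lam, mus.copy()))
    return path
```
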
  • Outlier Detection with the Kernelized Spatial Depth Function

    Publication Year: 2009 , Page(s): 288 - 305
    Cited by:  Papers (8)

    Statistical depth functions provide a center-outward ordering of multidimensional data from the deepest point. In this sense, depth functions can measure the extremeness or outlyingness of a data point with respect to a given data set. Hence, they can detect outliers: observations that appear extreme relative to the rest of the observations. Of the various statistical depths, the spatial depth is especially appealing because of its computational efficiency and mathematical tractability. In this article, we propose a novel statistical depth, the kernelized spatial depth (KSD), which generalizes the spatial depth via positive definite kernels. By choosing a proper kernel, the KSD can capture the local structure of a data set where the spatial depth fails. We demonstrate this on half-moon data and ring-shaped data. Based on the KSD, we propose a novel outlier detection algorithm, by which an observation with a depth value less than a threshold is declared an outlier. The proposed algorithm is simple in structure: the threshold is the only parameter for a given kernel. It applies to a one-class learning setting, in which normal observations are given as the training data, as well as to a missing label scenario, where the training set consists of a mixture of normal observations and outliers with unknown labels. We give upper bounds on the false alarm probability of a depth-based detector. These upper bounds can be used to determine the threshold. We perform extensive experiments on synthetic data and data sets from real applications. The proposed outlier detector is compared with existing methods. The KSD outlier detector demonstrates competitive performance.

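    A direct sketch of the sample KSD following the definition above: the spatial depth of x computed in the feature space induced by a positive definite kernel. It is O(n²) in kernel evaluations, points coinciding with x are dropped (the unit vector is undefined there), and the RBF kernel with unit bandwidth is a placeholder choice.

```python
import numpy as np

def kernelized_spatial_depth(x, X, kernel):
    """Sample kernelized spatial depth KSD(x; X)."""
    kxx = kernel(x, x)
    kxy = np.array([kernel(x, y) for y in X])
    kyy = np.array([kernel(y, y) for y in X])
    # delta[i] = || phi(x) - phi(X[i]) || in the kernel feature space.
    delta = np.sqrt(np.maximum(kxx + kyy - 2.0 * kxy, 0.0))
    idx = np.where(delta > 1e-12)[0]          # skip points coinciding with x
    total = 0.0
    for i in idx:
        for j in idx:
            # <phi(x)-phi(X[i]), phi(x)-phi(X[j])> via the kernel trick.
            inner = kxx - kxy[i] - kxy[j] + kernel(X[i], X[j])
            total += inner / (delta[i] * delta[j])
    return 1.0 - np.sqrt(max(total, 0.0)) / len(X)

# Example kernel (bandwidth is a placeholder choice):
rbf = lambda a, b: np.exp(-np.sum((np.asarray(a) - np.asarray(b)) ** 2) / 2.0)
```

    Detection then amounts to flagging any observation whose depth falls below the chosen threshold.
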
  • Time Warp Edit Distance with Stiffness Adjustment for Time Series Matching

    Publication Year: 2009 , Page(s): 306 - 318
    Cited by:  Papers (13)

    In a way similar to the string-to-string correction problem, we address discrete time series similarity in light of a time-series-to-time-series correction problem, for which the similarity between two time series is measured as the minimum cost sequence of edit operations needed to transform one time series into another. To define the edit operations, we use the paradigm of a graphical editing process and end up with a dynamic programming algorithm that we call time warp edit distance (TWED). TWED is slightly different in form from dynamic time warping (DTW), longest common subsequence (LCSS), or edit distance with real penalty (ERP) algorithms. In particular, it highlights a parameter that controls a kind of stiffness of the elastic measure along the time axis. We show that the similarity provided by TWED is a potentially useful metric in time series retrieval applications, since it could benefit from the triangle inequality property to speed up the retrieval process while tuning the parameters of the elastic measure. In that context, a lower bound is derived to link the matching of time series in downsampled representation spaces to the matching in the original space. The empirical quality of the TWED distance is evaluated on a simple classification task. Compared to edit distance, DTW, LCSS, and ERP, TWED has proved to be quite effective on the considered experimental task.

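    A sketch of the TWED recurrence for scalar series, as commonly stated for this distance: ν is the stiffness along the time axis, λ the constant delete penalty, and the default values here are illustrative rather than the paper's.

```python
import numpy as np

def twed(a, ta, b, tb, nu=0.001, lam=1.0):
    """Time warp edit distance for scalar series a, b with timestamps ta, tb.
    Both series are padded with a zero sample at time zero."""
    a  = np.concatenate(([0.0], np.asarray(a,  float)))
    b  = np.concatenate(([0.0], np.asarray(b,  float)))
    ta = np.concatenate(([0.0], np.asarray(ta, float)))
    tb = np.concatenate(([0.0], np.asarray(tb, float)))
    D = np.full((len(a), len(b)), np.inf)
    D[0, 0] = 0.0
    for i in range(1, len(a)):
        for j in range(1, len(b)):
            # Match a[i] with b[j] / delete in a / delete in b.
            match = (D[i-1, j-1] + abs(a[i] - b[j]) + abs(a[i-1] - b[j-1])
                     + nu * (abs(ta[i] - tb[j]) + abs(ta[i-1] - tb[j-1])))
            del_a = D[i-1, j] + abs(a[i] - a[i-1]) + nu * (ta[i] - ta[i-1]) + lam
            del_b = D[i, j-1] + abs(b[j] - b[j-1]) + nu * (tb[j] - tb[j-1]) + lam
            D[i, j] = min(match, del_a, del_b)
    return D[-1, -1]
```
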
  • Framework for Performance Evaluation of Face, Text, and Vehicle Detection and Tracking in Video: Data, Metrics, and Protocol

    Publication Year: 2009 , Page(s): 319 - 336
    Cited by:  Papers (81)  |  Patents (2)

    Common benchmark data sets, standardized performance metrics, and baseline algorithms have demonstrated considerable impact on research and development in a variety of application domains. These resources provide both consumers and developers of technology with a common framework to objectively compare the performance of different algorithms and algorithmic improvements. In this paper, we present such a framework for evaluating object detection and tracking in video: specifically for face, text, and vehicle objects. This framework includes the source video data, ground-truth annotations (along with guidelines for annotation), performance metrics, evaluation protocols, and tools including scoring software and baseline algorithms. For each detection and tracking task and supported domain, we developed a 50-clip training set and a 50-clip test set. Each data clip is approximately 2.5 minutes long and has been completely spatially/temporally annotated at the I-frame level. Each task/domain, therefore, has an associated annotated corpus of approximately 450,000 frames. The scope of such annotation is unprecedented and was designed to begin to support the necessary quantities of data for robust machine learning approaches, as well as a statistically significant comparison of the performance of algorithms. The goal of this work was to systematically address the challenges of object detection and tracking through a common evaluation framework that permits a meaningful objective comparison of techniques, provides the research community with sufficient data for the exploration of automatic modeling techniques, encourages the incorporation of objective evaluation into the development process, and contributes useful lasting resources of a scale and magnitude that will prove to be extremely useful to the computer vision research community for years to come.

  • Information Geometry for Landmark Shape Analysis: Unifying Shape Representation and Deformation

    Publication Year: 2009 , Page(s): 337 - 350
    Cited by:  Papers (2)

    Shape matching plays a prominent role in the comparison of similar structures. We present a unifying framework for shape matching that uses mixture models to couple both the shape representation and deformation. The theoretical foundation is drawn from information geometry, wherein information matrices are used to establish intrinsic distances between parametric densities. When a parameterized probability density function is used to represent a landmark-based shape, the modes of deformation are automatically established through the information matrix of the density. We first show that given two shapes parameterized by Gaussian mixture models (GMMs), the well-known Fisher information matrix of the mixture model is also a Riemannian metric (actually, the Fisher-Rao Riemannian metric) and can therefore be used for computing shape geodesics. The Fisher-Rao metric has the advantage of being an intrinsic metric and invariant to reparameterization. The geodesic, computed using this metric, establishes an intrinsic deformation between the shapes, thus unifying both shape representation and deformation. A fundamental drawback of the Fisher-Rao metric is that it is not available in closed form for the GMM. Consequently, shape comparisons are computationally very expensive. To address this, we develop a new Riemannian metric based on generalized phi-entropy measures. In sharp contrast to the Fisher-Rao metric, the new metric is available in closed form. Geodesic computations using the new metric are considerably more efficient. We validate the performance and discriminative capabilities of these new information geometry-based metrics by pairwise matching of corpus callosum shapes. We also study the deformations of fish shapes that have various topological properties. A comprehensive comparative analysis is also provided using other landmark-based distances, including the Hausdorff distance, the Procrustes metric, landmark-based diffeomorphisms, and the bending energies of the thin-plate (TPS) and Wendland splines.

  • Principal Angles Separate Subject Illumination Spaces in YDB and CMU-PIE

    Publication Year: 2009 , Page(s): 351 - 356
    Cited by:  Papers (5)

    The theory of illumination subspaces is well developed and has been tested extensively on the Yale Face Database B (YDB) and CMU-PIE (PIE) data sets. This paper shows that if face recognition under varying illumination is cast as a problem of matching sets of images to sets of images, then the minimal principal angle between subspaces is sufficient to perfectly separate matching pairs of image sets from nonmatching pairs of image sets sampled from YDB and PIE. This is true even for subspaces estimated from as few as six images and when one of the subspaces is estimated from as few as three images if the second subspace is estimated from a larger set (10 or more). This suggests that variation under illumination may be thought of as useful discriminating information rather than unwanted noise.

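    The minimal principal angle used above can be computed from the SVD of the product of orthonormal bases for the two image sets (scipy.linalg.subspace_angles performs the same computation). A short sketch, assuming each set's images are vectorized into the columns of A and B; matching then reduces to thresholding this angle.

```python
import numpy as np

def min_principal_angle(A, B):
    """Minimal principal angle (radians) between the column spaces of A and
    B: the arccos of the largest singular value of Qa' Qb."""
    Qa, _ = np.linalg.qr(A)          # orthonormal basis for span(A)
    Qb, _ = np.linalg.qr(B)          # orthonormal basis for span(B)
    s = np.linalg.svd(Qa.T @ Qb, compute_uv=False)
    return float(np.arccos(np.clip(s.max(), -1.0, 1.0)))
```
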
  • High-Precision Boundary Length Estimation by Utilizing Gray-Level Information

    Publication Year: 2009 , Page(s): 357 - 363
    Cited by:  Papers (5)

    We present a novel method that provides an accurate and precise estimate of the length of the boundary (perimeter) of an object by taking into account gray levels on the boundary of the digitization of the same object. Assuming a model where pixel intensity is proportional to the coverage of a pixel, we show that the presented method provides error-free measurements of the length of straight boundary segments in the case of nonquantized pixel values. For a more realistic situation, where pixel values are quantized, we derive optimal estimates that minimize the maximal estimation error. We show that the estimate converges toward a correct value as the number of gray levels tends toward infinity. The method is easy to implement; we provide the complete pseudocode. Since the method utilizes only a small neighborhood, it is very easy to parallelize. We evaluate the estimator on a set of concave and convex shapes with known perimeters, digitized at increasing resolution. In addition, we provide an example of applicability of the method on real images, by suggesting appropriate preprocessing steps and presenting results of a comparison of the suggested method with other local approaches.

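    The paper's optimal estimator is derived from the pixel-coverage model above; as a hedged baseline that exploits the same gray-level cue but is not the paper's method, one can trace the subpixel 0.5-isocontour of the coverage image by marching squares and sum the polyline length.

```python
import numpy as np
from skimage import measure

def perimeter_from_coverage(coverage, level=0.5):
    """Perimeter estimate from a gray-level coverage image: trace the
    subpixel `level` isocontour (marching squares with linear interpolation)
    and sum the lengths of the resulting polyline segments."""
    total = 0.0
    for contour in measure.find_contours(coverage, level):
        seg = np.diff(contour, axis=0)          # consecutive-vertex offsets
        total += np.sqrt((seg ** 2).sum(axis=1)).sum()
    return total
```
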
  • Statistical Instance-Based Pruning in Ensembles of Independent Classifiers

    Publication Year: 2009 , Page(s): 364 - 369
    Cited by:  Papers (5)

    The global prediction of a homogeneous ensemble of classifiers generated in independent applications of a randomized learning algorithm on a fixed training set is analyzed within a Bayesian framework. Assuming that majority voting is used, it is possible to estimate with a given confidence level the prediction of the complete ensemble by querying only a subset of classifiers. For a particular instance that needs to be classified, the polling of ensemble classifiers can be halted when the probability that the predicted class will not change when taking into account the remaining votes is above the specified confidence level. Experiments on a collection of benchmark classification problems using representative parallel ensembles, such as bagging and random forests, confirm the validity of the analysis and demonstrate the effectiveness of the instance-based ensemble pruning method proposed.

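    A two-class sketch of the halting rule: votes are treated as exchangeable under a uniform Dirichlet(1, 1) prior, so the number of future votes for the current leader is beta-binomial, and polling stops once the leader's majority is retained with the desired confidence. Binary sklearn-style classifiers with 0/1 labels are an assumption; the paper develops the general multiclass case.

```python
import math
from scipy.stats import betabinom

def poll_until_confident(classifiers, x, alpha=0.99):
    """Query classifiers one at a time and halt once the current majority
    prediction is retained with probability >= alpha."""
    T, v = len(classifiers), [0, 0]
    for t, clf in enumerate(classifiers, start=1):
        v[int(clf.predict([x])[0])] += 1
        r = T - t                                # classifiers not yet queried
        lead, trail = max(v), min(v)
        # The leader keeps the majority iff it receives more than
        # (trail - lead + r) / 2 of the r remaining votes.
        need = (trail - lead + r) / 2.0
        p_keep = 1.0 - betabinom(r, lead + 1, trail + 1).cdf(math.floor(need))
        if p_keep >= alpha:
            break
    return int(v[1] > v[0])
```
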
  • Camera Displacement via Constrained Minimization of the Algebraic Error

    Publication Year: 2009 , Page(s): 370 - 375
    Cited by:  Papers (14)

    This paper proposes a new approach to estimate the camera displacement of stereo vision systems via minimization of the algebraic error over the essential matrices manifold. The proposed approach is based on the use of homogeneous forms and linear matrix inequality (LMI) optimizations, and has the advantages of not presenting local minima and not introducing approximations of nonlinear terms. Numerical investigations carried out with both synthetic and real data show that the proposed approach provides significantly better results than SVD methods as well as minimizations of the algebraic error over the essential matrices manifold via both gradient descent and simplex search algorithms.

  • High-Accuracy and Robust Localization of Large Control Markers for Geometric Camera Calibration

    Publication Year: 2009 , Page(s): 376 - 383
    Cited by:  Papers (3)

    Accurate measurement of the position of features in an image is subject to a fundamental compromise: The features must be both small, to limit the effect of nonlinear distortions, and large, to limit the effect of noise and discretization. This constrains both the accuracy and the robustness of image measurements, which play an important role in geometric camera calibration as well as in all subsequent measurements based on that calibration. In this paper, we present a new geometric camera calibration technique that exploits the complete camera model during the localization of control markers, thereby abolishing the marker size compromise. Large markers allow a dense pattern to be used instead of a simple disc, resulting in a significant increase in accuracy and robustness. When highly planar markers are used, geometric camera calibration based on synthetic images leads to true errors of 0.002 pixels, even in the presence of artifacts such as noise, illumination gradients, compression, blurring, and limited dynamic range. The camera parameters are also accurately recovered, even for complex camera models.

  • Join the IEEE Computer Society [advertisement]

    Publication Year: 2009 , Page(s): 384
  • TPAMI Information for authors

    Publication Year: 2009 , Page(s): c3
  • [Back cover]

    Publication Year: 2009 , Page(s): c4

Aims & Scope

The IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI) is published monthly. Its editorial board strives to present the most important research results in areas within TPAMI's scope.


Meet Our Editors

Editor-in-Chief
David A. Forsyth
University of Illinois