Pattern Analysis and Machine Intelligence, IEEE Transactions on

Issue 7 • Date July 2007

  • [Front cover]

    Publication Year: 2007 , Page(s): c1
    Freely Available from IEEE
  • [Inside front cover]

    Publication Year: 2007 , Page(s): c2
    Freely Available from IEEE
  • Active Shape Models with Invariant Optimal Features: Application to Facial Analysis

    Publication Year: 2007 , Page(s): 1105 - 1117
    Cited by:  Papers (15)  |  Patents (1)

    This work is framed in the field of statistical face analysis. In particular, the problem of accurate segmentation of prominent features of the face in frontal view images is addressed. We propose a method that generalizes linear active shape models (ASMs), which have already been used for this task. The technique is built upon the development of a nonlinear intensity model, incorporating a reduced set of differential invariant features as local image descriptors. These features are invariant to rigid transformations, and a subset of them is chosen by sequential feature selection for each landmark and resolution level. The new approach overcomes the unimodality and Gaussianity assumptions of classical ASMs regarding the distribution of the intensity values across the training set. Our methodology has demonstrated a significant improvement in segmentation precision as compared to the linear ASM and optimal features ASM (a nonlinear extension of the pioneering algorithm) in the tests performed on the AR, XM2VTS, and EQUINOX databases.
  • Value-Directed Human Behavior Analysis from Video Using Partially Observable Markov Decision Processes

    Publication Year: 2007 , Page(s): 1118 - 1132
    Cited by:  Papers (3)

    This paper presents a method for learning decision theoretic models of human behaviors from video data. Our system learns relationships between the movements of a person, the context in which they are acting, and a utility function. This learning makes explicit that the meaning of a behavior to an observer is contained in its relationship to actions and outcomes. An agent wishing to capitalize on these relationships must learn to distinguish the behaviors according to how they help the agent to maximize utility. The model we use is a partially observable Markov decision process, or POMDP. The video observations are integrated into the POMDP using a dynamic Bayesian network that creates spatial and temporal abstractions amenable to decision making at the high level. The parameters of the model are learned from training data using an a posteriori constrained optimization technique based on the expectation-maximization algorithm. The system automatically discovers classes of behaviors and determines which are important for choosing actions that optimize over the utility of possible outcomes. This type of learning obviates the need for labeled data from expert knowledge about which behaviors are significant and removes bias about what behaviors may be useful to recognize in a particular situation. We show results in three interactions: a single player imitation game, a gestural robotic control problem, and a card game played by two people.
  • Learning and Removing Cast Shadows through a Multidistribution Approach

    Publication Year: 2007 , Page(s): 1133 - 1146
    Cited by:  Papers (37)

    Moving cast shadows are a major concern for foreground detection algorithms. The processing of foreground images in surveillance applications typically requires that such shadows be identified and removed from the detected foreground. This paper presents a novel pixel-based statistical approach to model moving cast shadows of nonuniform and varying intensity. This approach uses the Gaussian mixture model (GMM) learning ability to build statistical models describing moving cast shadows on surfaces. This statistical modeling can deal with scenes with complex and time-varying illumination, including light saturated areas, and prevent false detection in regions where shadows cannot be detected. The proposed approach can be used with pixel-based descriptions of shadowed surfaces found in the literature. It significantly reduces their false detection rate without increasing the missed detection rate. Results obtained with different scene types and shadow models show the robustness of the approach.
  • Robust Image Segmentation Using Resampling and Shape Constraints

    Publication Year: 2007 , Page(s): 1147 - 1164
    Cited by:  Papers (4)

    Automated segmentation of images has been considered an important intermediate processing task to extract semantic meaning from pixels. We propose an integrated approach for image segmentation based on a generative clustering model combined with coarse shape information and robust parameter estimation. The sensitivity of segmentation solutions to image variations is measured by image resampling. Shape information is included in the inference process to guide ambiguous groupings of color and texture features. Shape and similarity-based grouping information is combined into a semantic likelihood map in the framework of Bayesian statistics. Experimental evidence shows that semantically meaningful segments are inferred even when image data alone gives rise to ambiguous segmentations.
  • A Linear Programming Approach to Max-Sum Problem: A Review

    Publication Year: 2007 , Page(s): 1165 - 1179
    Cited by:  Papers (47)

    The max-sum labeling problem, defined as maximizing a sum of binary (i.e., pairwise) functions of discrete variables, is a general NP-hard optimization problem with many applications, such as computing the MAP configuration of a Markov random field. We review a not widely known approach to the problem, developed by Ukrainian researchers Schlesinger et al. in 1976, and show how it contributes to recent results, most importantly, those on the convex combination of trees and tree-reweighted max-product. In particular, we review Schlesinger et al.'s upper bound on the max-sum criterion, its minimization by equivalent transformations, its relation to the constraint satisfaction problem, the fact that this minimization is dual to a linear programming relaxation of the original problem, and the three kinds of consistency necessary for optimality of the upper bound. We revisit problems with Boolean variables and supermodular problems. We describe two algorithms for decreasing the upper bound. We present an example application for structural image analysis.
  • Algorithmic Differentiation: Application to Variational Problems in Computer Vision

    Publication Year: 2007 , Page(s): 1180 - 1193
    Cited by:  Papers (1)

    Many vision problems can be formulated as minimization of appropriate energy functionals. These energy functionals are usually minimized based on the calculus of variations (Euler-Lagrange equation). Once the Euler-Lagrange equation has been determined, it needs to be discretized in order to implement it on a digital computer. This is not a trivial task and is, moreover, error-prone. In this paper, we propose a flexible alternative. We discretize the energy functional and, subsequently, apply the mathematical concept of algorithmic differentiation to directly derive algorithms that implement the energy functional's derivatives. This approach has several advantages: First, the computed derivatives are exact with respect to the implementation of the energy functional. Second, it is basically straightforward to compute second-order derivatives and, thus, the Hessian matrix of the energy functional. Third, algorithmic differentiation is a process which can be automated. We demonstrate this novel approach on three representative vision problems (namely, denoising, segmentation, and stereo) and show that state-of-the-art results are obtained with little effort.
  • Weighted Minimal Hypersurface Reconstruction

    Publication Year: 2007 , Page(s): 1194 - 1208
    Cited by:  Papers (5)  |  Patents (1)

    Many problems in computer vision can be formulated as a minimization problem for an energy functional. If this functional is given as an integral of a scalar-valued weight function over an unknown hypersurface, then the sought-after minimal surface can be determined as a solution of the functional's Euler-Lagrange equation. This paper deals with a general class of weight functions that may depend on surface point coordinates as well as surface orientation. We derive the Euler-Lagrange equation in arbitrary dimensional space without the need for any surface parameterization, generalizing existing proofs. Our work opens up the possibility of solving problems involving minimal hypersurfaces in a dimension higher than three, which were previously impossible to solve in practice. We also introduce two applications of our new framework: We show how to reconstruct temporally coherent geometry from multiple video streams, and we use the same framework for the volumetric reconstruction of refractive and transparent natural phenomena, bodies of flowing water.
  • Conformal Geometry and Its Applications on 3D Shape Matching, Recognition, and Stitching

    Publication Year: 2007 , Page(s): 1209 - 1220
    Cited by:  Papers (19)  |  Patents (3)

    Three-dimensional shape matching is a fundamental issue in computer vision with many applications such as shape registration, 3D object recognition, and classification. However, shape matching with noise, occlusion, and clutter is a challenging problem. In this paper, we analyze a family of quasi-conformal maps including harmonic maps, conformal maps, and least-squares conformal maps with regard to 3D shape matching. As a result, we propose a novel and computationally efficient shape matching framework by using least-squares conformal maps. According to conformal geometry theory, each 3D surface with disk topology can be mapped to a 2D domain through a global optimization and the resulting map is a diffeomorphism, i.e., one-to-one and onto. This allows us to simplify the 3D shape-matching problem to a 2D image-matching problem, by comparing the resulting 2D parametric maps, which are stable, insensitive to resolution changes, and robust to occlusion and noise. Therefore, highly accurate and efficient 3D shape matching algorithms can be achieved by using the above three parametric maps. Finally, the robustness of least-squares conformal maps is evaluated and analyzed comprehensively in 3D shape matching with occlusion, noise, and resolution variation. In order to further demonstrate the performance of our proposed method, we also conduct a series of experiments on two computer vision applications, i.e., 3D face recognition and 3D nonrigid surface alignment and stitching.
  • An Approximate and Efficient Method for Optimal Rotation Alignment of 3D Models

    Publication Year: 2007 , Page(s): 1221 - 1229
    Cited by:  Papers (4)

    In many shape analysis applications, the ability to find the best rotation that aligns two models is an essential first step in the analysis process. In the past, methods for model alignment have either used normalization techniques, such as PCA alignment, or have performed an exhaustive search over the space of rotations to find the optimal alignment. While normalization techniques have the advantage of efficiency, providing a quick method for registering two shapes, they are often imprecise and can give rise to poor alignments. Conversely, exhaustive search is guaranteed to provide the correct answer, but, even using efficient signal processing techniques, this type of approach can be prohibitively slow. In this paper, we present a new method for aligning two 3D shapes. We show that the method is markedly faster than existing approaches based on efficient signal processing and we provide registration results demonstrating that the alignments obtained using our method have a high degree of precision and are markedly better than those obtained using normalization.
  • A Two-Level Generative Model for Cloth Representation and Shape from Shading

    Publication Year: 2007 , Page(s): 1230 - 1243
    Cited by:  Papers (1)

    In this paper, we present a two-level generative model for representing the images and surface depth maps of drapery and clothes. The upper level consists of a number of folds which will generate the high contrast (ridge) areas with a dictionary of shading primitives (for 2D images) and fold primitives (for 3D depth maps). These primitives are represented in parametric forms and are learned in a supervised learning phase using 3D surfaces of clothes acquired through photometric stereo. The lower level consists of the remaining flat areas which fill between the folds with a smoothness prior (Markov random field). We show that the classical ill-posed problem, shape from shading (SFS), can be much improved by this two-level model for its reduced dimensionality and incorporation of middle-level visual knowledge, i.e., the dictionary of primitives. Given an input image, we first infer the folds and compute a sketch graph using a sketch pursuit algorithm as in the primal sketch (Guo et al., 2003). The 3D folds are estimated by parameter fitting using the fold dictionary and they form the "skeleton" of the drapery/cloth surfaces. Then, the lower level is computed by a conventional SFS method using the fold areas as boundary conditions. The two levels interact at the final stage by optimizing a joint Bayesian posterior probability on the depth map. We show a number of experiments which demonstrate more robust results in comparison with state-of-the-art work. In a broader scope, our representation can be viewed as a two-level inhomogeneous MRF model which is applicable to general shape-from-X problems. Our study is an attempt to revisit Marr's idea (Marr and Freeman, 1982) of computing the 2½D sketch from the primal sketch. In a companion paper (Barbu and Zhu, 2005), we study shape from stereo based on a similar two-level generative sketch representation.
  • Extraction and Analysis of Multiple Periodic Motions in Video Sequences

    Publication Year: 2007 , Page(s): 1244 - 1261
    Cited by:  Papers (13)

    The analysis of periodic or repetitive motions is useful in many applications, such as the recognition and classification of human and animal activities. Existing methods for the analysis of periodic motions first extract motion trajectories using spatial information and then determine if they are periodic. These approaches are mostly based on feature matching or spatial correlation, which are often infeasible, unreliable, or computationally demanding. In this paper, we present a new approach, based on the time-frequency analysis of the video sequence as a whole. Multiple periodic trajectories are extracted and their periods are estimated simultaneously. The objects that are moving in a periodic manner are extracted using the spatial domain information. Experiments with synthetic and real sequences display the capabilities of this approach.
  • On the Dimensionality of Face Space

    Publication Year: 2007 , Page(s): 1262 - 1267
    Cited by:  Papers (2)

    The dimensionality of face space is measured objectively in a psychophysical study. Within this framework, we obtain a measurement of the dimension for the human visual system. Using an eigenface basis, evidence is presented that talented human observers are able to identify familiar faces that lie in a space of roughly 100 dimensions and the average observer requires a space of between 100 and 200 dimensions. This is below most current estimates. It is further argued that these estimates give an upper bound for face space dimension and this might be lowered by better constructed "eigenfaces" and by talented observers.
  • Multiple Collaborative Kernel Tracking

    Publication Year: 2007 , Page(s): 1268 - 1273
    Cited by:  Papers (10)

    Those motion parameters that cannot be recovered from image measurements are unobservable in the visual dynamic system. This paper studies this important issue of singularity in the context of kernel-based tracking and presents a novel approach that is based on a motion field representation which employs redundant but sparsely correlated local motion parameters instead of compact but uncorrelated global ones. This approach makes it easy to design fully observable kernel-based motion estimators. This paper shows that these high-dimensional motion fields can be estimated efficiently by the collaboration among a set of simpler local kernel-based motion estimators, which makes the new approach very practical.
  • Minimizing Nonsubmodular Functions with Graph Cuts - A Review

    Publication Year: 2007 , Page(s): 1274 - 1279
    Cited by:  Papers (53)  |  Patents (1)

    Optimization techniques based on graph cuts have become a standard tool for many vision applications. These techniques allow efficient minimization of certain energy functions corresponding to pairwise Markov random fields (MRFs). Currently, there is an accepted view within the computer vision community that graph cuts can only be used for optimizing a limited class of MRF energies (e.g., submodular functions). In this survey, we review some results that show that graph cuts can be applied to a much larger class of energy functions (in particular, nonsubmodular functions). While these results are well known in the optimization community, to our knowledge they have not been used in the context of computer vision and MRF optimization. We demonstrate the relevance of these results to vision on the problem of binary texture restoration.
  • Analytical Results on Style-Constrained Bayesian Classification of Pattern Fields

    Publication Year: 2007 , Page(s): 1280 - 1285
    Cited by:  Papers (4)  |  Patents (2)

    We formalize the notion of style context, which accounts for the increased accuracy of the field classifiers reported in this journal recently. We argue that style context forms the basis of all order-independent field classification schemes. We distinguish between intraclass style, which underlies most adaptive classifiers, and interclass style, which is a manifestation of interpattern dependence between the features of the patterns of a field. We show how style-constrained classifiers can be optimized either for field error (useful for short fields like zip codes) or for singlet error (for long fields, like business letters). We derive bounds on the reduction of error rate with field length and show that the error rate of the optimal style-constrained field classifier converges asymptotically to the error rate of a style-aware Bayesian singlet classifier.
  • Snapshots: A Novel Local Surface Descriptor and Matching Algorithm for Robust 3D Surface Alignment

    Publication Year: 2007 , Page(s): 1285 - 1290
    Cited by:  Papers (8)

    In this paper, a novel local surface descriptor is proposed and applied to the problem of aligning partial views of a 3D object. The descriptor is based on taking "snapshots" of the surface over each point using a virtual camera oriented perpendicularly to the surface. This representation has the advantages of imposing minimal loss of information, being robust to self-occlusions, and being very efficient to compute. We then describe an efficient search technique to deal with the rotation ambiguity of our representation and experimentally demonstrate the benefits of our approach, which are especially pronounced when we align views with small overlap.
  • Distinct Multicolored Region Descriptors for Object Recognition

    Publication Year: 2007 , Page(s): 1291 - 1296
    Cited by:  Papers (3)

    This paper considers the problem of object recognition. Color descriptions from distinct regions covering multiple segments are considered for object representation. Distinct multicolored regions are detected using edge maps and clustering. Performance of the proposed methodologies has been evaluated on three data sets and the results are found to be better than existing methods when a small number of training views is considered.
  • [Back inside cover]

    Publication Year: 2007 , Page(s): c3
    Freely Available from IEEE
  • [Back cover]

    Publication Year: 2007 , Page(s): c4
    Freely Available from IEEE

Aims & Scope

The IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI) is published monthly. Its editorial board strives to present the most important research results in areas within TPAMI's scope.

Meet Our Editors

Editor-in-Chief
David A. Forsyth
University of Illinois