IEEE Transactions on Pattern Analysis and Machine Intelligence

Issue 3 • March 2012

Displaying Results 1 - 22 of 22
  • [Front cover]

    Publication Year: 2012 , Page(s): c1
    PDF (178 KB)
    Freely Available from IEEE
  • [Cover 2]

    Publication Year: 2012 , Page(s): c2
    PDF (201 KB)
    Freely Available from IEEE
  • Prototype Selection for Nearest Neighbor Classification: Taxonomy and Empirical Study

    Publication Year: 2012 , Page(s): 417 - 435
    Cited by:  Papers (20)
    PDF (4726 KB) | HTML

    The nearest neighbor classifier is one of the most widely used and well-known techniques for performing recognition tasks. It has also proven to be one of the most useful algorithms in data mining in spite of its simplicity. However, the nearest neighbor classifier suffers from several drawbacks, such as high storage requirements, low classification efficiency, and low noise tolerance. These weaknesses have been studied by many researchers, and many solutions have been proposed. Among them, one of the most promising consists of reducing the data used to establish the classification rule (the training data) by selecting relevant prototypes. Many prototype selection methods exist in the literature, and research in this area is still advancing. Their definitions exhibit different properties, but no formal categorization has yet been established. This paper surveys the prototype selection methods proposed in the literature from both a theoretical and an empirical point of view. From a theoretical point of view, we propose a taxonomy based on the main characteristics of prototype selection and analyze their advantages and drawbacks. Empirically, we conduct an experimental study involving data sets of different sizes, measuring performance in terms of accuracy, reduction capability, and runtime. The results obtained by all the studied methods have been verified by nonparametric statistical tests. Several remarks, guidelines, and recommendations are made for the use of prototype selection in nearest neighbor classification.

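    As a concrete illustration of the family of methods surveyed here, below is a minimal sketch of Hart's Condensed Nearest Neighbor condensing rule, one of the earliest prototype selection methods; the NumPy implementation and function name are ours, not the paper's.

    ```python
    import numpy as np

    def condensed_nn(X, y, rng=np.random.default_rng(0)):
        """Hart's CNN condensing: keep only the prototypes needed so that a
        1-NN rule over the kept set classifies every training point correctly."""
        idx = [rng.integers(len(X))]          # seed the store with one point
        changed = True
        while changed:
            changed = False
            for i in range(len(X)):
                store = np.array(idx)
                # classify point i with 1-NN over the current store
                d = np.linalg.norm(X[store] - X[i], axis=1)
                if y[store[np.argmin(d)]] != y[i]:
                    idx.append(i)             # misclassified: absorb it
                    changed = True
        return np.array(idx)
    ```
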
  • Slow Feature Analysis for Human Action Recognition

    Publication Year: 2012 , Page(s): 436 - 450
    Cited by:  Papers (41)
    PDF (5494 KB) | HTML

    Slow Feature Analysis (SFA) extracts slowly varying features from a quickly varying input signal [1]. It has been successfully applied to modeling the visual receptive fields of cortical neurons. Substantial experimental results in neuroscience suggest that the temporal slowness principle is a general learning principle in visual perception. In this paper, we introduce the SFA framework to the problem of human action recognition by incorporating discriminative information into SFA learning and considering the spatial relationship of body parts. In particular, we consider four SFA learning strategies, including the original unsupervised SFA (U-SFA), the supervised SFA (S-SFA), the discriminative SFA (D-SFA), and the spatial discriminative SFA (SD-SFA), to extract slow feature functions from a large number of training cuboids obtained by random sampling within motion boundaries. Afterward, to represent action sequences, the squared first-order temporal derivatives are accumulated over all transformed cuboids into one feature vector, termed the Accumulated Squared Derivative (ASD) feature. The ASD feature encodes the statistical distribution of slow features in an action sequence. Finally, a linear support vector machine (SVM) is trained to classify actions represented by ASD features. We conduct extensive experiments, including two sets of control experiments, two sets of large-scale experiments on the KTH and Weizmann databases, and two sets of experiments on the CASIA and UT-interaction databases, to demonstrate the effectiveness of SFA for human action recognition. Experimental results suggest that the SFA-based approach (1) is able to extract useful motion patterns and improves recognition performance, (2) requires fewer intermediate processing steps but achieves comparable or even better performance, and (3) has good potential for recognizing complex multiperson activities.

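    To make the slowness objective concrete, here is a toy linear SFA in NumPy: whiten the signal, then keep the directions along which the temporal derivative has the least variance. This is our own minimal sketch, not the authors' supervised or discriminative variants.

    ```python
    import numpy as np

    def linear_sfa(X, n_components=2):
        """Minimal linear Slow Feature Analysis.
        X: (T, d) time series. Returns the projection onto the slowest directions."""
        X = X - X.mean(axis=0)
        # whiten the input so slowness is not confounded with scale
        U, s, Vt = np.linalg.svd(X, full_matrices=False)
        Z = X @ Vt.T / s * np.sqrt(len(X))        # whitened signal
        dZ = np.diff(Z, axis=0)                   # temporal derivative
        # slowest directions = eigenvectors of <dZ dZ^T> with smallest eigenvalues
        w, V = np.linalg.eigh(dZ.T @ dZ / len(dZ))
        return Z @ V[:, :n_components]
    ```
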
  • Altered Fingerprints: Analysis and Detection

    Publication Year: 2012 , Page(s): 451 - 464
    Cited by:  Papers (16)
    PDF (5616 KB) | HTML

    The widespread deployment of Automated Fingerprint Identification Systems (AFIS) in law enforcement and border control applications has heightened the need to ensure that these systems are not compromised. While several issues related to fingerprint system security have been investigated, including the use of fake fingerprints for masquerading identity, the problem of fingerprint alteration or obfuscation has received very little attention. Fingerprint obfuscation refers to the deliberate alteration of the fingerprint pattern by an individual for the purpose of masking his or her identity. Several cases of fingerprint obfuscation have been reported in the press. Fingerprint image quality assessment software (e.g., NFIQ) cannot always detect altered fingerprints, since the alteration may not significantly change the measured image quality. The main contributions of this paper are: 1) compiling case studies of incidents where individuals were found to have altered their fingerprints to circumvent AFIS, 2) investigating the impact of fingerprint alteration on the accuracy of a commercial fingerprint matcher, 3) classifying the alterations into three major categories and suggesting possible countermeasures, 4) developing a technique to automatically detect altered fingerprints based on analyzing the orientation field and minutiae distribution, and 5) evaluating the proposed technique and the NFIQ algorithm on a large database of altered fingerprints provided by a law enforcement agency. Experimental results show the feasibility of the proposed approach in detecting altered fingerprints and highlight the need to further pursue this problem.

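    The orientation-field analysis underlying contribution 4 can be illustrated with the standard gradient-based ridge orientation estimator; a hedged sketch in which the block size and smoothing are illustrative choices, not the authors' settings.

    ```python
    import numpy as np
    from scipy.ndimage import sobel, uniform_filter

    def orientation_field(img, block=16):
        """Standard least-squares ridge orientation estimate: average the
        doubled-angle gradient tensor over local blocks."""
        gx = sobel(img.astype(float), axis=1)
        gy = sobel(img.astype(float), axis=0)
        gxx = uniform_filter(gx * gx, block)
        gyy = uniform_filter(gy * gy, block)
        gxy = uniform_filter(gx * gy, block)
        # ridge orientation is perpendicular to the dominant gradient direction
        return 0.5 * np.arctan2(2 * gxy, gxx - gyy) + np.pi / 2
    ```
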
  • Domain Transfer Multiple Kernel Learning

    Publication Year: 2012 , Page(s): 465 - 479
    Cited by:  Papers (29)
    PDF (1431 KB) | HTML

    Cross-domain learning methods have shown promising results by leveraging labeled patterns from the auxiliary domain to learn a robust classifier for the target domain, which has only a limited number of labeled samples. To cope with the considerable change between feature distributions of different domains, we propose a new cross-domain kernel learning framework into which many existing kernel methods can be readily incorporated. Our framework, referred to as Domain Transfer Multiple Kernel Learning (DTMKL), simultaneously learns a kernel function and a robust classifier by minimizing both the structural risk functional and the distribution mismatch between the labeled and unlabeled samples from the auxiliary and target domains. Under the DTMKL framework, we also propose two novel methods by using SVM and prelearned classifiers, respectively. Comprehensive experiments on three domain adaptation data sets (i.e., TRECVID, 20 Newsgroups, and email spam data sets) demonstrate that DTMKL-based methods outperform existing cross-domain learning and multiple kernel learning methods.

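    The distribution mismatch DTMKL minimizes is the Maximum Mean Discrepancy (MMD) between source and target samples under the learned kernel combination. Below is a sketch of the (biased) squared MMD for a fixed convex combination of RBF base kernels; the weights and bandwidths are illustrative placeholders, since DTMKL learns them jointly with the classifier.

    ```python
    import numpy as np
    from sklearn.metrics.pairwise import rbf_kernel

    def mmd2(Xs, Xt, gammas=(0.1, 1.0), betas=(0.5, 0.5)):
        """Squared MMD between source Xs and target Xt under a convex
        combination of base RBF kernels, as in multiple kernel learning."""
        k = lambda A, B: sum(b * rbf_kernel(A, B, gamma=g)
                             for b, g in zip(betas, gammas))
        return k(Xs, Xs).mean() + k(Xt, Xt).mean() - 2 * k(Xs, Xt).mean()
    ```
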
  • Efficient Additive Kernels via Explicit Feature Maps

    Publication Year: 2012 , Page(s): 480 - 492
    Cited by:  Papers (58)
    PDF (1410 KB) | HTML

    Large scale nonlinear support vector machines (SVMs) can be approximated by linear ones using a suitable feature map. The linear SVMs are in general much faster to learn and evaluate (test) than the original nonlinear SVMs. This work introduces explicit feature maps for the additive class of kernels, such as the intersection, Hellinger's, and χ2 kernels, commonly used in computer vision, and enables their use in large scale problems. In particular, we: 1) provide explicit feature maps for all additive homogeneous kernels, along with closed-form expressions for all common kernels; 2) derive corresponding approximate finite-dimensional feature maps based on a spectral analysis; and 3) quantify the approximation error, showing that the error is independent of the data dimension and decays exponentially fast with the approximation order for selected kernels such as χ2. We demonstrate that the approximations have performance indistinguishable from the full kernels while greatly reducing the train/test times of SVMs. We also compare with two other approximation methods: the Nyström approximation of Perronnin et al. [1], which is data dependent, and the explicit map of Maji and Berg [2] for the intersection kernel, which, like our approximations, is data independent. The approximations are evaluated on a number of standard data sets, including Caltech-101 [3], Daimler-Chrysler pedestrians [4], and INRIA pedestrians [5].

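    scikit-learn ships a sampled explicit feature map for the additive χ2 kernel (AdditiveChi2Sampler, whose documentation cites this line of work), so the idea is easy to try; a minimal usage sketch in which the data set and parameters are our own choices:

    ```python
    from sklearn.datasets import load_digits
    from sklearn.kernel_approximation import AdditiveChi2Sampler
    from sklearn.pipeline import make_pipeline
    from sklearn.svm import LinearSVC

    # nonnegative, histogram-like features, as the chi2 kernel expects
    X, y = load_digits(return_X_y=True)

    # explicit feature map followed by a fast linear SVM
    clf = make_pipeline(AdditiveChi2Sampler(sample_steps=2), LinearSVC(C=1.0))
    clf.fit(X, y)
    print("training accuracy:", clf.score(X, y))
    ```
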
  • Fast Joint Estimation of Silhouettes and Dense 3D Geometry from Multiple Images

    Publication Year: 2012 , Page(s): 493 - 505
    Cited by:  Papers (6)
    PDF (1346 KB) | HTML

    We propose a probabilistic formulation of joint silhouette extraction and 3D reconstruction given a series of calibrated 2D images. Instead of segmenting each image separately in order to construct a 3D surface consistent with the estimated silhouettes, we compute the most probable 3D shape that gives rise to the observed color information. The probabilistic framework, based on Bayesian inference, enables robust 3D reconstruction by optimally taking into account the contribution of all views. We solve the arising maximum a posteriori shape inference in a globally optimal manner by convex relaxation techniques in a spatially continuous representation. Given interactively provided user input in the form of scribbles specifying foreground and background regions, we build corresponding color distributions as multivariate Gaussians and find a volume occupancy that best fits this data in a variational sense. Compared to classical methods for silhouette-based multiview reconstruction, the proposed approach does not depend on initialization and enjoys significant resilience to violations of the model assumptions due to background clutter, specular reflections, and camera sensor perturbations. In experiments on several real-world data sets, we show that exploiting a silhouette coherency criterion in a multiview setting allows for dramatic improvements of silhouette quality over independent 2D segmentations without any significant increase of computational effort. This results in more accurate visual hull estimation, needed by a multitude of image-based modeling approaches. We made use of recent advances in parallel computing with a GPU implementation of the proposed method generating reconstructions on volume grids of more than 20 million voxels in up to 4.41 seconds.

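    The color-model step is straightforward to sketch: fit one multivariate Gaussian to the foreground scribble pixels and one to the background, then evaluate a per-pixel log-likelihood ratio that the volumetric data term can consume. This toy version (function names are ours) omits the convex-relaxation machinery.

    ```python
    import numpy as np

    def scribble_loglik_ratio(image, fg_pixels, bg_pixels):
        """Fit one multivariate Gaussian per scribble set and return the
        per-pixel log p(color | fg) - log p(color | bg) map."""
        def fit(P):
            mu = P.mean(axis=0)
            cov = np.cov(P.T) + 1e-6 * np.eye(3)   # regularize
            return mu, np.linalg.inv(cov), np.linalg.slogdet(cov)[1]

        def loglik(C, params):
            mu, prec, logdet = params
            d = C - mu
            return -0.5 * (np.einsum('...i,ij,...j', d, prec, d) + logdet)

        C = image.reshape(-1, 3).astype(float)
        out = loglik(C, fit(fg_pixels)) - loglik(C, fit(bg_pixels))
        return out.reshape(image.shape[:2])
    ```
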
  • IrisCode Decompression Based on the Dependence between Its Bit Pairs

    Publication Year: 2012 , Page(s): 506 - 520
    Cited by:  Papers (3)  |  Patents (1)
    Multimedia
    PDF (1903 KB) | HTML

    IrisCode is an iris recognition algorithm developed in 1993 and continuously improved by Daugman. Understanding IrisCode's properties is extremely important because over 60 million people have been mathematically enrolled by the algorithm. In this paper, IrisCode is proved to be a compression algorithm, which is to say its templates are compressed iris images. In our experiments, the compression ratio of these images is 1:655. An algorithm is designed to perform this decompression by exploiting a graph composed of the bit pairs in IrisCode, prior knowledge from iris image databases, and the theoretical results. To remove artifacts, two postprocessing techniques that carry out optimization in the Fourier domain are developed. Decompressed iris images obtained from two public iris image databases are evaluated by visual comparison, two objective image quality assessment metrics, and eight iris recognition methods. The experimental results show that the decompressed iris images retain iris texture, that their quality is roughly equivalent to a JPEG quality factor of 10, and that the iris recognition methods can match the original images with the decompressed images. This paper also discusses the impacts of these theoretical and experimental findings on privacy and security.

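    The bit pairs exploited by the decompression come from Daugman-style phase quantization: each complex Gabor response yields one bit for the sign of its real part and one for its imaginary part. A toy encoding sketch follows; the 1D filter and its parameters are illustrative stand-ins, not Daugman's exact 2D filters.

    ```python
    import numpy as np

    def iriscode_bits(strip, freq=0.1, sigma=4.0):
        """Quantize complex 1D Gabor responses of an unrolled iris strip
        (rows = radii, columns = angles) into sign-of-real / sign-of-imag
        bit pairs."""
        x = np.arange(-15, 16)
        gabor = np.exp(-x**2 / (2 * sigma**2)) * np.exp(2j * np.pi * freq * x)
        resp = np.array([np.convolve(row, gabor, mode='same')
                         for row in strip.astype(float)])
        return np.stack([resp.real >= 0, resp.imag >= 0], axis=-1)
    ```
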
  • Maximum Margin Bayesian Network Classifiers

    Publication Year: 2012 , Page(s): 521 - 532
    Cited by:  Papers (5)
    Multimedia
    PDF (1191 KB) | HTML

    We present a maximum margin parameter learning algorithm for Bayesian network classifiers using a conjugate gradient (CG) method for optimization. In contrast to previous approaches, we maintain the normalization constraints on the parameters of the Bayesian network during optimization, i.e., the probabilistic interpretation of the model is not lost. This enables us to handle missing features in discriminatively optimized Bayesian networks. In experiments, we compare the classification performance of maximum margin parameter learning to conditional likelihood and maximum likelihood learning approaches. Discriminative parameter learning significantly outperforms generative maximum likelihood estimation for naive Bayes and tree augmented naive Bayes structures on all considered data sets. Furthermore, maximizing the margin dominates the conditional likelihood approach in terms of classification performance in most cases. We provide results for a recently proposed maximum margin optimization approach based on convex relaxation [1]. While the classification results are highly similar, our CG-based optimization is computationally up to orders of magnitude faster. Margin-optimized Bayesian network classifiers achieve classification performance comparable to support vector machines (SVMs) using fewer parameters. Moreover, we show that unanticipated missing feature values during classification can be easily processed by discriminatively optimized Bayesian network classifiers, a case where discriminative classifiers usually require mechanisms to complete unknown feature values in the data first.

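    The key trick of keeping the conditional probability tables normalized during gradient-based optimization is commonly realized by a softmax reparameterization, so parameter updates never leave the probability simplex. Below is a toy maximum-margin naive Bayes along those lines; the hinge-on-log-margin objective is a simplification of the paper's, and SciPy's generic CG routine stands in for the authors' optimizer.

    ```python
    import numpy as np
    from scipy.optimize import minimize

    def fit_mm_nb(X, y, C, V, gamma=1.0):
        """Toy maximum-margin naive Bayes. X: (N, D) integer features in
        {0..V-1}, y: (N,) labels in {0..C-1}. CPTs are softmax-reparameterized
        so normalization (the probabilistic reading) is preserved."""
        N, D = X.shape

        def log_joint(theta):
            prior = theta[:C] - np.logaddexp.reduce(theta[:C])
            W = theta[C:].reshape(C, D, V)
            W = W - np.logaddexp.reduce(W, axis=2, keepdims=True)
            # log p(c, x_n) = log p(c) + sum_d log p(x_nd | c)
            return prior + W[:, np.arange(D), X].sum(axis=2).T

        def loss(theta):
            J = log_joint(theta)                       # (N, C)
            true = J[np.arange(N), y]
            rival = J.copy()
            rival[np.arange(N), y] = -np.inf
            margin = true - rival.max(axis=1)          # multiclass log-margin
            return np.maximum(0.0, 1.0 - gamma * margin).sum()

        res = minimize(loss, np.zeros(C + C * D * V), method='CG')
        return res.x
    ```
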
  • Recognizing Human Actions by Learning and Matching Shape-Motion Prototype Trees

    Publication Year: 2012 , Page(s): 533 - 547
    Cited by:  Papers (14)
    PDF (2645 KB) | HTML

    A shape-motion prototype-based approach is introduced for action recognition. The approach represents an action as a sequence of prototypes for efficient and flexible action matching in long video sequences. During training, an action prototype tree is learned in a joint shape and motion space via hierarchical K-means clustering and each training sequence is represented as a labeled prototype sequence; then a look-up table of prototype-to-prototype distances is generated. During testing, based on a joint probability model of the actor location and action prototype, the actor is tracked while a frame-to-prototype correspondence is established by maximizing the joint probability, which is efficiently performed by searching the learned prototype tree; then actions are recognized using dynamic prototype sequence matching. Distance measures used for sequence matching are rapidly obtained by look-up table indexing, which is an order of magnitude faster than brute-force computation of frame-to-frame distances. Our approach enables robust action matching in challenging situations (such as moving cameras and dynamic backgrounds) and allows automatic alignment of action sequences. Experimental results demonstrate that our approach achieves recognition rates of 92.86 percent on a large gesture data set (with dynamic backgrounds), 100 percent on the Weizmann action data set, 95.77 percent on the KTH action data set, 88 percent on the UCF sports data set, and 87.27 percent on the CMU action data set.

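    The look-up-table idea is simple to sketch: learn prototypes by k-means, precompute all prototype-to-prototype distances once, and then score label sequences by table indexing instead of frame-to-frame distance computation. Our simplification below drops the hierarchical tree and the dynamic sequence alignment.

    ```python
    import numpy as np
    from sklearn.cluster import KMeans

    def build_prototypes(descriptors, k=64):
        """Quantize frame descriptors into k shape-motion prototypes and
        precompute the prototype-to-prototype distance look-up table."""
        km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(descriptors)
        P = km.cluster_centers_
        table = np.linalg.norm(P[:, None] - P[None, :], axis=2)
        return km, table

    def sequence_distance(km, table, seq_a, seq_b):
        """Distance between two equal-length sequences of frame descriptors,
        computed purely by indexing the look-up table."""
        la, lb = km.predict(seq_a), km.predict(seq_b)
        return table[la, lb].sum()
    ```
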
  • Robust Active Stereo Vision Using Kullback-Leibler Divergence

    Publication Year: 2012 , Page(s): 548 - 563
    Cited by:  Papers (3)
    PDF (2152 KB) | HTML

    Active stereo vision is a method of 3D surface scanning involving projecting and capturing a series of light patterns, where depth is derived from correspondences between the observed and projected patterns. In contrast, passive stereo vision reveals depth through correspondences between textured images from two or more cameras. By employing a projector, active stereo vision systems find correspondences between two or more cameras, without ambiguity, independent of object texture. In this paper, we present a hybrid 3D reconstruction framework that supplements projected pattern correspondence matching with texture information. The proposed scheme consists of using projected pattern data to derive initial correspondences across cameras and then using texture data to eliminate ambiguities. Pattern modulation data are then used to estimate error models from which Kullback-Leibler divergence refinement is applied to reduce misregistration errors. Using only a small number of patterns, the presented approach reduces measurement errors versus traditional structured light and phase matching methodologies while being insensitive to gamma distortion, projector flickering, and secondary reflections. Experimental results demonstrate these advantages in terms of enhanced 3D reconstruction performance in the presence of noise, deterministic distortions, and conditions of texture and depth contrast.

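    For reference, the Kullback-Leibler divergence used in the refinement step, sketched here for discrete distributions given as histograms (the error models themselves are estimated from pattern modulation data, which we do not reproduce):

    ```python
    import numpy as np

    def kl_divergence(p, q, eps=1e-12):
        """D_KL(p || q) for discrete distributions given as histograms."""
        p = np.asarray(p, float) / np.sum(p)
        q = np.asarray(q, float) / np.sum(q)
        return float(np.sum(p * (np.log(p + eps) - np.log(q + eps))))
    ```
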
  • Sampling for Shape from Focus in Optical Microscopy

    Publication Year: 2012 , Page(s): 564 - 573
    Cited by:  Papers (2)
    PDF (1821 KB) | HTML

    Shape from focus (SFF), which relies on image focus as a cue within sequenced images, represents a passive technique for recovering object shapes in scenes. Although numerous methods have been proposed recently, less attention has been paid to particular factors affecting them. For SFF, one such critical factor impacting system application is the total number of images. A large data set requires a huge amount of computational power, whereas decreasing the number of images causes shape reconstruction to be crude and erroneous. The total number of images is inversely proportional to the interframe distance, or sampling step size. In this paper, interframe distance (sampling step size) criteria for SFF systems are formulated. In particular, light ray focusing is approximated by the use of a Gaussian beam, followed by the formulation of a sampling expression using Nyquist sampling. Consequently, a fitting function for focus curves is also obtained. Experiments are performed on simulated and real objects to validate the proposed schemes.

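    A focus curve and its fitted peak are easy to make concrete: compute a per-pixel focus measure through the image stack, then interpolate the maximum with a three-point Gaussian fit (a parabola in the log domain). The Laplacian focus measure and the uniform-step assumption below are our choices, not necessarily the paper's.

    ```python
    import numpy as np
    from scipy.ndimage import laplace

    def depth_from_focus(stack, z):
        """stack: (F, H, W) image stack captured at focus positions z (F,).
        Per-pixel focus measure plus 3-point Gaussian interpolation of the peak."""
        fm = np.array([np.abs(laplace(f.astype(float))) for f in stack])
        i = np.clip(fm.argmax(axis=0), 1, len(z) - 2)       # coarse peak index
        r, c = np.indices(i.shape)
        lm = np.log(fm[i - 1, r, c] + 1e-12)
        l0 = np.log(fm[i, r, c] + 1e-12)
        lp = np.log(fm[i + 1, r, c] + 1e-12)
        # a Gaussian through three points is a parabola in the log domain
        offset = 0.5 * (lm - lp) / (lm - 2 * l0 + lp + 1e-12)
        dz = z[1] - z[0]                                    # uniform step size
        return z[i] + offset * dz
    ```
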
  • Texture Classification from Random Features

    Publication Year: 2012 , Page(s): 574 - 586
    Cited by:  Papers (34)
    PDF (2905 KB) | HTML

    Inspired by theories of sparse representation and compressed sensing, this paper presents a simple, novel, yet very powerful approach for texture classification based on random projection, suitable for large texture database applications. At the feature extraction stage, a small set of random features is extracted from local image patches. The random features are embedded into a bag-of-words model to perform texture classification; thus, learning and classification are carried out in a compressed domain. The proposed unconventional random feature extraction is simple, yet by leveraging the sparse nature of texture images, our approach outperforms traditional feature extraction methods which involve careful design and complex steps. We have conducted extensive experiments on the CUReT, Brodatz, and MSRC databases, comparing the proposed approach to four state-of-the-art texture classification methods: Patch, Patch-MRF, MR8, and LBP. We show that our approach leads to significant improvements in classification accuracy and reductions in feature dimensionality.

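    The pipeline is short enough to sketch end to end: a fixed random projection compresses raw patches, a codebook is learned in the compressed domain, and each image becomes a word histogram. The data below are random stand-ins and the dimensions are illustrative.

    ```python
    import numpy as np
    from sklearn.cluster import KMeans

    def random_bow_histogram(patches, R, km):
        """Project raw patches with a random matrix R, quantize against a
        learned codebook, and return the bag-of-words histogram."""
        Z = patches @ R.T                      # compressed-domain features
        words = km.predict(Z)
        return np.bincount(words, minlength=km.n_clusters) / len(words)

    # illustrative setup: d-dim patches compressed to m random dimensions
    rng = np.random.default_rng(0)
    d, m, k = 81, 20, 128                      # 9x9 patches, 20 random features
    R = rng.normal(size=(m, d)) / np.sqrt(m)   # random projection matrix
    train_patches = rng.random((5000, d))      # stand-in for real patch data
    km = KMeans(n_clusters=k, n_init=4, random_state=0).fit(train_patches @ R.T)
    ```
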
  • Tracking Mobile Users in Wireless Networks via Semi-Supervised Colocalization

    Publication Year: 2012 , Page(s): 587 - 600
    Cited by:  Papers (6)  |  Patents (1)
    PDF (1872 KB) | HTML

    Recent years have witnessed the growing popularity of sensor and sensor-network technologies, supporting important practical applications. One of the fundamental issues is how to accurately locate a user with few labeled data in a wireless sensor network, where a major difficulty arises from the need to label large quantities of user location data, which in turn requires knowledge about the locations of signal transmitters or access points. To solve this problem, we have developed a novel machine learning-based approach that combines collaborative filtering with graph-based semi-supervised learning to learn both mobile users' locations and the locations of access points. Our framework exploits both labeled and unlabeled data from mobile devices and access points. In our two-phase solution, we first build a manifold-based model from a batch of labeled and unlabeled data in an offline training phase and then use a weighted k-nearest-neighbor method to localize a mobile client in an online localization phase. We extend the two-phase colocalization to an online and incremental model that can deal with labeled and unlabeled data that come sequentially and adapt to environmental changes. Finally, we embed an action model into the framework so that additional kinds of sensor signals can be utilized to further boost the performance of mobile tracking. Compared to other state-of-the-art systems, our framework has been shown to be more accurate while requiring less calibration effort in our experiments performed on three different testbeds.

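    The online phase's weighted k-nearest-neighbor step is easy to sketch: match an observed RSSI vector against labeled fingerprints and average their known locations with inverse-distance weights. The variable names and the specific weighting are ours.

    ```python
    import numpy as np

    def wknn_localize(rssi, fingerprints, locations, k=3, eps=1e-6):
        """Weighted k-NN in signal space: average the locations of the k
        closest RSSI fingerprints, weighted by inverse signal distance."""
        d = np.linalg.norm(fingerprints - rssi, axis=1)
        nn = np.argsort(d)[:k]
        w = 1.0 / (d[nn] + eps)
        return (locations[nn] * w[:, None]).sum(axis=0) / w.sum()
    ```
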
  • Weakly Supervised Learning of Interactions between Humans and Objects

    Publication Year: 2012 , Page(s): 601 - 614
    Cited by:  Papers (17)
    PDF (3265 KB) | HTML

    We introduce a weakly supervised approach for learning human actions modeled as interactions between humans and objects. Our approach is human-centric: We first localize a human in the image and then determine the object relevant for the action and its spatial relation with the human. The model is learned automatically from a set of still images annotated only with the action label. Our approach relies on a human detector to initialize the model learning. For robustness to various degrees of visibility, we build a detector that learns to combine a set of existing part detectors. Starting from humans detected in a set of images depicting the action, our approach determines the action object and its spatial relation to the human. Its final output is a probabilistic model of the human-object interaction, i.e., the spatial relation between the human and the object. We present an extensive experimental evaluation on the sports action data set from [1], the PASCAL Action 2010 data set [2], and a new human-object interaction data set.

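    The spatial-relation component can be sketched as fitting a 2D Gaussian to the object's center expressed in the human box's coordinate frame; a toy stand-in for the learned probabilistic model, with the (x1, y1, x2, y2) box convention as our assumption.

    ```python
    import numpy as np

    def fit_spatial_relation(human_boxes, object_boxes):
        """Model the object's center, normalized by the human box frame,
        as a 2D Gaussian: returns the mean offset and its covariance."""
        hc = (human_boxes[:, :2] + human_boxes[:, 2:]) / 2
        oc = (object_boxes[:, :2] + object_boxes[:, 2:]) / 2
        scale = human_boxes[:, 2:] - human_boxes[:, :2]   # human width/height
        rel = (oc - hc) / scale
        return rel.mean(axis=0), np.cov(rel.T)
    ```
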
  • The Action Similarity Labeling Challenge

    Publication Year: 2012 , Page(s): 615 - 621
    Cited by:  Papers (10)
    Multimedia
    PDF (1314 KB) | HTML

    Recognizing actions in videos is rapidly becoming a topic of much research. To facilitate the development of methods for action recognition, several video collections, along with benchmark protocols, have previously been proposed. In this paper, we present a novel video database, the “Action Similarity LAbeliNg” (ASLAN) database, along with benchmark protocols. The ASLAN set includes thousands of videos collected from the web, in over 400 complex action classes. Our benchmark protocols focus on action similarity (same/not-same), rather than action classification, and testing is performed on never-before-seen actions. We propose this data set and benchmark as a means for gaining a more principled understanding of what makes actions different or similar, rather than learning the properties of particular action classes. We present baseline results on our benchmark, and compare them to human performance. To promote further study of action similarity techniques, we make the ASLAN database, benchmarks, and descriptor encodings publicly available to the research community.

  • TPAMI Seeks Applications for EIC for 2013-2014 Term

    Publication Year: 2012 , Page(s): 622
    PDF (90 KB)
    Freely Available from IEEE
  • IEEE Computer Society OnlinePlus Video Tutorial

    Publication Year: 2012 , Page(s): 623
    PDF (769 KB)
    Freely Available from IEEE
  • What's new in Transactions [advertisement]

    Publication Year: 2012 , Page(s): 624
    PDF (764 KB)
    Freely Available from IEEE
  • [Cover 3]

    Publication Year: 2012 , Page(s): c3
    PDF (201 KB)
    Freely Available from IEEE
  • [Cover 4]

    Publication Year: 2012 , Page(s): c4
    PDF (178 KB)
    Freely Available from IEEE

Aims & Scope

The IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI) is published monthly. Its editorial board strives to present the most important research results in areas within TPAMI's scope.

Full Aims & Scope

Meet Our Editors

Editor-in-Chief
David A. Forsyth
University of Illinois