Pattern Analysis and Machine Intelligence, IEEE Transactions on

Issue 9 • September 2006

  • [Front cover]

    Page(s): c1
  • [Inside front cover]

    Page(s): c2
  • Determination of the method of construction of 1650 B.C. wall paintings

    Page(s): 1361 - 1371

    In this paper, a methodology of general applicability is presented for answering the question of whether an artist used a number of archetypes to draw a painting or drew it freehand. First, the contour line parts of the drawn objects that potentially correspond to archetypes are spotted. Subsequently, the exact form of these archetypes and their appearance throughout the painting are determined. The method has been applied successfully to celebrated Thera Late Bronze Age wall paintings. It demonstrates that the artist, or group of artists, used seven geometrical archetypes and seven corresponding well-constructed stencils (four hyperbolae, two ellipses, and one Archimedes' spiral) to draw the wall painting "Gathering of Crocus" in 1650 B.C. This method of drawing appears to be unique in the history of art and is of great importance for archaeology and the history of mathematics and science.

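    A minimal sketch of the archetype-identification idea, in Python: fit a general conic to a contour fragment by direct least squares and classify it as an ellipse or hyperbola from its discriminant. The synthetic points and the restriction to conics (the Archimedes' spiral case is not handled) are illustrative assumptions, not the authors' implementation.

        import numpy as np

        def fit_conic(points):
            """Least-squares fit of a*x^2 + b*x*y + c*y^2 + d*x + e*y + f = 0.

            points: (N, 2) array of contour coordinates.  Returns the coefficient
            vector (a, b, c, d, e, f) as the singular vector minimizing ||D @ coeffs||.
            """
            x, y = points[:, 0], points[:, 1]
            D = np.column_stack([x * x, x * y, y * y, x, y, np.ones_like(x)])
            _, _, vt = np.linalg.svd(D)
            return vt[-1]

        def conic_type(coeffs):
            """Classify the fitted conic by the discriminant b^2 - 4ac."""
            a, b, c = coeffs[:3]
            disc = b * b - 4.0 * a * c
            return "ellipse" if disc < 0 else ("hyperbola" if disc > 0 else "parabola")

        # Illustrative check on synthetic contour points lying on an ellipse.
        t = np.linspace(0.0, 2.0 * np.pi, 200)
        ellipse_pts = np.column_stack([3.0 * np.cos(t), 2.0 * np.sin(t)])
        print(conic_type(fit_conic(ellipse_pts)))   # -> "ellipse"
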
  • Model-based hand tracking using a hierarchical Bayesian filter

    Page(s): 1372 - 1384

    This paper sets out a tracking framework, which is applied to the recovery of three-dimensional hand motion from an image sequence. The method handles the issues of initialization, tracking, and recovery in a unified way. For a single input image with no prior information about the hand pose, the algorithm is equivalent to a hierarchical detection scheme in which unlikely pose candidates are rapidly discarded. In image sequences, a dynamic model is used to guide the search and approximate the optimal filtering equations. The dynamic model is given by transition probabilities between regions in parameter space and is learned from training data obtained by capturing articulated motion. The algorithm is evaluated on a number of image sequences, which include hand motion with self-occlusion in front of a cluttered background.

  • Feature extraction using information-theoretic learning

    Page(s): 1385 - 1392

    A classification system typically consists of both a feature extractor (preprocessor) and a classifier. These two components can be trained either independently or simultaneously. The former option has an implementation advantage, since the extractor need only be trained once for use with any classifier, whereas the latter has the advantage that it can minimize classification error directly. Certain criteria, such as minimum classification error, are better suited to simultaneous training, whereas other criteria, such as mutual information, are amenable to training the feature extractor either independently or simultaneously. Herein, an information-theoretic criterion is introduced and evaluated for training the extractor independently of the classifier. The proposed method uses nonparametric estimation of Renyi's entropy to train the extractor by maximizing an approximation of the mutual information between the class labels and the output of the feature extractor. The evaluations show that the proposed method, even though it uses independent training, performs at least as well as three feature extraction methods that train the extractor and classifier simultaneously.

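    The criterion above rests on a Parzen-window estimate of Renyi's quadratic entropy, which has a closed form as a double sum of Gaussian kernels over sample pairs. A minimal sketch, assuming a one-dimensional projected feature and using the plain entropy difference H(Y) - sum_c p_c H(Y|c) as a stand-in for the mutual-information approximation actually used in the paper; the kernel width and data are illustrative.

        import numpy as np

        def renyi_quadratic_entropy(y, sigma=0.5):
            """Parzen estimate of Renyi's quadratic entropy H2(Y) = -log V(Y), where
            V(Y) = (1/N^2) * sum_ij G(y_i - y_j; 2*sigma^2) is the information
            potential of the 1-D samples y."""
            y = np.asarray(y, dtype=float)
            diff = y[:, None] - y[None, :]
            pot = np.exp(-diff ** 2 / (4.0 * sigma ** 2)) / np.sqrt(4.0 * np.pi * sigma ** 2)
            return -np.log(pot.mean())

        def mi_proxy(y, labels, sigma=0.5):
            """Entropy-difference surrogate for I(C; Y): H2(Y) - sum_c p_c H2(Y|c)."""
            y, labels = np.asarray(y, dtype=float), np.asarray(labels)
            cond = sum((labels == c).mean() * renyi_quadratic_entropy(y[labels == c], sigma)
                       for c in np.unique(labels))
            return renyi_quadratic_entropy(y, sigma) - cond

        # Illustrative use: score a candidate 1-D linear projection w of features X.
        rng = np.random.default_rng(0)
        X = np.vstack([rng.normal(0, 1, (100, 2)), rng.normal(3, 1, (100, 2))])
        labels = np.repeat([0, 1], 100)
        w = np.array([1.0, 0.0])                  # candidate projection direction
        print(mi_proxy(X @ w, labels))            # larger is better under this proxy
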
  • Diffusion maps and coarse-graining: a unified framework for dimensionality reduction, graph partitioning, and data set parameterization

    Page(s): 1393 - 1403

    We provide evidence that nonlinear dimensionality reduction, clustering, and data set parameterization can be solved within one and the same framework. The main idea is to define a system of coordinates with an explicit metric that reflects the connectivity of a given data set and that is robust to noise. Our construction, which is based on a Markov random walk on the data, offers a general scheme for simultaneously reorganizing and subsampling graphs and arbitrarily shaped data sets in high dimensions using intrinsic geometry. We show that clustering in embedding spaces is equivalent to compressing operators. The objective of data partitioning and clustering is to coarse-grain the random walk on the data while, at the same time, preserving a diffusion operator for the intrinsic geometry or connectivity of the data set, up to some accuracy. We show that the quantization distortion in diffusion space bounds the error of compression of the operator, thus giving a rigorous justification for k-means clustering in diffusion space and a precise measure of the performance of general clustering algorithms.

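    The construction above can be sketched compactly: build a Gaussian affinity matrix on the data, normalize it into a Markov transition matrix, embed each point with the leading non-trivial eigenvectors scaled by powers of the eigenvalues (the diffusion coordinates), and run k-means in that diffusion space. The kernel width, diffusion time, and toy data below are illustrative assumptions, not the authors' settings.

        import numpy as np

        def diffusion_map(X, n_coords=2, t=1, eps=1.0):
            """Diffusion-map embedding of the rows of X.

            Builds W_ij = exp(-||x_i - x_j||^2 / eps), forms the Markov matrix
            P = D^{-1} W, and returns the first non-trivial right eigenvectors of P
            scaled by lambda^t."""
            d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
            W = np.exp(-d2 / eps)
            d = W.sum(axis=1)
            A = W / np.sqrt(np.outer(d, d))          # symmetric conjugate of P
            vals, vecs = np.linalg.eigh(A)
            order = np.argsort(vals)[::-1]
            vals, vecs = vals[order], vecs[:, order]
            psi = vecs / np.sqrt(d)[:, None]         # right eigenvectors of P
            return (vals[1:n_coords + 1] ** t) * psi[:, 1:n_coords + 1]

        def kmeans(Y, k, iters=100, seed=0):
            """Plain Lloyd's k-means: the quantization step in diffusion space."""
            rng = np.random.default_rng(seed)
            centers = Y[rng.choice(len(Y), k, replace=False)]
            for _ in range(iters):
                lab = ((Y[:, None, :] - centers[None, :, :]) ** 2).sum(-1).argmin(1)
                centers = np.array([Y[lab == j].mean(axis=0) if np.any(lab == j)
                                    else centers[j] for j in range(k)])
            return lab, centers

        # Illustrative use on two noisy clusters.
        rng = np.random.default_rng(1)
        X = np.vstack([rng.normal(0, 0.3, (50, 2)), rng.normal(2, 0.3, (50, 2))])
        labels, _ = kmeans(diffusion_map(X, n_coords=2, t=2), k=2)
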
  • Pose and motion recovery from feature correspondences and a digital terrain map

    Page(s): 1404 - 1417

    A novel algorithm for pose and motion estimation using corresponding features and a digital terrain map is proposed. Using a digital terrain (or digital elevation) map (DTM/DEM) as a global reference eliminates the ambiguity present in vision-based algorithms for motion recovery. As a consequence, the absolute position and orientation of a camera can be recovered with respect to the external reference frame. To do this, the DTM is used to formulate a constraint between corresponding features in two consecutive frames. Explicit reconstruction of the 3D world is not required. When a number of feature points are considered, the resulting constraints can be solved by nonlinear optimization in terms of position, orientation, and motion. Such a procedure requires an initial guess of these parameters, which can be obtained from dead reckoning or any other source. The feasibility of the algorithm is established through extensive experimentation. Performance is compared with a state-of-the-art alternative algorithm, which first reconstructs the 3D structure and then registers it to the DTM. A clear advantage for the novel algorithm is demonstrated in a variety of scenarios.

  • Error analysis of robust optical flow estimation by least median of squares methods for the varying illumination model

    Page(s): 1418 - 1435

    The apparent pixel motion in an image sequence, called optical flow, is a useful primitive for automatic scene analysis and various other applications of computer vision. In general, however, optical flow estimation suffers from two significant problems: illumination that varies with time, and motion discontinuities induced by objects moving with respect to other objects or to the background. Various integrated approaches for solving these two problems simultaneously have been proposed; of these, those based on LMedS (least median of squares) appear to be the most robust. The goal of this paper is to carry out an error analysis of two different LMedS-based approaches, one based on standard LMedS regression and the other using a modification thereof that we proposed recently. While the estimation accuracy of any approach is expected to decrease with increasing levels of noise, for LMedS-like methods it is not always clear how much of that decrease in performance can be attributed to the fact that only a small number of randomly selected samples is used to form temporary solutions. To answer this question, our study includes a baseline implementation in which all of the image data are used to form motion estimates. We then compare the estimation errors of the two LMedS-based methods with those of the baseline implementation. Our error analysis demonstrates that, for the case of Gaussian noise, our modified LMedS approach yields better estimates at moderate levels of noise but is outperformed by the standard LMedS method as the level of noise increases. For the case of salt-and-pepper noise, the modified LMedS method consistently performs better than the standard LMedS method.

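    For orientation, the LMedS idea underlying both variants can be sketched on the classic constant-illumination brightness-constancy constraint: every pixel in a window contributes one linear equation in the flow (u, v), minimal two-pixel subsets are sampled at random, and the candidate whose squared residuals have the smallest median is kept. The window data, sample count, and the omission of the varying-illumination terms are simplifying assumptions.

        import numpy as np

        def lmeds_flow(Ix, Iy, It, n_samples=200, seed=0):
            """LMedS estimate of (u, v) from the per-pixel constraints
            Ix*u + Iy*v + It = 0 inside one window (arrays are flattened).

            Random minimal subsets of two pixels yield candidate flows; the candidate
            minimizing the median squared residual over the whole window wins."""
            Ix, Iy, It = map(np.ravel, (Ix, Iy, It))
            rng = np.random.default_rng(seed)
            best, best_med = None, np.inf
            for _ in range(n_samples):
                i, j = rng.choice(Ix.size, size=2, replace=False)
                A = np.array([[Ix[i], Iy[i]], [Ix[j], Iy[j]]])
                if abs(np.linalg.det(A)) < 1e-9:
                    continue                          # degenerate pair, skip it
                uv = np.linalg.solve(A, -np.array([It[i], It[j]]))
                med = np.median((Ix * uv[0] + Iy * uv[1] + It) ** 2)
                if med < best_med:
                    best, best_med = uv, med
            return best, best_med

        # Illustrative use: synthetic gradients for a window moving with flow (1, -2),
        # with a few grossly corrupted pixels that LMedS should ignore.
        rng = np.random.default_rng(2)
        Ix, Iy = rng.normal(size=(2, 121))
        It = -(Ix * 1.0 + Iy * (-2.0))
        It[:10] += rng.normal(0, 50, 10)              # outliers (e.g., near a motion boundary)
        print(lmeds_flow(Ix, Iy, It)[0])              # close to [1, -2]
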
  • Approximate Bayesian multibody tracking

    Page(s): 1436 - 1449

    Visual tracking of multiple targets is a challenging problem, especially when efficiency is an issue. Occlusions, if not properly handled, are a major source of failure. Solutions supporting principled occlusion reasoning have been proposed but remain impractical for online applications. This paper presents a new solution that effectively manages the trade-off between reliable modeling and computational efficiency. The hybrid joint-separable (HJS) filter is derived from a joint Bayesian formulation of the problem and shown to be efficient while optimal in terms of compact belief representation. Computational efficiency is achieved by employing a Markov random field approximation to the joint dynamics and an incremental algorithm for posterior update with an appearance likelihood that implements a physically based model of the occlusion process. A particle filter implementation is proposed which achieves accurate tracking during partial occlusions, while in cases of complete occlusion, tracking hypotheses are bound to estimated occlusion volumes. Experiments show that the proposed algorithm is efficient, robust, and able to resolve long-term occlusions between targets with identical appearance.

  • A system for learning statistical motion patterns

    Page(s): 1450 - 1464

    Analysis of motion patterns is an effective approach for anomaly detection and behavior prediction. Current approaches for the analysis of motion patterns depend on known scenes, where objects move in predefined ways. It is highly desirable to automatically construct object motion patterns that reflect the knowledge of the scene. In this paper, we present a system for automatically learning motion patterns for anomaly detection and behavior prediction, based on a proposed algorithm for robustly tracking multiple objects. In the tracking algorithm, foreground pixels are clustered using a fast, accurate fuzzy k-means algorithm. Growing and prediction of the cluster centroids of foreground pixels ensure that each cluster centroid is associated with a moving object in the scene. In the algorithm for learning motion patterns, trajectories are clustered hierarchically using spatial and temporal information, and then each motion pattern is represented with a chain of Gaussian distributions. Based on the learned statistical motion patterns, statistical methods are used to detect anomalies and predict behaviors. Our system is tested using image sequences acquired, respectively, from a crowded real traffic scene and a model traffic scene. Experimental results show the robustness of the tracking algorithm, the efficiency of the algorithm for learning motion patterns, and the encouraging performance of the algorithms for anomaly detection and behavior prediction.

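    The tracking front end above clusters foreground pixels with a fuzzy k-means (fuzzy c-means) algorithm. A minimal sketch of standard fuzzy c-means over 2-D pixel coordinates; the fuzzifier, iteration count, and synthetic data are illustrative, and the speed-ups used in the paper are not reproduced.

        import numpy as np

        def fuzzy_cmeans(X, k, m=2.0, iters=50, seed=0):
            """Standard fuzzy c-means on the rows of X.

            Returns (centers, U), where U[i, j] is the membership of sample i in
            cluster j and each row of U sums to one."""
            rng = np.random.default_rng(seed)
            U = rng.random((len(X), k))
            U /= U.sum(axis=1, keepdims=True)
            for _ in range(iters):
                W = U ** m
                centers = (W.T @ X) / W.sum(axis=0)[:, None]
                d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1) + 1e-12
                inv = d2 ** (-1.0 / (m - 1.0))
                U = inv / inv.sum(axis=1, keepdims=True)
            return centers, U

        # Illustrative use: foreground pixel coordinates from two moving blobs.
        rng = np.random.default_rng(3)
        pix = np.vstack([rng.normal((20, 30), 2, (200, 2)),
                         rng.normal((60, 45), 2, (200, 2))])
        centers, U = fuzzy_cmeans(pix, k=2)
        hard_labels = U.argmax(axis=1)                # hard assignment per pixel
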
  • Keypoint recognition using randomized trees

    Page(s): 1465 - 1479

    In many 3D object-detection and pose-estimation problems, runtime performance is of critical importance. However, there usually is time to train the system, which we show to be very useful. Assuming that several registered images of the target object are available, we develop a keypoint-based approach that is effective in this context by formulating wide-baseline matching of keypoints extracted from the input images to those found in the model images as a classification problem. This shifts much of the computational burden to a training phase without sacrificing recognition performance. The resulting algorithm is robust, accurate, and fast enough for frame-rate performance. This reduction in runtime computational complexity is our first contribution. Our second contribution is to show that, in this context, a simple and fast keypoint detector suffices to support detection and tracking even under large perspective and scale variations. Whereas earlier methods require a detector that produces highly repeatable results in general, which is usually very time-consuming, we simply find the most repeatable keypoints for the specific target object during the training phase. We have incorporated these ideas into a real-time system that detects planar, nonplanar, and deformable objects. It then estimates the pose of the rigid ones and the deformations of the others.

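    The classification view above can be illustrated with a tiny ensemble of randomized trees: each internal node applies a binary test comparing the intensities at two pixel positions of the patch around a keypoint, and each leaf stores a histogram over keypoint identities accumulated from the training views. Patch size, tree depth, and the synthetic training set below are illustrative assumptions, not the authors' configuration.

        import numpy as np

        class RandomizedTree:
            """One tree of fixed depth; each node tests patch[p1] < patch[p2]."""

            def __init__(self, depth, patch_size, n_classes, rng):
                n_internal = 2 ** depth - 1
                self.depth = depth
                self.tests = [(tuple(rng.integers(0, patch_size, 2)),
                               tuple(rng.integers(0, patch_size, 2)))
                              for _ in range(n_internal)]
                self.leaves = np.ones((2 ** depth, n_classes))   # +1 prior per class

            def _leaf(self, patch):
                node = 0
                for _ in range(self.depth):
                    p1, p2 = self.tests[node]
                    node = 2 * node + 1 + (1 if patch[p1] < patch[p2] else 0)
                return node - (2 ** self.depth - 1)

            def train(self, patches, labels):
                for patch, lab in zip(patches, labels):
                    self.leaves[self._leaf(patch), lab] += 1.0

            def posterior(self, patch):
                row = self.leaves[self._leaf(patch)]
                return row / row.sum()

        def classify(trees, patch):
            """Average the per-tree posteriors and return the best keypoint id."""
            return int(np.mean([t.posterior(patch) for t in trees], axis=0).argmax())

        # Illustrative training: a few synthetic "keypoint patches" seen under noise.
        rng = np.random.default_rng(4)
        n_classes, size = 5, 16
        prototypes = rng.random((n_classes, size, size))
        patches = [p + rng.normal(0, 0.05, (size, size)) for p in prototypes for _ in range(30)]
        labels = [c for c in range(n_classes) for _ in range(30)]
        trees = [RandomizedTree(8, size, n_classes, rng) for _ in range(10)]
        for tree in trees:
            tree.train(patches, labels)
        print(classify(trees, prototypes[3] + rng.normal(0, 0.05, (size, size))))   # ideally 3
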
  • Probabilistic fusion of stereo with color and contrast for bilayer segmentation

    Page(s): 1480 - 1492

    This paper describes models and algorithms for the real-time segmentation of foreground from background layers in stereo video sequences. Automatic separation of layers from color/contrast or from stereo alone is known to be error-prone. Here, color, contrast, and stereo matching information are fused to infer layers accurately and efficiently. The first algorithm, layered dynamic programming (LDP), solves stereo in an extended six-state space that represents both foreground/background layers and occluded regions. The stereo-match likelihood is then fused with a contrast-sensitive color model that is learned on the fly, and stereo disparities are obtained by dynamic programming. The second algorithm, layered graph cut (LGC), does not directly solve stereo. Instead, the stereo-match likelihood is marginalized over disparities to evaluate foreground and background hypotheses and then fused with a contrast-sensitive color model like the one used in LDP. Segmentation is solved efficiently by ternary graph cut. Both algorithms are evaluated with respect to ground-truth data and found to have similar performance, substantially better than either stereo or color/contrast alone. However, their characteristics with respect to computational efficiency are rather different. The algorithms are demonstrated in the application of background substitution and shown to give good-quality composite video output.

  • Polarimetric image segmentation via maximum-likelihood approximation and efficient multiphase level-sets

    Page(s): 1493 - 1500

    This study investigates a level-set method for complex polarimetric image segmentation. It consists of minimizing a functional containing an original observation term, derived from a maximum-likelihood approximation and a complex Wishart/Gaussian image representation, and a classical boundary-length prior. The minimization is carried out efficiently by a new multiphase method which embeds a simple partition constraint directly in the curve evolution, guaranteeing a partition of the image domain from an arbitrary initial partition. Results are shown on both synthetic and real images. Quantitative performance evaluation and comparisons are also given.

  • A new convexity measure based on a probabilistic interpretation of images

    Page(s): 1501 - 1512

    In this paper, we present a novel convexity measure for object shape analysis. The proposed method is based on the idea of generating pairs of points from a set and measuring the probability that a point dividing the corresponding line segment belongs to the same set. The measure is directly applicable to image functions representing shapes, and also to gray-scale images that approximate image binarizations. The approach gives rise to a variety of convexity measures, which make it possible to obtain more information about the object shape. The proposed measure turns out to be easy to implement using the fast Fourier transform, and we consider this in detail. Finally, we illustrate the behavior of our measure in different situations and compare it to other similar measures.

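    The measure described above has a direct Monte Carlo reading: draw pairs of points from the shape, pick a point dividing the segment that joins them, and estimate the probability that this point also belongs to the shape (which is 1 for a convex set). A minimal sketch on a binary image; this is the plain sampling version, not the FFT-based computation mentioned in the abstract.

        import numpy as np

        def convexity_probability(mask, n_pairs=20000, seed=0):
            """Monte Carlo estimate of P(a point on a random chord stays in the set).

            mask: 2-D boolean array, True inside the shape.  Pairs of interior pixels
            are drawn uniformly; for each pair a uniform point on the joining segment
            is tested for membership (nearest-pixel rounding)."""
            rng = np.random.default_rng(seed)
            ys, xs = np.nonzero(mask)
            a = rng.integers(0, len(ys), n_pairs)
            b = rng.integers(0, len(ys), n_pairs)
            t = rng.random(n_pairs)
            py = np.rint(ys[a] + t * (ys[b] - ys[a])).astype(int)
            px = np.rint(xs[a] + t * (xs[b] - xs[a])).astype(int)
            return float(mask[py, px].mean())

        # Illustrative shapes: a filled disc (convex) versus a ring (far from convex).
        yy, xx = np.mgrid[:101, :101]
        r2 = (yy - 50) ** 2 + (xx - 50) ** 2
        disc = r2 <= 40 ** 2
        ring = (r2 <= 40 ** 2) & (r2 >= 25 ** 2)
        print(convexity_probability(disc))            # close to 1.0
        print(convexity_probability(ring))            # noticeably lower
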
  • Camera calibration from video of a walking human

    Page(s): 1513 - 1518

    A self-calibration method to estimate a camera's intrinsic and extrinsic parameters from vertical line segments of the same height is presented. An algorithm to obtain the needed line segments by detecting the head and feet positions of a walking human in his leg-crossing phases is described. Experimental results show that the method is accurate and robust with respect to various viewing angles and subjects.

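    One building block of such a calibration is the vertical vanishing point: each head-foot pair defines an image line, and the common intersection of those lines (in the least-squares sense) is the vanishing point of the vertical direction. A minimal sketch in homogeneous coordinates; the point data are synthetic, and the remaining calibration steps (horizon estimation, focal length, pose) are not shown.

        import numpy as np

        def vertical_vanishing_point(heads, feet):
            """Least-squares intersection of the lines through head/foot pairs.

            heads, feet: (N, 2) pixel coordinates.  Each pair gives the line
            l = h x f in homogeneous coordinates; the vanishing point is the unit
            vector v minimizing sum_i (l_i . v)^2, i.e. the smallest right singular
            vector of the stacked line matrix."""
            h = np.column_stack([heads, np.ones(len(heads))])
            f = np.column_stack([feet, np.ones(len(feet))])
            lines = np.cross(h, f)
            v = np.linalg.svd(lines)[2][-1]
            return v[:2] / v[2] if abs(v[2]) > 1e-12 else v[:2]   # direction if at infinity

        # Illustrative use: head/foot pairs whose lines meet near (300, -2000).
        rng = np.random.default_rng(6)
        vp = np.array([300.0, -2000.0])
        feet = rng.uniform([100, 400], [500, 470], (8, 2))
        heads = feet + 0.12 * (vp - feet) + rng.normal(0, 0.3, (8, 2))
        print(vertical_vanishing_point(heads, feet))  # close to (300, -2000)
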
  • Total variation models for variable lighting face recognition

    Page(s): 1519 - 1524

    In this paper, we present the logarithmic total variation (LTV) model for face recognition under varying illumination, including natural lighting conditions, where we rarely know the strength, direction, or number of light sources. The proposed LTV model has the ability to factorize a single face image and obtain the illumination-invariant facial structure, which is then used for face recognition. Our model is inspired by the SQI model but has better edge-preserving ability and simpler parameter selection. A merit of this model is that it requires neither a lighting assumption nor any training. The LTV model reaches very high recognition rates in tests using both the Yale and CMU PIE face databases, as well as a face database containing 765 subjects under outdoor lighting conditions.

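    The LTV factorization works on the logarithm of the image: a total-variation decomposition log f = u + v separates the large-scale illumination component u from the small-scale facial structure v, and v is kept as the illumination-insensitive part used for recognition. A rough sketch that substitutes a smoothed-TV (ROF-style) gradient descent for the TV-L1 solver used in the paper; the step size, smoothing parameter, and fidelity weight are illustrative assumptions.

        import numpy as np

        def tv_smooth(f, lam=1.0, n_iter=500, tau=0.01, eps=0.1):
            """Gradient descent on  lam * sum |grad u|_eps + 0.5 * sum (u - f)^2,
            where |g|_eps = sqrt(g_x^2 + g_y^2 + eps^2) is a smoothed TV norm.
            Returns the large-scale (cartoon) component u."""
            u = f.copy()
            for _ in range(n_iter):
                ux = np.diff(u, axis=1, append=u[:, -1:])      # forward differences
                uy = np.diff(u, axis=0, append=u[-1:, :])
                mag = np.sqrt(ux ** 2 + uy ** 2 + eps ** 2)
                px, py = ux / mag, uy / mag
                div = (np.diff(px, axis=1, prepend=px[:, :1])
                       + np.diff(py, axis=0, prepend=py[:1, :]))
                u -= tau * (lam * (-div) + (u - f))
            return u

        def small_scale_component(image):
            """Small-scale part v of log(image) = u + v (illumination largely removed)."""
            log_f = np.log(image.astype(float) + 1.0)
            return log_f - tv_smooth(log_f)

        # Illustrative use on a synthetic image with a smooth lighting gradient.
        rng = np.random.default_rng(5)
        face = rng.random((64, 64))                            # stand-in for a face image
        lighting = np.linspace(0.5, 2.0, 64)[:, None]          # smooth illumination ramp
        v = small_scale_component(200.0 * face * lighting)     # feature for recognition
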
  • Multicue HMM-UKF for real-time contour tracking

    Page(s): 1525 - 1529

    We propose an HMM for contour detection based on multiple visual cues in the spatial domain and improve it by joint probabilistic matching to reduce background clutter. It is further integrated with an unscented Kalman filter to exploit object dynamics in nonlinear systems for robust contour tracking.

  • Statistical analysis of dynamic actions

    Page(s): 1530 - 1535

    Real-world action recognition applications require the development of systems that are fast, can handle a large variety of actions without a priori knowledge of the type of actions, need a minimal number of parameters, and require as short a learning stage as possible. In this paper, we suggest such an approach. We regard dynamic activities as long-term temporal objects, which are characterized by spatio-temporal features at multiple temporal scales. Based on this, we design a simple statistical distance measure between video sequences which captures the similarities in their behavioral content. This measure is nonparametric and can thus handle a wide range of complex dynamic actions. Having a behavior-based distance measure between sequences, we use it for a variety of tasks, including video indexing, temporal segmentation, and action-based video clustering. These tasks are performed without prior knowledge of the types of actions, their models, or their temporal extents.

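    The behavior-based distance above compares distributions of local space-time measurements taken at several temporal scales. A rough sketch, assuming the per-scale measurement is simply a histogram of 3-D (space-time) gradient magnitudes of the intensity volume and the comparison is a chi-square distance; the actual features, scales, and normalization in the paper differ.

        import numpy as np

        def clip_signature(clip, scales=(1, 2, 4), bins=32):
            """Concatenated histograms of space-time gradient magnitude, one per
            temporal scale (the clip is temporally subsampled by each factor)."""
            sig = []
            for s in scales:
                gt, gy, gx = np.gradient(clip[::s].astype(float))
                mag = np.sqrt(gt ** 2 + gy ** 2 + gx ** 2)
                hist, _ = np.histogram(mag, bins=bins, range=(0.0, 2.0))
                sig.append(hist / max(hist.sum(), 1))
            return np.concatenate(sig)

        def chi_square_distance(p, q, eps=1e-12):
            """Chi-square distance between two concatenated, normalized histograms."""
            return 0.5 * np.sum((p - q) ** 2 / (p + q + eps))

        def moving_square_clip(speed, T=40, size=32):
            """Toy clip: a bright square sweeping horizontally at a given speed."""
            clip = np.zeros((T, size, size))
            for t in range(T):
                x = int(5 + speed * t) % (size - 6)
                clip[t, 10:16, x:x + 6] = 1.0
            return clip

        # Illustrative use: clips with similar dynamics should be closer than
        # clips with clearly different dynamics.
        slow, slower, fast = (moving_square_clip(v) for v in (0.5, 0.6, 2.0))
        d_similar = chi_square_distance(clip_signature(slow), clip_signature(slower))
        d_different = chi_square_distance(clip_signature(slow), clip_signature(fast))
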
  • IEEE Computer Society celebrates two 60-year anniversaries

    Page(s): 1536
  • TPAMI Information for authors

    Page(s): c3
  • [Back cover]

    Page(s): c4

Aims & Scope

The IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI) is published monthly. Its editorial board strives to present the most important research results in areas within TPAMI's scope.

Meet Our Editors

Editor-in-Chief
David A. Forsyth
University of Illinois