IEEE Transactions on Pattern Analysis and Machine Intelligence

Issue 10 • October 2005

Displaying Results 1 - 21 of 21
  • [Front cover]

    Publication Year: 2005 , Page(s): c1
    Freely Available from IEEE
  • [Inside front cover]

    Publication Year: 2005 , Page(s): c2
    Freely Available from IEEE
  • Recognition and verification of unconstrained handwritten words

    Publication Year: 2005 , Page(s): 1509 - 1522
    Cited by:  Papers (20)

    This paper presents a novel approach for verifying the word hypotheses generated by a large-vocabulary, offline handwritten word recognition system. Given a word image, the recognition system produces a ranked list of the N-best recognition hypotheses consisting of text transcripts, segmentation boundaries of the word hypotheses into characters, and recognition scores. The verification consists of estimating the probability of each segment representing a known class of character. Character probabilities are then combined into word confidence scores, which are further integrated with the recognition scores produced by the recognition system. The N-best recognition hypothesis list is reranked based on these composite scores. Finally, rejection rules are invoked to either accept the best recognition hypothesis of the reranked list or reject the input word image. The verification approach has improved the word recognition rate as well as the reliability of the recognition system, while not causing significant delays in the recognition process. Our approach is described in detail, and experimental results on a large database of unconstrained handwritten words extracted from postal envelopes are presented.

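    As a rough illustration of the reranking-plus-rejection step described above, the sketch below combines per-character probabilities into a word confidence, fuses it with the recognizer's score, and applies a threshold-based rejection rule. The geometric-mean confidence, the fusion weight alpha, and the acceptance threshold are illustrative assumptions, not the paper's actual formulation.

```python
# Illustrative sketch of N-best reranking with a verification stage and a
# rejection rule. The geometric-mean confidence, the linear score-fusion
# weight, and the acceptance threshold are assumptions for illustration only.
from dataclasses import dataclass
from typing import List
import math

@dataclass
class Hypothesis:
    transcript: str
    recognition_score: float         # score from the word recognizer
    char_probabilities: List[float]  # verifier's per-segment character probabilities

def word_confidence(h: Hypothesis) -> float:
    """Combine character probabilities into a word confidence (geometric mean)."""
    logs = [math.log(max(p, 1e-12)) for p in h.char_probabilities]
    return math.exp(sum(logs) / len(logs))

def rerank(nbest: List[Hypothesis], alpha: float = 0.5):
    """Fuse recognition and verification scores, then rerank the N-best list."""
    scored = [(alpha * h.recognition_score + (1 - alpha) * word_confidence(h), h)
              for h in nbest]
    return sorted(scored, key=lambda s: s[0], reverse=True)

def decide(nbest: List[Hypothesis], threshold: float = 0.6):
    """Accept the top hypothesis if its composite score clears the threshold."""
    best_score, best = rerank(nbest)[0]
    return best.transcript if best_score >= threshold else None  # None = reject image

nbest = [Hypothesis("rue", 0.70, [0.9, 0.8, 0.85]),
         Hypothesis("roe", 0.72, [0.9, 0.3, 0.85])]
print(decide(nbest))
```
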
  • Guided-MLESAC: faster image transform estimation by using matching priors

    Publication Year: 2005 , Page(s): 1523 - 1535
    Cited by:  Papers (41)  |  Patents (2)

    MLESAC is an established algorithm for maximum-likelihood estimation by random sampling consensus, devised for computing multiview entities like the fundamental matrix from correspondences between image features. A shortcoming of the method is that it assumes that little is known about the prior probabilities of the validities of the correspondences. This paper explains the consequences of that omission and describes how the algorithm's theoretical standing and practical performance can be enhanced by deriving estimates of these prior probabilities. Using the priors in guided-MLESAC is found to give an order of magnitude speed increase for problems where the correspondences are described by one image transformation and clutter. This paper describes two further modifications to guided-MLESAC. The first shows how all putative matches, rather than just the best, from a particular feature can be taken forward into the sampling stage, albeit at the expense of additional computation. The second suggests how to propagate the output from one frame forward to successive frames. The additional information makes guided-MLESAC computationally realistic at video-rates for correspondence sets modeled by two transformations and clutter.

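    The sketch below illustrates the core idea of guided sampling: minimal sets are drawn with probability proportional to each correspondence's prior validity rather than uniformly. The fit_model and residuals callables, the sample size, and the inlier threshold are placeholders (e.g., for a four-point homography fit), not the paper's MLESAC scoring.

```python
# Sketch of guided sampling: minimal sets are drawn with probability
# proportional to each correspondence's prior validity, rather than uniformly
# as in plain RANSAC/MLESAC. `fit_model` and `residuals` are placeholders for,
# e.g., a homography fit; threshold and iteration count are arbitrary.
import numpy as np

def guided_sample_consensus(correspondences, priors, fit_model, residuals,
                            sample_size=4, iters=500, inlier_thresh=2.0, rng=None):
    rng = rng or np.random.default_rng(0)
    priors = np.asarray(priors, dtype=float)
    weights = priors / priors.sum()          # sampling distribution from the priors
    best_model, best_support = None, -1
    n = len(correspondences)
    for _ in range(iters):
        idx = rng.choice(n, size=sample_size, replace=False, p=weights)
        model = fit_model([correspondences[i] for i in idx])
        if model is None:
            continue
        r = residuals(model, correspondences)
        support = int(np.sum(np.abs(r) < inlier_thresh))
        if support > best_support:
            best_model, best_support = model, support
    return best_model, best_support
```
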
  • Integrating relevance feedback techniques for image retrieval using reinforcement learning

    Publication Year: 2005 , Page(s): 1536 - 1551
    Cited by:  Papers (16)

    Relevance feedback (RF) is an interactive process that refines the retrievals for a particular query by utilizing the user's feedback on previously retrieved results. Most researchers strive to develop new RF techniques and ignore the advantages of existing ones. In this paper, we propose an image relevance reinforcement learning (IRRL) model for integrating existing RF techniques in a content-based image retrieval system. Various integration schemes are presented, and a long-term shared memory is used to exploit the retrieval experience of multiple users. A concept-digesting method is also proposed to reduce storage demands. The experimental results show that integrating multiple RF approaches gives better retrieval performance than using any single RF technique alone, and that sharing relevance knowledge between multiple query sessions significantly improves performance. Furthermore, the concept-digesting technique significantly reduces the storage demand, which demonstrates the scalability of the proposed model as the database grows.

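    A toy sketch of the selection idea follows: an epsilon-greedy agent chooses among several existing RF techniques and updates a value table from user-feedback rewards. The technique names, the reward definition, and the epsilon value are invented for illustration; the actual IRRL model uses a richer state representation, long-term shared memory, and concept digesting.

```python
# Toy sketch: epsilon-greedy selection among several relevance-feedback (RF)
# techniques, keeping a running mean reward per technique. The real IRRL model
# conditions on query/iteration state and shares memory across users; this
# only illustrates the action-selection idea.
import random

class RFSelector:
    def __init__(self, techniques, epsilon=0.1):
        self.techniques = techniques                 # e.g. names of RF methods to dispatch on
        self.epsilon = epsilon
        self.value = {t: 0.0 for t in techniques}    # running mean reward per technique
        self.count = {t: 0 for t in techniques}

    def choose(self):
        if random.random() < self.epsilon:
            return random.choice(self.techniques)    # explore
        return max(self.techniques, key=lambda t: self.value[t])  # exploit

    def update(self, technique, reward):
        self.count[technique] += 1
        n = self.count[technique]
        self.value[technique] += (reward - self.value[technique]) / n

selector = RFSelector(["query_point_movement", "feature_reweighting", "svm_based"])
for _ in range(100):
    t = selector.choose()
    reward = random.random()   # stand-in for the fraction of retrieved images marked relevant
    selector.update(t, reward)
print(max(selector.value, key=selector.value.get))
```
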
  • Adaptive smoothing via contextual and local discontinuities

    Publication Year: 2005 , Page(s): 1552 - 1567
    Cited by:  Papers (49)  |  Patents (1)

    A novel adaptive smoothing approach is proposed for noise removal and feature preservation, in which two distinct measures are adopted simultaneously to detect discontinuities in an image. Inhomogeneity underlying an image is employed as a multiscale measure to detect contextual discontinuities for feature preservation and control of the smoothing speed, while the local spatial gradient is used to detect variable local discontinuities during smoothing. Unlike previous adaptive smoothing approaches, the two discontinuity measures are combined in our algorithm so that they act synergistically in preserving nontrivial features, which leads to a constrained anisotropic diffusion process in which inhomogeneity offers intrinsic constraints for selective smoothing. Thanks to these intrinsic constraints, our smoothing scheme is insensitive to the termination time, and the images produced over a wide range of iterations yield nearly identical results for various early vision tasks. Our algorithm is formally analyzed and related to anisotropic diffusion. Comparative results indicate that our algorithm yields favorable smoothing results, and its application to the extraction of hydrographic objects demonstrates its usefulness as a tool for early vision.

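    The sketch below shows one way such a constrained diffusion might be assembled: a Perona-Malik-style conductance computed from the local gradient is damped by a contextual weight derived from smoothed local variance. The variance-based inhomogeneity measure, the exponential conductance, and all constants are assumptions for illustration, not the paper's definitions.

```python
# Schematic anisotropic-diffusion step in which the usual gradient-based
# conductance is further damped by a contextual "inhomogeneity" weight
# (here approximated by smoothed local variance). The exact measures and
# constants in the paper differ; this only illustrates combining two
# discontinuity measures to constrain smoothing.
import numpy as np
from scipy.ndimage import uniform_filter

def inhomogeneity(img, window=7):
    mean = uniform_filter(img, window)
    mean_sq = uniform_filter(img * img, window)
    var = np.clip(mean_sq - mean * mean, 0, None)
    return var / (var.max() + 1e-12)           # 0 = homogeneous, 1 = highly inhomogeneous

def diffuse(img, iters=50, kappa=10.0, step=0.15):
    u = img.astype(float).copy()
    ctx = 1.0 - inhomogeneity(u)                # contextual weight, fixed during diffusion

    def conductance(d):
        # local-gradient conductance (Perona-Malik style), scaled by the contextual weight
        return np.exp(-(d / kappa) ** 2) * ctx

    for _ in range(iters):
        dn = np.roll(u, -1, axis=0) - u         # forward differences to the four neighbors
        ds = np.roll(u, 1, axis=0) - u
        de = np.roll(u, -1, axis=1) - u
        dw = np.roll(u, 1, axis=1) - u
        u = u + step * (conductance(dn) * dn + conductance(ds) * ds +
                        conductance(de) * de + conductance(dw) * dw)
    return u

noisy = np.random.default_rng(0).normal(0, 10, (64, 64)) + \
        100 * (np.arange(64)[:, None] > 32)     # noisy step edge
smoothed = diffuse(noisy)
```
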
  • Point processes for unsupervised line network extraction in remote sensing

    Publication Year: 2005 , Page(s): 1568 - 1579
    Cited by:  Papers (40)

    This paper addresses the problem of unsupervised extraction of line networks (for example, road or hydrographic networks) from remotely sensed images. We model the target line network by an object process in which the objects correspond to interacting line segments. The prior model, called "quality candy," is designed to exploit as fully as possible the topological properties of the network under consideration, while the radiometric properties of the network are modeled using a data term based on statistical tests. Two techniques are used to compute this term: one more accurate, the other more efficient. A calibration technique is used to choose the model parameters. Optimization is done via simulated annealing using a reversible jump Markov chain Monte Carlo (RJMCMC) algorithm. We accelerate convergence of the algorithm by using appropriate proposal kernels. The results obtained on satellite and aerial images are quantitatively evaluated with respect to manual extractions. A comparison with the results obtained using a previous model, called the "candy" model, shows the benefit of adding quality coefficients to the interactions in the prior density. The value of computing the data potential offline is also shown, in particular when a proposal kernel based on this computation is added to the RJMCMC algorithm.

  • Estimates of error probability for complex Gaussian channels with generalized likelihood ratio detection

    Publication Year: 2005 , Page(s): 1580 - 1591

    We derive approximate expressions for the probability of error in a two-class hypothesis testing problem in which the two hypotheses are characterized by zero-mean complex Gaussian distributions. These error expressions are given in terms of the moments of the test statistic employed and we derive these moments for both the likelihood ratio test, appropriate when class densities are known, and the generalized likelihood ratio test, appropriate when class densities must be estimated from training data. These moments are functions of class distribution parameters which are generally unknown so we develop unbiased moment estimators in terms of the training data. With these, accurate estimates of probability of error can be calculated quickly for both the optimal and plug-in rules from available training data. We present a detailed example of the behavior of these estimators and demonstrate their application to common pattern recognition problems, which include quantifying the incremental value of larger training data collections, evaluating relative geometry in data fusion from multiple sensors, and selecting a good subset of available features.

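    The following sketch shows the plug-in rule in its simplest form: class covariances are estimated from training data and a test vector is classified by the quadratic log-likelihood ratio for zero-mean complex Gaussians. The synthetic covariances and sample sizes are arbitrary, and the paper's moment-based error-probability estimates are not reproduced.

```python
# Minimal plug-in (generalized) likelihood ratio test for two zero-mean complex
# Gaussian classes: class covariances are estimated from training data and a
# test vector is assigned by the quadratic log-likelihood ratio.
import numpy as np

def sample_covariance(X):
    """X: (n_samples, dim) complex training data, zero-mean by assumption."""
    return (X.conj().T @ X) / X.shape[0]

def glrt_statistic(x, S0, S1):
    """Log-likelihood ratio of class 1 over class 0 for zero-mean complex Gaussians."""
    q0 = np.real(x.conj() @ np.linalg.solve(S0, x))
    q1 = np.real(x.conj() @ np.linalg.solve(S1, x))
    _, logdet0 = np.linalg.slogdet(S0)
    _, logdet1 = np.linalg.slogdet(S1)
    return (q0 - q1) + (logdet0 - logdet1)      # > 0 favors class 1

rng = np.random.default_rng(0)
dim, n_train = 4, 200

def complex_gaussian(cov, n):
    """Draw n zero-mean circular complex Gaussian vectors with the given covariance."""
    L = np.linalg.cholesky(cov)
    z = (rng.normal(size=(n, dim)) + 1j * rng.normal(size=(n, dim))) / np.sqrt(2)
    return z @ L.conj().T

C0, C1 = np.eye(dim), 2.0 * np.eye(dim)
S0 = sample_covariance(complex_gaussian(C0, n_train))
S1 = sample_covariance(complex_gaussian(C1, n_train))
x = complex_gaussian(C1, 1)[0]
print("assign to class", int(glrt_statistic(x, S0, S1) > 0))
```
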
  • On visualization and aggregation of nearest neighbor classifiers

    Publication Year: 2005 , Page(s): 1592 - 1602
    Cited by:  Papers (17)

    Nearest neighbor classification is one of the simplest and most popular methods for statistical pattern recognition. A major issue in k-nearest neighbor classification is how to find an optimal value of the neighborhood parameter k. In practice, this value is generally estimated by the method of cross-validation. However, the ideal value of k in a classification problem not only depends on the entire data set, but also on the specific observation to be classified. Instead of using any single value of k, this paper studies results for a finite sequence of classifiers indexed by k. Along with the usual posterior probability estimates, a new measure, called the Bayesian measure of strength, is proposed and investigated in this paper as a measure of evidence for different classes. The results of these classifiers and their corresponding estimated misclassification probabilities are visually displayed using shaded strips. These plots provide an effective visualization of the evidence in favor of different classes when a given data point is to be classified. We also propose a simple weighted averaging technique that aggregates the results of different nearest neighbor classifiers to arrive at the final decision. Based on the analysis of several benchmark data sets, the proposed method is found to be better than using a single value of k.

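    Below is a minimal sketch of aggregating nearest neighbor classifiers over a range of k by averaging their posterior estimates with uniform weights. The weighting, the range of k, and the toy data are assumptions; the paper's Bayesian measure of strength and shaded-strip visualization are not shown.

```python
# Sketch: aggregate nearest-neighbor classifiers over a range of k by averaging
# their posterior estimates (uniform weights). The paper's Bayesian measure of
# strength and its visualization are not reproduced here.
import numpy as np
from collections import Counter

def aggregated_knn_predict(X_train, y_train, x, ks=range(1, 21)):
    d = np.linalg.norm(X_train - x, axis=1)
    order = np.argsort(d)
    classes = np.unique(y_train)
    votes = Counter()
    for k in ks:
        neighbors = y_train[order[:k]]
        for c in classes:
            # posterior estimate from the k nearest neighbors, weighted uniformly over k
            votes[c] += np.mean(neighbors == c) / len(ks)
    return max(votes, key=votes.get)

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (50, 2)), rng.normal(3, 1, (50, 2))])
y = np.array([0] * 50 + [1] * 50)
print(aggregated_knn_predict(X, y, np.array([2.5, 2.5])))
```
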
  • On the use of error propagation for statistical validation of computer vision software

    Publication Year: 2005 , Page(s): 1603 - 1614
    Cited by:  Papers (1)

    Computer vision software is complex, involving many tens of thousands of lines of code. Coding mistakes are not uncommon. When vision algorithms are run on controlled data that meet all the algorithm assumptions, the results are often statistically predictable. This makes it possible to statistically validate the computer vision software and its associated theoretical derivations. In this paper, we review the general theory for some relevant kinds of statistical testing and then illustrate this experimental methodology by validating our building parameter estimation software. This software estimates the 3D positions of building vertices based on input data obtained from multi-image photogrammetric resection calculations and on 3D geometric information relating some of the points, lines, and planes of the buildings to each other.

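    The toy example below illustrates the validation methodology with ordinary least squares standing in for the authors' estimator: on controlled synthetic data, the squared estimation error normalized by the propagated covariance should follow a chi-square distribution, which is checked with a Kolmogorov-Smirnov test. All dimensions and noise levels are arbitrary.

```python
# Toy version of the validation methodology: run an estimator on controlled
# synthetic data, normalize the estimation error by its predicted (propagated)
# covariance, and test that the normalized squared error follows the chi-square
# distribution it should follow if code and theory are correct. Ordinary least
# squares stands in for the building-parameter estimation software.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n, p, sigma, trials = 50, 3, 0.5, 500
theta_true = np.array([1.0, -2.0, 0.5])
normalized_errors = []
for _ in range(trials):
    A = rng.normal(size=(n, p))
    y = A @ theta_true + rng.normal(0, sigma, n)
    theta_hat, *_ = np.linalg.lstsq(A, y, rcond=None)
    cov = sigma ** 2 * np.linalg.inv(A.T @ A)              # propagated covariance of theta_hat
    e = theta_hat - theta_true
    normalized_errors.append(e @ np.linalg.solve(cov, e))  # should be chi-square with p dof

# Kolmogorov-Smirnov test against the chi-square(p) distribution
ks_stat, p_value = stats.kstest(normalized_errors, stats.chi2(df=p).cdf)
print(f"KS p-value: {p_value:.3f}  (large value = consistent with theory)")
```
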
  • A performance evaluation of local descriptors

    Publication Year: 2005 , Page(s): 1615 - 1630
    Cited by:  Papers (1471)  |  Patents (83)

    In this paper, we compare the performance of descriptors computed for local interest regions, as, for example, extracted by the Harris-Affine detector [Mikolajczyk, K and Schmid, C, 2004]. Many different descriptors have been proposed in the literature. It is unclear which descriptors are more appropriate and how their performance depends on the interest region detector. The descriptors should be distinctive and at the same time robust to changes in viewing conditions as well as to errors of the detector. Our evaluation uses as criterion recall with respect to precision and is carried out for different image transformations. We compare shape context [Belongie, S, et al., April 2002], steerable filters [Freeman, W and Adelson, E, Sept. 1991], PCA-SIFT [Ke, Y and Sukthankar, R, 2004], differential invariants [Koenderink, J and van Doorn, A, 1987], spin images [Lazebnik, S, et al., 2003], SIFT [Lowe, D. G., 1999], complex filters [Schaffalitzky, F and Zisserman, A, 2002], moment invariants [Van Gool, L, et al., 1996], and cross-correlation for different types of interest regions. We also propose an extension of the SIFT descriptor and show that it outperforms the original method. Furthermore, we observe that the ranking of the descriptors is mostly independent of the interest region detector and that the SIFT-based descriptors perform best. Moments and steerable filters show the best performance among the low dimensional descriptors.

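    The sketch below computes a recall versus 1-precision curve for a simple thresholded nearest-neighbor matching strategy, with synthetic descriptors standing in for real detections. The matching strategy and thresholds are assumptions; the paper also evaluates other matching schemes.

```python
# Sketch of the recall vs. 1-precision criterion for descriptor evaluation.
# Descriptors from two images are matched by thresholded nearest-neighbor
# distance; matches are checked against ground-truth correspondences.
import numpy as np

def recall_precision_curve(desc_a, desc_b, gt_pairs, thresholds):
    """gt_pairs: set of (i, j) index pairs that are true correspondences."""
    # pairwise Euclidean distances between the two descriptor sets
    d = np.linalg.norm(desc_a[:, None, :] - desc_b[None, :, :], axis=2)
    nn = d.argmin(axis=1)                        # nearest neighbor in image B for each A
    nn_dist = d[np.arange(len(desc_a)), nn]
    curve = []
    for t in thresholds:
        matched = [(i, int(nn[i])) for i in range(len(desc_a)) if nn_dist[i] < t]
        if not matched:
            continue
        correct = sum((i, j) in gt_pairs for i, j in matched)
        recall = correct / max(len(gt_pairs), 1)
        one_minus_precision = 1.0 - correct / len(matched)
        curve.append((one_minus_precision, recall))
    return curve

rng = np.random.default_rng(0)
desc_a = rng.normal(size=(100, 32))
desc_b = desc_a + rng.normal(0, 0.3, size=(100, 32))   # noisy copies = true matches
gt = {(i, i) for i in range(100)}
for point in recall_precision_curve(desc_a, desc_b, gt, thresholds=[1.0, 2.0, 3.0]):
    print(point)
```
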
  • Online selection of discriminative tracking features

    Publication Year: 2005 , Page(s): 1631 - 1643
    Cited by:  Papers (337)  |  Patents (5)

    This paper presents an online feature selection mechanism for evaluating multiple features while tracking and adjusting the set of features used to improve tracking performance. Our hypothesis is that the features that best discriminate between object and background are also best for tracking the object. Given a set of seed features, we compute log likelihood ratios of class conditional sample densities from object and background to form a new set of candidate features tailored to the local object/background discrimination task. The two-class variance ratio is used to rank these new features according to how well they separate sample distributions of object and background pixels. This feature evaluation mechanism is embedded in a mean-shift tracking system that adaptively selects the top-ranked discriminative features for tracking. Examples are presented that demonstrate how this method adapts to changing appearances of both tracked object and scene background. We note susceptibility of the variance ratio feature selection method to distraction by spatially correlated background clutter and develop an additional approach that seeks to minimize the likelihood of distraction.

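    The following sketch ranks candidate features by the two-class variance ratio of their log-likelihood-ratio images, computed from object and background histograms. The histogram binning, smoothing constants, and toy 1D features are illustrative assumptions; in the tracker the candidates are linear combinations of color channels.

```python
# Sketch of variance-ratio feature ranking: for each candidate feature, form
# the log-likelihood ratio of object vs. background histograms and score it by
# the two-class variance ratio (between-class spread over within-class spread).
# Toy 1D pixel features stand in for the tracker's candidate color features.
import numpy as np

def variance_ratio(obj_vals, bg_vals, bins=32, value_range=(0, 255)):
    p, _ = np.histogram(obj_vals, bins=bins, range=value_range)
    q, _ = np.histogram(bg_vals, bins=bins, range=value_range)
    p = (p + 1e-3) / (p.sum() + bins * 1e-3)      # smoothed object histogram
    q = (q + 1e-3) / (q.sum() + bins * 1e-3)      # smoothed background histogram
    L = np.log(p / q)                             # log-likelihood-ratio feature

    def var(weights):                             # variance of L under a distribution
        m = np.sum(weights * L)
        return np.sum(weights * L * L) - m * m

    return var(0.5 * (p + q)) / (var(p) + var(q) + 1e-12)

rng = np.random.default_rng(0)
features = {
    "discriminative": (rng.normal(180, 10, 500), rng.normal(60, 10, 500)),
    "weak":           (rng.normal(120, 40, 500), rng.normal(110, 40, 500)),
}
ranked = sorted(features, key=lambda k: variance_ratio(*features[k]), reverse=True)
print(ranked)   # the more discriminative feature should rank first
```
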
  • Motion layer extraction in the presence of occlusion using graph cuts

    Publication Year: 2005 , Page(s): 1644 - 1659
    Cited by:  Papers (47)  |  Patents (1)

    Extracting layers from video is very important for video representation, analysis, compression, and synthesis. Assuming that a scene can be approximately described by multiple planar regions, this paper describes a robust and novel approach to automatically extract a set of affine or projective transformations induced by these regions, detect the occlusion pixels over multiple consecutive frames, and segment the scene into several motion layers. First, after determining a number of seed regions using correspondences in two frames, we expand the seed regions and reject the outliers employing the graph cuts method integrated with a level set representation. Second, these initial regions are merged into several initial layers according to motion similarity. Third, an occlusion order constraint on multiple frames is explored, which enforces that the occlusion area increases with the temporal order over a short period and effectively maintains segmentation consistency over multiple consecutive frames. Finally, the correct layer segmentation is obtained by using a graph cuts algorithm, and the occlusions between the overlapping layers are explicitly determined. Experimental results demonstrate that our approach is effective and robust.

  • Recognizing articulated objects using a region-based invariant transform

    Publication Year: 2005 , Page(s): 1660 - 1665
    Cited by:  Papers (4)

    In this paper, we present a new method for representing and recognizing objects, based on invariants of the object's regions. We apply the method to articulated objects in low-resolution, noisy range images. Articulated objects such as a back-hoe can have many degrees of freedom, in addition to the unknown variables of viewpoint. Recognizing such an object in an image can involve a search in a high-dimensional space that involves all these unknown variables. Here, we use invariance to reduce this search space to a manageable size. The low resolution of our range images makes it hard to use common features such as edges to find invariants. We have thus developed a new "featureless" method that does not depend on feature detection. Instead of local features, we deal with whole regions of the object. We define a "transform" that converts the image into an invariant representation on a grid, based on invariant descriptors of entire regions centered around the grid points. We use these region-based invariants for indexing and recognition. While the focus here is on articulation, the method can be easily applied to other problems such as the occlusion of fixed objects.

  • How to put probabilities on homographies

    Publication Year: 2005 , Page(s): 1666 - 1670
    Cited by:  Papers (13)

    We present a family of "normal" distributions over a matrix group together with a simple method for estimating its parameters. In particular, the mean of a set of elements can be calculated. The approach is applied to planar projective homographies, showing that using priors defined in this way improves object recognition.

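    One common way to define such a distribution and its mean is through the matrix logarithm and exponential; the sketch below computes an intrinsic (Karcher-style) mean of homographies that way. The determinant normalization and the iteration scheme are assumptions and may differ from the paper's exact parameterization.

```python
# Sketch: an intrinsic mean of homographies computed through the matrix
# logarithm/exponential, one common way to put a Gaussian-like density on a
# matrix group. This only illustrates averaging group elements rather than
# averaging matrix entries; the paper's construction may differ.
import numpy as np
from scipy.linalg import expm, logm

def normalize(H):
    """Scale a homography so its determinant is 1 (removes the scale ambiguity)."""
    return H / np.linalg.det(H) ** (1.0 / 3.0)

def intrinsic_mean(Hs, iters=20):
    mean = normalize(Hs[0])
    for _ in range(iters):
        inv = np.linalg.inv(mean)
        # average the logs of the deviations from the current mean, then step
        tangent = np.real(np.mean([logm(inv @ normalize(H)) for H in Hs], axis=0))
        mean = normalize(mean @ expm(tangent))
    return mean

rng = np.random.default_rng(0)
base = np.array([[1.0, 0.1, 5.0], [0.0, 1.0, -3.0], [0.0, 0.0, 1.0]])
Hs = [base @ expm(0.05 * rng.normal(size=(3, 3))) for _ in range(10)]
print(np.round(intrinsic_mean(Hs), 3))
```
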
  • An improved rotation-invariant thinning algorithm

    Publication Year: 2005 , Page(s): 1671 - 1674
    Cited by:  Papers (15)

    Ahmed and Ward [Sept. 1995] have recently presented an elegant, rule-based rotation-invariant thinning algorithm to produce a single-pixel wide skeleton from a binary image. We show examples where this algorithm fails on two-pixel wide lines and propose a modified method which corrects this shortcoming based on graph connectivity.

  • Clustered blockwise PCA for representing visual data

    Publication Year: 2005 , Page(s): 1675 - 1679
    Cited by:  Papers (6)

    Principal component analysis (PCA) is extensively used in computer vision and image processing. Since it provides the optimal linear subspace in a least-square sense, it has been used for dimensionality reduction and subspace analysis in various domains. However, its scalability is very limited because of its inherent computational complexity. We introduce a new framework for applying PCA to visual data which takes advantage of the spatio-temporal correlation and localized frequency variations that are typically found in such data. Instead of applying PCA to the whole volume of data (complete set of images), we partition the volume into a set of blocks and apply PCA to each block. Then, we group the subspaces corresponding to the blocks and merge them together. As a result, we not only achieve greater efficiency in the resulting representation of the visual data, but also successfully scale PCA to handle large data sets. We present a thorough analysis of the computational complexity and storage benefits of our approach. We apply our algorithm to several types of videos. We show that, in addition to its storage and speed benefits, the algorithm results in a useful representation of the visual data.

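    A simplified sketch of the blockwise idea follows: each spatial block gets its own PCA over the frames, instead of one PCA over whole frames. The block size and number of components are arbitrary, and the paper's clustering and merging of similar block subspaces is omitted for brevity.

```python
# Simplified sketch of blockwise PCA on a video volume: each spatial block gets
# its own PCA over time, instead of one PCA over whole frames. The clustering
# and merging of similar block subspaces described in the paper is omitted.
import numpy as np

def blockwise_pca(frames, block=8, n_components=4):
    """frames: (T, H, W) array; returns per-block mean, basis, and coefficients."""
    T, H, W = frames.shape
    model = {}
    for by in range(0, H, block):
        for bx in range(0, W, block):
            X = frames[:, by:by + block, bx:bx + block].reshape(T, -1)  # T x (block*block)
            mean = X.mean(axis=0)
            Xc = X - mean
            # PCA via SVD of the centered data matrix
            _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
            basis = Vt[:n_components]                 # principal directions
            coeffs = Xc @ basis.T                     # per-frame coefficients
            model[(by, bx)] = (mean, basis, coeffs)
    return model

def reconstruct(model, shape, block=8):
    T = next(iter(model.values()))[2].shape[0]
    out = np.zeros((T,) + shape)
    for (by, bx), (mean, basis, coeffs) in model.items():
        patch = coeffs @ basis + mean
        out[:, by:by + block, bx:bx + block] = patch.reshape(T, block, block)
    return out

video = np.random.default_rng(0).normal(size=(30, 32, 32))
model = blockwise_pca(video)
approx = reconstruct(model, (32, 32))
print("relative reconstruction error:", np.linalg.norm(video - approx) / np.linalg.norm(video))
```
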
  • Building k edge-disjoint spanning trees of minimum total length for isometric data embedding

    Publication Year: 2005 , Page(s): 1680 - 1683
    Cited by:  Papers (15)

    Isometric data embedding requires the construction of a neighborhood graph that spans all data points so that the geodesic distance between any pair of data points can be estimated by the distance along the shortest path between them on the graph. This paper presents an approach for constructing k-edge-connected neighborhood graphs. It works by finding k edge-disjoint spanning trees whose total length is minimized. Experiments show that it outperforms the nearest neighbor approach for geodesic distance estimation.

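    The sketch below builds k edge-disjoint spanning trees greedily, by running Kruskal's algorithm k times while excluding edges already used. This greedy scheme does not jointly minimize the total length of the k trees as the paper's method aims to, but it illustrates constructing a k-edge-connected neighborhood graph.

```python
# Greedy sketch: build k edge-disjoint spanning trees by running Kruskal's
# algorithm k times, each time excluding edges already used. This is only a
# heuristic; it does not jointly minimize the total length of the k trees.
import numpy as np
from itertools import combinations

def kruskal(n, edges, forbidden):
    parent = list(range(n))
    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]   # path compression
            a = parent[a]
        return a
    tree = []
    for _, u, v in edges:                   # edges already sorted by length
        if (u, v) in forbidden:
            continue
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            tree.append((u, v))
            if len(tree) == n - 1:
                break
    return tree if len(tree) == n - 1 else None

def k_edge_disjoint_trees(points, k=2):
    n = len(points)
    edges = sorted((np.linalg.norm(points[u] - points[v]), u, v)
                   for u, v in combinations(range(n), 2))
    used, trees = set(), []
    for _ in range(k):
        tree = kruskal(n, edges, used)
        if tree is None:
            break                            # graph no longer connected without the used edges
        trees.append(tree)
        used.update(tree)
    return trees

pts = np.random.default_rng(0).normal(size=(20, 3))
trees = k_edge_disjoint_trees(pts, k=2)
print([len(t) for t in trees])               # each spanning tree has n - 1 = 19 edges
```
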
  • [Advertisement]

    Publication Year: 2005 , Page(s): 1684
    Freely Available from IEEE
  • TPAMI Information for authors

    Publication Year: 2005 , Page(s): c3
    Freely Available from IEEE
  • [Back cover]

    Publication Year: 2005 , Page(s): c4
    Freely Available from IEEE

Aims & Scope

The IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI) is published monthly. Its editorial board strives to present the most important research results in areas within TPAMI's scope.

Meet Our Editors

Editor-in-Chief
David A. Forsyth
University of Illinois