
IEEE Transactions on Image Processing

Issue 11 • Nov. 2008


  • Table of contents

    Publication Year: 2008 , Page(s): C1 - C4
    PDF (41 KB)
    Freely Available from IEEE
  • IEEE Transactions on Image Processing publication information

    Publication Year: 2008 , Page(s): C2
    PDF (38 KB)
    Freely Available from IEEE
  • ε-Optimal Non-Bayesian Anomaly Detection for Parametric Tomography

    Publication Year: 2008 , Page(s): 1985 - 1999
    Cited by:  Papers (4)
    PDF (1177 KB) | HTML

    The non-Bayesian detection of an anomaly from a single or a few noisy tomographic projections is considered as a statistical hypothesis testing problem. It is supposed that a radiograph is composed of an imaged nonanomalous background medium, considered as a deterministic nuisance parameter, with a possibly hidden anomaly. Because a full voxel-by-voxel reconstruction is impossible, an original tomographic method based on parametric models of the nonanomalous background medium and of the radiographic process is proposed to fill in the missing data. Exploiting this "parametric tomography," a new detection scheme with a limited loss of optimality is proposed as an alternative to the nonlinear generalized likelihood ratio test, which is intractable in the context of nondestructive testing of objects with uncertainties in their physical/geometrical properties. The theoretical results are illustrated by the processing of real radiographs for nuclear fuel rod inspection.

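    A minimal matched-filter hypothesis test for a known anomaly signature in a noisy projection, assuming the nuisance background has already been fitted by some parametric model. This is a generic Python/NumPy sketch for orientation only, not the authors' ε-optimal detector; the background, signature, and noise level are made-up values.

        import numpy as np
        from scipy.stats import norm

        def detect_anomaly(y, background, signature, sigma, alpha=1e-3):
            """Test H1: y = background + a*signature + noise (a > 0) against H0: a = 0."""
            r = y - background                                            # residual after background removal
            stat = signature @ r / (sigma * np.linalg.norm(signature))    # ~ N(0, 1) under H0
            return stat > norm.isf(alpha), stat                           # threshold set by the false-alarm rate

        rng = np.random.default_rng(0)
        n = 256
        background = np.linspace(1.0, 2.0, n)                             # stand-in for a fitted parametric background
        signature = np.exp(-0.5 * ((np.arange(n) - 128) / 4.0) ** 2)      # hypothetical anomaly profile
        y = background + 0.3 * signature + 0.05 * rng.standard_normal(n)
        print(detect_anomaly(y, background, signature, sigma=0.05))
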
  • Visual Attention on the Sphere

    Publication Year: 2008 , Page(s): 2000 - 2014
    Cited by:  Papers (5)
    PDF (1933 KB) | HTML

    The human visual system makes extensive use of visual attention to select the most relevant information and speed up the vision process. Inspired by visual attention, several computational models have been developed, and many computer vision applications rely on such models today. However, existing algorithms are not suited to omnidirectional images, which contain a significant amount of geometrical distortion. In this paper, we present a novel computational approach that operates in spherical geometry and is thus suitable for omnidirectional images. Following one of the established models of visual attention, the spherical saliency map is obtained by fusing intensity, chromatic, and orientation spherical cue conspicuity maps, which are themselves obtained through multiscale analysis on the sphere. Finally, the consecutive maxima in the spherical saliency map represent the spots of attention on the sphere. In the experimental part, the proposed method is compared to the standard one using a synthetic image. We also provide examples of spot detection in real omnidirectional scenes which show its advantages. Finally, an experiment illustrates the homogeneity of the detected visual attention in omnidirectional images.

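    As a rough illustration of the fusion step described above, a minimal planar (non-spherical) Python/NumPy sketch: three conspicuity maps are normalized, averaged into a saliency map, and the strongest maxima are reported as spots of attention. The random maps are placeholders; the multiscale analysis on the sphere is not reproduced.

        import numpy as np

        def normalize(m):
            """Rescale a conspicuity map to [0, 1] before fusion."""
            m = m - m.min()
            return m / m.max() if m.max() > 0 else m

        def fuse_saliency(intensity, chroma, orientation):
            return (normalize(intensity) + normalize(chroma) + normalize(orientation)) / 3.0

        def top_spots(saliency, k=3):
            """Return the (row, col) locations of the k largest saliency values."""
            idx = np.argsort(saliency, axis=None)[::-1][:k]
            return [np.unravel_index(i, saliency.shape) for i in idx]

        rng = np.random.default_rng(1)
        maps = [rng.random((64, 64)) for _ in range(3)]                   # stand-ins for cue conspicuity maps
        print(top_spots(fuse_saliency(*maps)))
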
  • Deinterlacing Using Variational Methods

    Publication Year: 2008 , Page(s): 2015 - 2028
    Cited by:  Papers (11)
    PDF (2599 KB) | HTML

    We present a variational framework for deinterlacing that was originally used for inpainting and has been redeveloped for deinterlacing. From the framework, we derive a motion adaptive (MA) deinterlacer and a motion compensated (MC) deinterlacer and test them together with a selection of known deinterlacers. To illustrate the need for MC deinterlacing, the problem of details in motion (DIM) is introduced; it cannot be solved by MA or simpler deinterlacers but only by MC deinterlacers. The major problem in MC deinterlacing is computing reliable optical flow [motion estimation (ME)] on interlaced video. We discuss a number of strategies for computing optical flow on interlaced video, hoping to shed some light on this problem. With our variational MC deinterlacer, we produce results on challenging real-world video data that in most cases are indistinguishable from the ground truth.

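    A toy motion-adaptive deinterlacer, sketched to make the MA idea concrete: missing lines are filled by blending spatial line averaging with temporal field weaving according to a crude local motion measure. This is a generic MA scheme with assumed parameters, not the paper's variational MA or MC formulation.

        import numpy as np

        def ma_deinterlace(prev_frame, curr_field, parity, tau=10.0):
            """Fill the missing lines of curr_field (only lines with row % 2 == parity are valid)."""
            out = curr_field.astype(float).copy()
            h, _ = out.shape
            for r in [row for row in range(1, h - 1) if row % 2 != parity]:
                spatial = 0.5 * (out[r - 1] + out[r + 1])                 # line average from valid neighbors
                temporal = prev_frame[r].astype(float)                    # weave from the previous frame
                motion = np.abs(out[r - 1] - prev_frame[r - 1])           # crude local motion measure
                alpha = np.clip(motion / tau, 0.0, 1.0)                   # 0 = static (weave), 1 = moving (average)
                out[r] = alpha * spatial + (1.0 - alpha) * temporal
            return out

        rng = np.random.default_rng(2)
        prev = rng.integers(0, 255, (32, 32)).astype(float)
        print(ma_deinterlace(prev, prev.copy(), parity=0).shape)
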
  • Localizing Region-Based Active Contours

    Publication Year: 2008 , Page(s): 2029 - 2039
    Cited by:  Papers (166)  |  Patents (1)
    PDF (1243 KB) | HTML

    In this paper, we propose a natural framework that allows any region-based segmentation energy to be reformulated in a local way. We consider local rather than global image statistics and evolve a contour based on local information. Localized contours are capable of segmenting objects with heterogeneous feature profiles that would be difficult to capture correctly using a standard global method. The presented technique is versatile enough to be used with any global region-based active contour energy and instill in it the benefits of localization. We describe this framework and demonstrate the localization of three well-known energies in order to illustrate how our framework can be applied to any energy. We then compare each localized energy to its global counterpart to show the improvements that can be achieved. Next, an in-depth study of the behavior of these energies in response to the degree of localization is given. Finally, we show results on challenging images to illustrate the robust and accurate segmentations that are possible with this new class of active contour models.

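    A small sketch of the local statistics such energies rely on: per-pixel interior/exterior means restricted to a window around each pixel, computed with box filters. It illustrates the localization ingredient only, not the paper's curve evolution; the window radius and test image are arbitrary.

        import numpy as np
        from scipy.ndimage import uniform_filter

        def local_means(image, mask, radius=5):
            """Per-pixel means of image inside/outside mask, restricted to a local window."""
            size = 2 * radius + 1
            m = mask.astype(float)
            eps = 1e-8
            u_in = uniform_filter(image * m, size) / (uniform_filter(m, size) + eps)
            u_out = uniform_filter(image * (1 - m), size) / (uniform_filter(1 - m, size) + eps)
            return u_in, u_out

        rng = np.random.default_rng(3)
        img = rng.random((64, 64))
        mask = np.zeros((64, 64), bool)
        mask[16:48, 16:48] = True
        u_in, u_out = local_means(img, mask)
        # A localized Chan-Vese-style force at pixel x is proportional to
        # (img[x] - u_out[x])**2 - (img[x] - u_in[x])**2.
        print(u_in.mean(), u_out.mean())
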
  • The Pairing of a Wavelet Basis With a Mildly Redundant Analysis via Subband Regression

    Publication Year: 2008 , Page(s): 2040 - 2052
    Cited by:  Papers (10)
    PDF (2863 KB) | HTML

    A distinction is usually made between wavelet bases and wavelet frames. The former are associated with a one-to-one representation of signals, which is somewhat constrained but most efficient computationally. The latter are overcomplete, but they offer advantages in terms of flexibility (shape of the basis functions) and shift-invariance. In this paper, we propose a framework for improved wavelet analysis based on an appropriate pairing of a wavelet basis with a mildly redundant version of itself (frame). The processing is accomplished in four steps: 1) redundant wavelet analysis, 2) wavelet-domain processing, 3) projection of the results onto the wavelet basis, and 4) reconstruction of the signal from its nonredundant wavelet expansion. The wavelet analysis is pyramid-like and is obtained by a simple modification of Mallat's filterbank algorithm (e.g., suppression of the down-sampling in the wavelet channels only). The key component of the method is the subband regression filter (Step 3), which computes a wavelet expansion that is maximally consistent in the least squares sense with the redundant wavelet analysis. We demonstrate that this approach significantly improves the performance of soft-threshold wavelet denoising with a moderate increase in computational cost. We also show that the analysis filters in the proposed framework can be adjusted for improved feature detection; in particular, a new quincunx Mexican-hat-like wavelet transform that is fully reversible and essentially behaves like the (γ/2)th Laplacian of a Gaussian.

  • Compactly Supported Orthogonal and Biorthogonal √5-Refinement Wavelets With 4-Fold Symmetry

    Publication Year: 2008 , Page(s): 2053 - 2062
    PDF (1151 KB) | HTML

    Recently, √5-refinement hierarchical sampling has been studied, and √5-refinement has been used for surface subdivision. Compared with other refinements, such as the dyadic or quincunx refinement, √5-refinement has the special property that the nodes in a refined lattice form groups of five nodes, with these five nodes having different x and y coordinates. This special property has been shown to be very useful for adaptively representing and rendering complex and procedural geometry. When √5-refinement is used for multiresolution data processing, √5-refinement filter banks and wavelets are required. While the construction of 2-D nonseparable (bi)orthogonal wavelets with the dyadic or quincunx refinement has been studied by many researchers, the construction of (bi)orthogonal wavelets with √5-refinement has not been investigated. The main goal of this paper is to construct compactly supported orthogonal and biorthogonal wavelets with √5-refinement. In this paper, we obtain block structures of orthogonal and biorthogonal √5-refinement FIR filter banks with 4-fold rotational symmetry. We construct compactly supported orthogonal and biorthogonal wavelets based on these block structures.

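    For background (notation not taken from the paper): √5-refinement subsamples the integer lattice by a dilation matrix whose determinant has magnitude 5, i.e., a rotation by arctan 2 combined with a scaling by √5. One representative choice, written in LaTeX notation, is

        M = \begin{pmatrix} 1 & 2 \\ -2 & 1 \end{pmatrix}, \qquad
        \lvert \det M \rvert = 5, \qquad
        M = \sqrt{5}\,\begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix},
        \quad \theta = \arctan 2,

    so each coarse lattice cell corresponds to a group of five refined nodes, consistent with the grouping property described in the abstract.
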
  • Complex Wavelet Bases, Steerability, and the Marr-Like Pyramid

    Publication Year: 2008 , Page(s): 2063 - 2080
    Cited by:  Papers (14)
    PDF (3690 KB) | HTML

    Our aim in this paper is to tighten the link between wavelets, some classical image-processing operators, and David Marr's theory of early vision. The cornerstone of our approach is a new complex wavelet basis that behaves like a smoothed version of the Gradient-Laplace operator. Starting from first principles, we show that a single-generator wavelet can be defined analytically and that it yields a semi-orthogonal complex basis of L₂(ℝ²), irrespective of the dilation matrix used. We also provide an efficient FFT-based filterbank implementation. We then propose a slightly redundant version of the transform that is nearly translation-invariant and that is optimized for better steerability (Gaussian-like smoothing kernel). We call it the Marr-like wavelet pyramid because it essentially replicates the processing steps in Marr's theory of early vision. We use it to derive a primal wavelet sketch, which is a compact description of the image by a multiscale, subsampled edge map. Finally, we provide an efficient iterative algorithm for the reconstruction of an image from its primal wavelet sketch.

  • Efficient Total Variation Minimization Methods for Color Image Restoration

    Publication Year: 2008 , Page(s): 2081 - 2088
    Cited by:  Papers (15)
    PDF (699 KB) | HTML

    In this paper, we consider and study a total variation minimization model for color image restoration. In the proposed model, we use a color total variation minimization scheme to denoise the deblurred color image. An alternating minimization algorithm is employed to solve the proposed total variation minimization problem. We show the convergence of the alternating minimization algorithm and demonstrate that the algorithm is very efficient. Our experimental results show that the quality of color images restored by the proposed method is competitive with that of the other tested methods.

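    A bare-bones, single-channel sketch of total variation denoising by gradient descent on 0.5·||u − f||² + λ·TV(u) with a smoothed gradient magnitude. It illustrates the TV idea only; the paper's color model, deblurring step, and alternating minimization algorithm are not reproduced, and the step size and λ below are arbitrary.

        import numpy as np

        def tv_denoise(f, lam=0.1, step=0.1, iters=200, eps=1e-3):
            u = f.copy()
            for _ in range(iters):
                ux = np.diff(u, axis=1, append=u[:, -1:])                 # forward differences
                uy = np.diff(u, axis=0, append=u[-1:, :])
                mag = np.sqrt(ux ** 2 + uy ** 2 + eps ** 2)               # smoothed gradient magnitude
                px, py = ux / mag, uy / mag
                div = (px - np.roll(px, 1, axis=1)) + (py - np.roll(py, 1, axis=0))
                u = u - step * ((u - f) - lam * div)                      # gradient step on the energy
            return u

        rng = np.random.default_rng(4)
        clean = np.zeros((64, 64)); clean[20:44, 20:44] = 1.0
        noisy = clean + 0.2 * rng.standard_normal(clean.shape)
        print(np.abs(tv_denoise(noisy) - clean).mean() < np.abs(noisy - clean).mean())
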
  • Image Modeling and Denoising With Orientation-Adapted Gaussian Scale Mixtures

    Publication Year: 2008 , Page(s): 2089 - 2101
    Cited by:  Papers (14)
    PDF (2904 KB) | HTML

    We develop a statistical model to describe the spatially varying behavior of local neighborhoods of coefficients in a multiscale image representation. Neighborhoods are modeled as samples of a multivariate Gaussian density that are modulated and rotated according to the values of two hidden random variables, thus allowing the model to adapt to the local amplitude and orientation of the signal. A third hidden variable selects between this oriented process and a nonoriented scale mixture of Gaussians process, thus providing adaptability to the local orientedness of the signal. Based on this model, we develop an optimal Bayesian least squares estimator for denoising images and show through simulations that the resulting method exhibits significant improvement over previously published results obtained with Gaussian scale mixtures.

  • Lossless Compression of Color Sequences Using Optimal Linear Prediction Theory

    Publication Year: 2008 , Page(s): 2102 - 2111
    Cited by:  Papers (4)
    PDF (489 KB) | HTML

    In this paper, we present a novel technique that uses optimal linear prediction theory to exploit all the existing redundancies in a color video sequence for lossless compression purposes. The main idea is to introduce the spatial, spectral, and temporal correlations in the autocorrelation matrix estimate. In this way, we calculate the cross correlations between adjacent frames and adjacent color components to improve the prediction, i.e., reduce the prediction error energy. The residual image is then coded using a context-based Golomb-Rice coder, where the error modeling is provided by a quantized version of the local prediction error variance. Experimental results show that the proposed algorithm achieves good compression ratios and is robust against the scene change problem.

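    To make the entropy-coding stage concrete, a tiny Golomb-Rice coder for prediction residuals (after a zigzag mapping to non-negative integers). The Rice parameter k is fixed here; the paper's context modeling, which derives the parameter from a quantized local error variance, is not shown.

        def zigzag(e):
            """Map a signed residual to a non-negative integer."""
            return 2 * e if e >= 0 else -2 * e - 1

        def rice_encode(value, k):
            """Unary-coded quotient, terminating '0', then a k-bit remainder."""
            q, r = value >> k, value & ((1 << k) - 1)
            return "1" * q + "0" + (format(r, f"0{k}b") if k > 0 else "")

        def encode_residuals(residuals, k=2):
            return "".join(rice_encode(zigzag(e), k) for e in residuals)

        print(encode_residuals([0, -1, 3, 2, -4]))    # -> '000001101010001011'
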
  • Turbo Codes-Based Image Transmission for Channels With Multiple Types of Distortion

    Publication Year: 2008 , Page(s): 2112 - 2121
    Cited by:  Papers (4)
    PDF (381 KB) | HTML

    Product codes are generally used for progressive image transmission when random errors and packet loss (or burst errors) coexist. However, the optimal rate allocation considering both component codes gives rise to high optimization complexity. In addition, the decoding performance may degrade quickly when the channel varies beyond the design point. In this paper, we propose a new unequal error protection (UEP) scheme for progressive image transmission using rate-compatible punctured Turbo codes (RCPT) and cyclic redundancy check (CRC) codes only. By carefully interleaving each coded frame, packet loss can be converted into randomly punctured bits in a Turbo code. Therefore, error control in noisy channels with different types of errors is equivalent to dealing with random bit errors only, with reduced turbo code rates. A genetic algorithm-based method is presented to further reduce the optimization complexity. The proposed method not only gives better performance than product codes under the given channel conditions but is also more robust to channel variation. Finally, to break down the error floor of turbo decoding, we further extend the above RCPT/CRC protection to a product code scheme by adding a Reed-Solomon (RS) code across the frames. The associated rate allocation is discussed and further improvement is demonstrated.

  • Sampling-Based Correlation Estimation for Distributed Source Coding Under Rate and Complexity Constraints

    Publication Year: 2008 , Page(s): 2122 - 2137
    Cited by:  Papers (13)
    PDF (1361 KB) | HTML

    In many practical distributed source coding (DSC) applications, correlation information has to be estimated at the encoder in order to determine the encoding rate. Coding efficiency depends strongly on the accuracy of this correlation estimation. While error in estimation is inevitable, the impact of estimation error on compression efficiency has not been sufficiently studied for the DSC problem. In this paper, we study correlation estimation subject to rate and complexity constraints, and its impact on coding efficiency in a DSC framework for practical distributed image and video applications. We focus, in particular, on applications where binary correlation models are exploited for Slepian-Wolf coding and sampling techniques are used to estimate the correlation, while extensions to other correlation models are also briefly discussed. In the first part of this paper, we investigate the compression of binary data. We first propose a model to characterize the relationship between the number of samples used in estimation and the coding rate penalty, in the case of encoding of a single binary source. The model is then extended to scenarios where multiple binary sources are compressed, and based on the model we propose an algorithm to determine the number of samples allocated to different sources so that the overall rate penalty can be minimized, subject to a constraint on the total number of samples. The second part of this paper studies compression of continuous-valued data. We propose a model-based estimation for the particular but important situations where binary bit-planes are extracted from a continuous-valued input source, and each bit-plane is compressed using DSC. The proposed model-based method first estimates the source and correlation noise models using continuous-valued samples, and then uses the models to derive the bit-plane statistics analytically. We also extend the model-based estimation to the cases where bit-planes are extracted based on the significance of the data, similar to those commonly used in wavelet-based applications. Experimental results, including some based on hyperspectral image compression, demonstrate the effectiveness of the proposed algorithms.

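    A toy example of sampling-based correlation estimation for binary Slepian-Wolf coding: the crossover probability between a source and its side information is estimated from n samples and plugged into the binary entropy H(p) as an idealized coding rate. The paper's rate-penalty model and sample-allocation algorithm are not reproduced.

        import numpy as np

        def binary_entropy(p):
            p = np.clip(p, 1e-12, 1 - 1e-12)
            return -p * np.log2(p) - (1 - p) * np.log2(1 - p)

        def estimate_rate(x, y, n_samples, rng):
            idx = rng.choice(len(x), size=n_samples, replace=False)
            p_hat = np.mean(x[idx] != y[idx])                 # estimated crossover probability
            return binary_entropy(p_hat), p_hat               # H(p_hat) as the idealized Slepian-Wolf rate

        rng = np.random.default_rng(5)
        x = rng.integers(0, 2, 100_000)
        y = np.where(rng.random(100_000) < 0.08, 1 - x, x)    # true crossover probability = 0.08
        for n in (100, 1_000, 10_000):
            rate, p_hat = estimate_rate(x, y, n, rng)
            print(n, round(float(p_hat), 3), round(float(rate), 3))
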
  • Hierarchical Color Correction for Camera Cell Phone Images

    Publication Year: 2008 , Page(s): 2138 - 2155
    Cited by:  Papers (8)
    PDF (2452 KB) | HTML

    In this paper, we propose a hierarchical color correction algorithm for enhancing the color of digital images obtained from low-quality digital image capture devices such as cell phone cameras. The proposed method is based on a multilayer hierarchical stochastic framework whose parameters are learned in an offline training procedure using the well-known expectation maximization (EM) algorithm. This hierarchical framework functions by first making soft assignments of images into defect classes and then processing the images in each defect class with an optimized algorithm. The hierarchical color correction is performed in three stages. In the first stage, global color attributes of the low-quality input image are used in a Gaussian mixture model (GMM) framework to perform a soft classification of the image into M predefined global image classes. In the second stage, the input image is processed with a nonlinear color correction algorithm that is designed for each of the M global classes. This color correction algorithm, which we refer to as resolution synthesis color correction (RSCC), applies a spatially varying color correction determined by the local color attributes of the input image. In the third stage, the outputs of the RSCC predictors are combined using the global classification weights to yield the color corrected output image. We compare the performance of the proposed method to other commercial color correction algorithms on cell phone camera images obtained from different sources. Both subjective and objective measures of quality indicate that the new color correction algorithm improves quality over the existing methods.

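    A compact sketch of the first-stage soft classification: posterior weights of an image's global color-attribute vector under an M-component Gaussian mixture. In the paper these parameters come from offline EM training; the means, covariances, and feature vector below are invented for illustration.

        import numpy as np

        def gmm_responsibilities(x, weights, means, covs):
            """Return p(class m | x) for each of the M mixture components."""
            log_post = []
            for w, mu, cov in zip(weights, means, covs):
                d = x - mu
                _, logdet = np.linalg.slogdet(cov)
                quad = d @ np.linalg.solve(cov, d)
                log_post.append(np.log(w) - 0.5 * (logdet + quad + len(x) * np.log(2 * np.pi)))
            log_post = np.array(log_post)
            log_post -= log_post.max()                        # numerical stability
            post = np.exp(log_post)
            return post / post.sum()

        means = np.array([[0.3, 0.3, 0.3], [0.7, 0.6, 0.5]])  # hypothetical class means
        covs = np.array([np.eye(3) * 0.02, np.eye(3) * 0.05])
        weights = np.array([0.6, 0.4])
        x = np.array([0.65, 0.58, 0.52])                      # global color attributes of an image
        print(gmm_responsibilities(x, weights, means, covs))  # soft class assignment
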
  • Synthetic Aperture Hitchhiker Imaging

    Publication Year: 2008 , Page(s): 2156 - 2173
    Cited by:  Papers (11)
    PDF (4623 KB) | HTML

    We introduce a novel synthetic-aperture imaging method for radar systems that relies on sources of opportunity. We consider receivers that fly along arbitrary, but known, flight trajectories and develop a spatio-temporal correlation-based filtered-backprojection-type image reconstruction method. The method involves first correlating the measurements from two different receiver locations. This leads to a forward model where the radiance of the target scene is projected onto the intersection of certain hyperboloids with the surface topography. We next use microlocal techniques to develop a filtered-backprojection-type inversion method to recover the scene radiance. The method is applicable to both stationary and mobile, and cooperative and noncooperative sources of opportunity. Additionally, it is applicable to nonideal imaging scenarios such as those involving arbitrary flight trajectories, and has the desirable property of preserving the visible edges of the scene radiance. We present an analysis of the computational complexity of the image reconstruction method and demonstrate its performance in numerical simulations for single and multiple transmitters of opportunity.

  • Automatic Detection of Magnetic Flux Emergings in the Solar Atmosphere From Full-Disk Magnetogram Sequences

    Publication Year: 2008 , Page(s): 2174 - 2185
    PDF (3534 KB) | HTML

    In this paper, we present a novel method to detect Emerging Flux Regions (EFRs) in the solar atmosphere from consecutive full-disk Michelson Doppler Imager (MDI) magnetogram sequences. To our knowledge, this is the first technique developed for automatically detecting EFRs. The method includes several steps. First, the projection distortion on the MDI magnetograms is corrected. Second, the bipolar regions are extracted by applying multiscale circular harmonic filters. Third, the extracted bipolar regions are traced in consecutive MDI frames by a Kalman filter as candidate EFRs. Fourth, properties such as the positive and negative magnetic fluxes and the distance between the two polarities are measured in each frame. Finally, a feature vector is constructed for each bipolar region using the measured properties, and a Support Vector Machine (SVM) classifier is applied to distinguish EFRs from other regions. Experimental results show that the detection rate of EFRs is 96.4% and that of non-EFRs is 98.0%, with a false alarm rate of 25.7%, based on all the available MDI magnetograms in 2001 and 2002.

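    A minimal constant-velocity Kalman filter for tracing a bipolar region's centroid across frames, illustrating the tracking step only; the projection correction, circular harmonic filtering, and SVM stages are not shown, and the noise covariances are assumed values.

        import numpy as np

        dt = 1.0
        F = np.array([[1, 0, dt, 0], [0, 1, 0, dt], [0, 0, 1, 0], [0, 0, 0, 1]], float)  # constant-velocity model
        H = np.array([[1, 0, 0, 0], [0, 1, 0, 0]], float)                                # observe position only
        Q = np.eye(4) * 1e-2                                                             # process noise (assumed)
        R = np.eye(2) * 1.0                                                              # measurement noise (assumed)

        def kalman_track(measurements):
            x = np.array([*measurements[0], 0.0, 0.0])        # state: [x, y, vx, vy]
            P = np.eye(4) * 10.0
            track = []
            for z in measurements:
                x, P = F @ x, F @ P @ F.T + Q                 # predict
                K = P @ H.T @ np.linalg.inv(H @ P @ H.T + R)  # Kalman gain
                x = x + K @ (np.asarray(z) - H @ x)           # update with the measured centroid
                P = (np.eye(4) - K @ H) @ P
                track.append(x[:2].copy())
            return np.array(track)

        rng = np.random.default_rng(6)
        meas = [(10 + 2 * t + rng.normal(), 20 + 1.5 * t + rng.normal()) for t in range(10)]
        print(kalman_track(meas)[-1])
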
  • Learning the Dynamics and Time-Recursive Boundary Detection of Deformable Objects

    Publication Year: 2008 , Page(s): 2186 - 2200
    Cited by:  Papers (2)
    PDF (934 KB) | HTML

    We propose a principled framework for recursively segmenting deformable objects across a sequence of frames. We demonstrate the usefulness of this method on left ventricular segmentation across a cardiac cycle. The approach involves a technique for learning the system dynamics together with methods of particle-based smoothing as well as nonparametric belief propagation on a loopy graphical model capturing the temporal periodicity of the heart. The dynamic system state is a low-dimensional representation of the boundary, and the boundary estimation involves incorporating curve evolution into recursive state estimation. By formulating the problem as one of state estimation, the segmentation at each particular time is based not only on the data observed at that instant, but also on predictions based on past and future boundary estimates. Although this paper focuses on left ventricle segmentation, the method generalizes to temporally segmenting any deformable object.

  • Binary Partition Trees for Object Detection

    Publication Year: 2008 , Page(s): 2201 - 2216
    Cited by:  Papers (31)  |  Patents (1)
    PDF (1698 KB) | HTML

    This paper discusses the use of binary partition trees (BPTs) for object detection. BPTs are hierarchical region-based representations of images. They define a reduced set of regions that covers the image support and that spans various levels of resolution. They are attractive for object detection as they tremendously reduce the search space. In this paper, several issues related to the use of BPTs for object detection are studied. Concerning the tree construction, we analyze the compromise between computational complexity reduction and accuracy. This leads us to define two parts in the BPT: one providing accuracy and one representing the search space for the object detection task. Then we analyze and objectively compare various similarity measures for the tree construction. We conclude that different similarity criteria should be used for the part providing accuracy in the BPT and for the part defining the search space, and specific criteria are proposed for each case. Then we discuss the object detection strategy based on BPT. The notion of node extension is proposed and discussed. Finally, several object detection examples illustrating the generality of the approach and its efficiency are reported.

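    A toy binary partition tree builder in Python: starting from initial regions, the two most similar regions (here, closest mean intensity, one of many possible criteria) are merged repeatedly, and each merge is recorded as a tree node. This is a greatly simplified stand-in for the construction and search strategies analyzed in the paper.

        import numpy as np

        def build_bpt(region_means, region_sizes):
            regions = {i: (m, s) for i, (m, s) in enumerate(zip(region_means, region_sizes))}
            parent, next_id = {}, len(regions)
            while len(regions) > 1:
                ids = list(regions)
                # pick the pair of regions with the closest mean intensity
                a, b = min(((p, q) for i, p in enumerate(ids) for q in ids[i + 1:]),
                           key=lambda ab: abs(regions[ab[0]][0] - regions[ab[1]][0]))
                (ma, sa), (mb, sb) = regions.pop(a), regions.pop(b)
                regions[next_id] = ((ma * sa + mb * sb) / (sa + sb), sa + sb)   # area-weighted mean
                parent[a] = parent[b] = next_id                                 # record the tree structure
                next_id += 1
            return parent      # maps each node to its parent; the last created id is the root

        print(build_bpt([0.1, 0.15, 0.8, 0.82, 0.5], [10, 12, 20, 18, 5]))
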
  • Multistage Branch-and-Bound Merging for Planar Surface Segmentation in Disparity Space

    Publication Year: 2008 , Page(s): 2217 - 2226
    Cited by:  Papers (1)
    PDF (974 KB) | HTML

    An iterative split-and-merge framework for the segmentation of planar surfaces in disparity space is presented. The disparity of a scene is modeled by approximating the various surfaces in the scene as planar. In the split phase, the number of planar surfaces, along with the underlying plane parameters, is assumed to be known from the initialization or from the previous merge phase. Based on these parameters, planar surfaces in the disparity image are labeled to minimize the residuals between the actual disparity and the modeled disparity. The labeled planar surfaces are separated into spatially continuous regions, which are treated as candidates for the merging that follows. The regions are merged together under a maximum variance constraint while maximizing the merged area. A multistage branch-and-bound algorithm is proposed to carry out this optimization efficiently. Each stage of the branch-and-bound algorithm separates a planar surface from the set of spatially continuous regions. The multistage merging estimates the number of planar surfaces and their labeling. The splitting and the multistage merging are repeated until convergence is reached or satisfactory results are achieved. Experimental results are presented for a variety of stereo image data.

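    The planar model underlying the split phase can be made concrete with a least-squares fit of d(x, y) = a·x + b·y + c to disparity samples; labeling pixels by smallest residual and the branch-and-bound merging are not shown. The data below are synthetic.

        import numpy as np

        def fit_plane(xs, ys, disparities):
            """Least-squares fit of d = a*x + b*y + c; returns (a, b, c)."""
            A = np.column_stack([xs, ys, np.ones_like(xs)])
            coeffs, *_ = np.linalg.lstsq(A, disparities, rcond=None)
            return coeffs

        rng = np.random.default_rng(7)
        xs, ys = rng.random(500) * 100, rng.random(500) * 100
        d = 0.05 * xs - 0.02 * ys + 12.0 + 0.1 * rng.standard_normal(500)
        print(fit_plane(xs, ys, d))       # close to (0.05, -0.02, 12.0)
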
  • Fast Template Matching Based on Normalized Cross Correlation With Adaptive Multilevel Winner Update

    Publication Year: 2008 , Page(s): 2227 - 2235
    Cited by:  Papers (20)
    PDF (2031 KB) | HTML

    In this paper, we propose a fast pattern matching algorithm based on the normalized cross correlation (NCC) criterion by combining an adaptive multilevel partition with the winner update scheme to achieve a very efficient search. The winner update scheme is applied in conjunction with an upper bound for the cross correlation derived from the Cauchy-Schwarz inequality. To apply the winner update scheme in an efficient way, we partition the summation of cross correlation into different levels, with the partition order determined by the gradient energies of the partitioned regions in the template. Thus, the winner update scheme in conjunction with the upper bound for NCC can be employed to skip unnecessary calculation. Experimental results show that the proposed algorithm is very efficient for image matching under different lighting conditions.

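    For reference, plain (exhaustive) normalized cross correlation between a template and every candidate window, i.e., the criterion the paper accelerates; the multilevel partition, winner update, and Cauchy-Schwarz upper bound are not implemented in this sketch.

        import numpy as np

        def ncc_map(image, template):
            th, tw = template.shape
            t = template - template.mean()
            t_norm = np.linalg.norm(t)
            out = np.full((image.shape[0] - th + 1, image.shape[1] - tw + 1), -1.0)
            for i in range(out.shape[0]):
                for j in range(out.shape[1]):
                    w = image[i:i + th, j:j + tw]
                    w = w - w.mean()
                    denom = np.linalg.norm(w) * t_norm
                    if denom > 0:
                        out[i, j] = np.sum(w * t) / denom      # NCC score in [-1, 1]
            return out

        rng = np.random.default_rng(8)
        img = rng.random((60, 60))
        tmpl = img[20:28, 30:38].copy()
        scores = ncc_map(img, tmpl)
        print(np.unravel_index(scores.argmax(), scores.shape))   # -> (20, 30)
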
  • 3-D Object Recognition Using 2-D Views

    Publication Year: 2008 , Page(s): 2236 - 2255
    Cited by:  Papers (6)
    PDF (1218 KB) | HTML

    We consider the problem of recognizing 3-D objects from 2-D images using geometric models and assuming different viewing angles and positions. Our goal is to recognize and localize instances of specific objects (i.e., model-based) in a scene. This is in contrast to category-based object recognition methods where the goal is to search for instances of objects that belong to a certain visual category (e.g., faces or cars). The key contribution of our work is improving 3-D object recognition by integrating algebraic functions of views (AFoVs), a powerful framework for predicting the geometric appearance of an object due to viewpoint changes, with indexing and learning. During training, we compute the space of views that groups of object features can produce under the assumption of 3-D linear transformations, by combining a small number of reference views that contain the object features using AFoVs. Unrealistic views (e.g., due to the assumption of 3-D linear transformations) are eliminated by imposing a pair of rigidity constraints based on knowledge of the transformation between the reference views of the object. To represent the space of views that an object can produce compactly while allowing efficient hypothesis generation during recognition, we propose combining indexing with learning in two stages. In the first stage, we sample the space of views of an object sparsely and represent information about the samples using indexing. In the second stage, we build probabilistic models of shape appearance by sampling the space of views of the object densely and learning the manifold formed by the samples. Learning employs the expectation-maximization (EM) algorithm and takes place in a "universal," lower-dimensional space computed through random projection (RP). During recognition, we extract groups of point features from the scene and we use indexing to retrieve the most feasible model groups that might have produced them (i.e., hypothesis generation). The likelihood of each hypothesis is then computed using the probabilistic models of shape appearance. Only hypotheses ranked high enough are considered for further verification, with the most likely hypotheses verified first. The proposed approach has been evaluated using both artificial and real data, illustrating promising performance. We also present preliminary results illustrating extensions of the AFoVs framework to predict the intensity appearance of an object. In this context, we have built a hybrid recognition framework that exploits geometric knowledge to hypothesize the location of an object in the scene and both geometric and intensity information to verify the hypotheses.

  • Face Recognition Using Spatially Constrained Earth Mover's Distance

    Publication Year: 2008 , Page(s): 2256 - 2260
    Cited by:  Papers (12)
    PDF (448 KB) | HTML

    Face recognition is a challenging problem, especially when the face images are not strictly aligned (e.g., images can be captured from different viewpoints, or the faces may not be accurately cropped by a human or an automatic algorithm). In this correspondence, we investigate face recognition under scenarios with potential spatial misalignments. First, we formulate an asymmetric similarity measure based on the Spatially constrained Earth Mover's Distance (SEMD), for which the source image is partitioned into nonoverlapping local patches while the destination image is represented as a set of overlapping local patches at different positions. Assuming that faces are already roughly aligned according to the positions of their eyes, one patch in the source image can be matched only to one of its neighboring patches in the destination image under the spatial constraint of reasonably small misalignments. Because the similarity measure as defined by SEMD is asymmetric, we propose two schemes to combine the two similarity measures computed in both directions. Moreover, we adopt a distance-as-feature approach by treating the distances to the reference images as features in a kernel discriminant analysis (KDA) framework. Experiments on three benchmark face databases, namely the CMU PIE, FERET, and FRGC databases, demonstrate the effectiveness of the proposed SEMD.

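    A greedy, spatially constrained patch-matching distance, offered only as a simplified stand-in for SEMD: each non-overlapping source patch is matched to its best destination patch within a small spatial neighborhood, and the (asymmetric) total cost is returned. The patch size and shift bound are arbitrary, and no transportation problem is solved here.

        import numpy as np

        def constrained_patch_distance(src, dst, patch=8, max_shift=4):
            h, w = src.shape
            total = 0.0
            for i in range(0, h - patch + 1, patch):
                for j in range(0, w - patch + 1, patch):
                    p = src[i:i + patch, j:j + patch]
                    best = np.inf
                    for di in range(-max_shift, max_shift + 1):           # spatial constraint on matching
                        for dj in range(-max_shift, max_shift + 1):
                            ii, jj = i + di, j + dj
                            if 0 <= ii <= h - patch and 0 <= jj <= w - patch:
                                q = dst[ii:ii + patch, jj:jj + patch]
                                best = min(best, float(np.sum((p - q) ** 2)))
                    total += best
            return total      # asymmetric: distance(src, dst) != distance(dst, src) in general

        rng = np.random.default_rng(9)
        a = rng.random((32, 32))
        b = np.roll(a, 2, axis=1)          # same "face", slightly misaligned
        print(constrained_patch_distance(a, b) < constrained_patch_distance(a, rng.random((32, 32))))
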
  • IEEE Transactions on Image Processing EDICS

    Publication Year: 2008 , Page(s): 2261
    PDF (20 KB)
    Freely Available from IEEE
  • IEEE Transactions on Image Processing information for authors

    Publication Year: 2008 , Page(s): 2262 - 2263
    PDF (46 KB)
    Freely Available from IEEE

Aims & Scope

IEEE Transactions on Image Processing focuses on signal-processing aspects of image processing, imaging systems, and image scanning, display, and printing.


Meet Our Editors

Editor-in-Chief
Scott Acton
University of Virginia
Charlottesville, VA, USA
E-mail: acton@virginia.edu 
Phone: +1 434-982-2003