IEEE Transactions on Pattern Analysis and Machine Intelligence

Issue 9 • Sept. 1995

Displaying Results 1 - 11 of 11
  • Extracting a valid boundary representation from a segmented range image

    Page(s): 920 - 924

    A new approach is presented for extracting an explicit 3D shape model from a single range image. One novel aspect is that the model represents both observed object surfaces and surfaces which bound the volume of occluded space. Another novel aspect is that the approach does not require the range-image segmentation to be perfect: the low-level segmentation may be such that the model-building process encounters topology-versus-geometry conflicts. The model-building process is designed to "fail soft" in the face of such problems; the portion of the 3D model where a problem presents itself is "glued" together in a manner meant to minimize the disturbance to the 3D shape. The goal is to produce a valid boundary representation which can be processed by higher-level routines. A third novel aspect of this work is that the implementation has been evaluated on over 200 real range images of polyhedral objects, with no operator intervention and all parameters held constant, achieving a 97% success rate in creating valid b-reps.

  • Fast computation of normalized edit distances

    Page(s): 899 - 902

    The normalized edit distance (NED) between two strings X and Y is defined as the minimum quotient between the sum of weights of the edit operations required to transform X into Y and the length of the editing path corresponding to these operations. An algorithm for computing the NED was introduced by Marzal and Vidal (1993) that exhibits O(mn²) computing complexity, where m and n are the lengths of X and Y. We propose here an algorithm that is observed to require in practice the same O(mn) computing resources as the conventional unnormalized edit distance algorithm does. The performance of this algorithm is illustrated through computational experiments with synthetic data, as well as with real data consisting of OCR chain-coded strings.
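
The definition above can be evaluated directly with a dynamic program over the editing-path length; a minimal sketch with unit weights, counting matches as zero-weight operations (this is the straightforward O(mn(m+n)) formulation of the definition, not the fast algorithm the paper proposes):

```python
def ned(x, y, w_ins=1.0, w_del=1.0, w_sub=1.0):
    """Exact normalized edit distance by enumerating path lengths.

    d_k[i][j] = least total weight of an editing path of exactly k
    operations (matches count as zero-weight operations) turning
    x[:i] into y[:j].  NED = min over k of d_k[m][n] / k.
    """
    m, n = len(x), len(y)
    if m == 0 and n == 0:
        return 0.0
    INF = float("inf")
    prev = [[INF] * (n + 1) for _ in range(m + 1)]
    prev[0][0] = 0.0  # zero operations edit "" into ""
    best = INF
    for k in range(1, m + n + 1):  # path length never exceeds m + n
        cur = [[INF] * (n + 1) for _ in range(m + 1)]
        for i in range(m + 1):
            for j in range(n + 1):
                c = INF
                if i > 0:
                    c = min(c, prev[i - 1][j] + w_del)
                if j > 0:
                    c = min(c, prev[i][j - 1] + w_ins)
                if i > 0 and j > 0:
                    sub = 0.0 if x[i - 1] == y[j - 1] else w_sub
                    c = min(c, prev[i - 1][j - 1] + sub)
                cur[i][j] = c
        if cur[m][n] < INF:
            best = min(best, cur[m][n] / k)
        prev = cur
    return best
```

For example, editing "ab" into "aab" takes a match, a unit-weight insertion, and a match, so the NED is 1/3.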

  • Character recognition without segmentation

    Page(s): 903 - 909

    A segmentation-free approach to OCR is presented as part of a knowledge-based word interpretation model. It is based on the recognition of subgraphs homeomorphic to previously defined prototypes of characters. Gaps are identified as potential parts of characters by implementing a variant of the notion of relative neighborhood used in computational perception. Each subgraph of strokes that matches a previously defined character prototype is recognized anywhere in the word, even if it corresponds to a broken character or to a character touching another one. The characters are detected in the order defined by the matching quality. Each subgraph that is recognized is introduced as a node in a directed net that compiles the different interpretation alternatives for the features in the feature graph. A path in the net represents a consistent succession of characters, and a final search for the optimal path under certain criteria gives the best interpretation of the word features. Broken characters are recognized by looking for gaps between features that may be interpreted as part of a character; touching characters are recognized because the matching allows nonmatched adjacent strokes. The recognition results for over 24,000 printed numeral characters belonging to a USPS database, and for some hand-printed words, confirmed the method's high level of robustness.
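
The final search over the directed net can be illustrated as a best-path problem over candidate detections; a toy sketch, assuming a hypothetical Candidate record holding pixel extents and a matching score (not the authors' data structures):

```python
from typing import List, NamedTuple, Tuple

class Candidate(NamedTuple):
    # hypothetical record for one recognized subgraph
    start: int    # leftmost extent of the matched strokes
    end: int      # rightmost extent
    char: str     # character prototype that matched
    score: float  # matching quality

def best_interpretation(cands: List[Candidate]) -> Tuple[str, float]:
    """Highest-scoring chain of non-overlapping candidates, left to
    right; each chain is one consistent succession of characters."""
    if not cands:
        return "", 0.0
    cands = sorted(cands, key=lambda c: c.end)
    # best[i] = (total score, text) of the best chain ending at cands[i]
    best: List[Tuple[float, str]] = []
    for i, c in enumerate(cands):
        s, w = c.score, c.char
        for j in range(i):
            if cands[j].end <= c.start and best[j][0] + c.score > s:
                s, w = best[j][0] + c.score, best[j][1] + c.char
        best.append((s, w))
    s, w = max(best)  # tuples compare by score first
    return w, s
```

Overlapping candidates (e.g. a "4" spanning the same strokes as a "1" and a "0") compete, and the chain with the best cumulative matching quality wins.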

  • Principal component analysis with missing data and its application to polyhedral object modeling

    Page(s): 854 - 867

    Observation-based object modeling often requires integration of shape descriptions from different views. To overcome the problems of errors and their accumulation, we have developed a weighted least-squares (WLS) approach which simultaneously recovers object shape and the transformation among different views without recovering interframe motion. We show that object modeling from a range image sequence is a problem of principal component analysis with missing data (PCAMD), which can be generalized as a WLS minimization problem. An efficient algorithm is devised. After we have segmented planar surface regions in each view and tracked them over the image sequence, we construct a normal measurement matrix of surface normals and a distance measurement matrix of normal distances to the origin, for all visible regions over the whole sequence of views. These two matrices, which have many missing elements due to noise, occlusion, and mismatching, enable us to formulate multiple-view merging as a combination of two WLS problems, and a two-step algorithm is presented. After surface equations are extracted, spatial connectivity among the surfaces is established to enable the polyhedral object model to be constructed. Experiments using synthetic data and real range images show that our approach is robust against noise and mismatching and generates accurate polyhedral object models.

  • Active/dynamic stereo vision

    Page(s): 868 - 879

    Visual navigation is a challenging issue in automated robot control. In many robot applications, such as object manipulation in hazardous environments or autonomous locomotion, it is necessary to automatically detect and avoid obstacles while planning a safe trajectory. In this context the detection of corridors of free space along the robot trajectory is a very important capability which requires nontrivial visual processing. In most cases it is possible to take advantage of active control of the cameras. In this paper we propose a cooperative scheme in which motion and stereo vision are used to infer scene structure and determine free-space areas. Binocular disparity, computed on several stereo images over time, is combined with optical flow from the same sequence to obtain a relative-depth map of the scene. Both the time to impact and the depth scaled by the distance of the camera from the fixation point in space are considered good relative measurements: they are based on the viewer but centered on the environment. The need for calibrated parameters is considerably reduced by using an active control strategy. The cameras track a point in space independently of the robot motion, and the full rotation of the head, which includes the unknown robot motion, is derived from binocular image data. The feasibility of the approach in real robotic applications is demonstrated by several experiments performed on real image data acquired from an autonomous vehicle and a prototype camera head.
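
Two of the relative measurements mentioned above can be illustrated with textbook formulas; a minimal sketch under a parallel-axis camera model with known focal length and baseline (assumptions the paper's active, fixation-centered formulation is designed to relax):

```python
def depth_from_disparity(focal_px, baseline_m, disparity_px):
    """Textbook parallel-axis stereo: Z = f * B / d."""
    return focal_px * baseline_m / disparity_px

def time_to_impact(depth_m, approach_speed_mps):
    """Time to impact tau = Z / Zdot for a constant approach speed."""
    return depth_m / approach_speed_mps
```

With a 500-pixel focal length and a 0.1 m baseline, a 5-pixel disparity places a point 10 m away; approaching it at 2 m/s gives a 5 s time to impact.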

  • Integrating vision modules: stereo, shading, grouping, and line labeling

    Page(s): 831 - 842

    It is generally agreed that individual visual cues are fallible and often ambiguous. This has generated considerable interest in the design of integrated vision systems, which are expected to perform reliably in practical situations. The design of such systems is challenging because each vision module works under a different, and possibly conflicting, set of assumptions. We have proposed and implemented a multiresolution system which integrates perceptual organization (grouping), segmentation, stereo, shape from shading, and line-labeling modules. We demonstrate the efficacy of our approach using images of several different realistic scenes. The output of the integrated system is shown to be insensitive to the constraints imposed by the individual modules. The numerical accuracy of the recovered depth is assessed in the case of synthetically generated data. Finally, we have qualitatively evaluated our approach by reconstructing geons from the depth data obtained from the integrated system. These results indicate that integrated vision systems are likely to produce better reconstructions of the input scene than the individual modules.

  • Arrangement: a spatial relation between parts for evaluating similarity of tomographic section

    Page(s): 880 - 893

    Medical tomographic images are formed by the intersection of the image plane and an object. As the image plane changes, different parts of the object come into view or drop out of view. For small changes of the image plane, however, most parts remain visible and their qualitative embedding in the image remains similar. Therefore, similarity of part embeddings can be used to infer similarity of image planes. Part embeddings are useful features for other vision applications as well. In view of this, a spatial relation called “arrangement” is proposed to describe part embeddings. The relation describes how each part is surrounded by its neighbors. Further, a metric for arrangements is formulated by expressing arrangements in terms of the Voronoi diagram of the parts. Arrangements and their metric are used to retrieve images by image-plane similarity in a cardiac magnetic resonance image database. Experiments with the database are reported which (1) validate the observation that similarity of image planes can be inferred from similarity of part embeddings, and (2) compare the performance of arrangement-based image retrieval with that of expert radiologists.
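
The arrangement relation rests on knowing which parts border which in the Voronoi diagram; a toy discrete-Voronoi sketch on a pixel grid, assuming parts reduced to point sites (a simplification of the paper's construction):

```python
def voronoi_neighbors(sites, width, height):
    """Discrete Voronoi diagram on a grid: each cell is labeled with its
    nearest site, and two sites are neighbors when their cells touch.
    sites: list of (x, y) points standing in for part locations."""
    def nearest(x, y):
        return min(range(len(sites)),
                   key=lambda i: (sites[i][0] - x) ** 2 + (sites[i][1] - y) ** 2)
    label = [[nearest(x, y) for x in range(width)] for y in range(height)]
    nbrs = {i: set() for i in range(len(sites))}
    for y in range(height):
        for x in range(width):
            a = label[y][x]
            for dx, dy in ((1, 0), (0, 1)):  # 4-adjacency, forward only
                nx, ny = x + dx, y + dy
                if nx < width and ny < height and label[ny][nx] != a:
                    nbrs[a].add(label[ny][nx])
                    nbrs[label[ny][nx]].add(a)
    return nbrs
```

Three collinear sites illustrate the point: the middle one neighbors both ends, while the two ends never share a Voronoi boundary.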

  • A search technique for pattern recognition using relative distances

    Page(s): 910 - 914

    A technique for creating and searching a tree of patterns using relative distances is presented. The search is conducted to find patterns which are nearest neighbors of a given test pattern. The structure of the tree is such that the search time is proportional to the distance between the test pattern and its nearest neighbor, which suggests the anomalous possibility that a larger tree, which can be expected on average to contain closer neighbors, can be searched faster than a smaller tree. The technique has been used to recognize OCR digit samples derived from NIST data at an accuracy rate of 97%, using a tree of 7,000 patterns.
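
Relative-distance search with triangle-inequality pruning can be illustrated with a Burkhard-Keller tree; a sketch for an integer-valued metric (an illustration of the general idea, not the authors' exact structure):

```python
class BKTree:
    """Burkhard-Keller tree: each child subtree holds exactly the items
    at a given distance from its node, so the triangle inequality can
    prune whole subtrees during nearest-neighbor search."""

    def __init__(self, dist):
        self.dist = dist
        self.root = None  # (item, {distance: child node})

    def add(self, item):
        if self.root is None:
            self.root = (item, {})
            return
        node = self.root
        while True:
            d = self.dist(item, node[0])
            if d in node[1]:
                node = node[1][d]
            else:
                node[1][d] = (item, {})
                return

    def nearest(self, query):
        if self.root is None:
            return None, float("inf")
        best = (float("inf"), None)
        stack = [self.root]
        while stack:
            item, children = stack.pop()
            d = self.dist(query, item)
            if d < best[0]:
                best = (d, item)
            for edge, child in children.items():
                # every item under edge e is exactly e from this node,
                # so its distance to the query is at least |d - e|
                if abs(d - edge) < best[0]:
                    stack.append(child)
        return best[1], best[0]
```

The closer the nearest neighbor, the tighter the pruning bound, which matches the abstract's observation that search time tracks the nearest-neighbor distance.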

  • VLSI architectures for high-speed range estimation

    Page(s): 894 - 899

    Depth recovery from gray-scale images is an important topic in the field of computer and robot vision. Intensity gradient analysis (IGA) is a robust technique for inferring depth information from a sequence of images acquired by a sensor undergoing translational motion. IGA obviates the need for explicitly solving the correspondence problem and hence is an efficient technique for range estimation. Many applications require real-time processing at very high frame rates, and the design of special-purpose hardware can significantly speed up the computations in IGA. In this paper, we propose two VLSI architectures for high-speed range estimation based on IGA. The architectures fully exploit the principles of pipelining and parallelism in order to obtain high speed and throughput. The designs are conceptually simple and suitable for implementation in VLSI.

  • A correlation-relaxation-labeling framework for computing optical flow-template matching from a new perspective

    Page(s): 843 - 853

    Optical flow estimation is discussed based on a model for time-varying images more general than that implied by Horn and Schunck (1981). The emphasis is on applications where low-contrast imagery, movement of nonrigid or evolving object patterns, and large interframe displacements are encountered. Template matching is identified as having advantages over point correspondence and the gradient-based approach in dealing with such applications. The two fundamental uncertainties in feature matching, whether by template matching or feature-point correspondence, are discussed. Correlation template-matching procedures are established based on likelihood measurement. A method for determining optical flow is developed by combining template matching and relaxation labeling. A number of candidate displacements for each template, and their respective likelihood measures, are determined. Then relaxation labeling is employed to iteratively update each candidate's likelihood by requiring smoothness within a motion field. Real cloud images from satellites are used to test the method.
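
The iterative likelihood update can be sketched with the classic Rosenfeld-Hummel-Zucker relaxation-labeling rule, where each candidate displacement's probability is raised by support from compatible candidates at neighboring templates; a toy sketch assuming compatibility coefficients in [-1, 1] (details differ from the paper's formulation):

```python
def relax_step(p, r, neighbors):
    """One Rosenfeld-Hummel-Zucker relaxation update.

    p[i][l]: probability of candidate displacement l at site i
    r[l][m]: compatibility in [-1, 1] between labels l and m
    Support q_i(l) = sum over neighbors j and labels m of r[l][m] * p[j][m];
    then p_i(l) <- p_i(l) * (1 + q_i(l)), renormalized per site.
    """
    n_labels = len(p[0])
    new = []
    for i, pi in enumerate(p):
        q = [sum(r[l][m] * p[j][m]
                 for j in neighbors[i] for m in range(n_labels))
             for l in range(n_labels)]
        raw = [pi[l] * (1.0 + q[l]) for l in range(n_labels)]
        z = sum(raw)
        new.append([v / z for v in raw])
    return new
```

With two neighboring sites whose candidates agree-support (r = +1 for the same displacement, -1 for different ones), iterating the update drives both sites toward the mutually consistent displacement, which is the smoothness behavior described above.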

  • An evaluation of parallel thinning algorithms for character recognition

    Page(s): 914 - 919

    Skeletonization algorithms have played an important role in the preprocessing phase of OCR systems. In this paper we report on the performance of 10 parallel thinning algorithms from this perspective, gathering statistics from their performance on large sets of data and examining the effects of the different thinning algorithms on an OCR system.
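
One classic parallel thinning algorithm of this kind is Zhang and Suen's two-subiteration pass; a compact sketch on a 0/1 raster (shown as a representative example; this listing does not identify which ten algorithms the paper evaluates):

```python
def zhang_suen(img):
    """Zhang-Suen parallel thinning on a 0/1 raster (border assumed 0).
    Each pass marks all deletable pixels, then removes them at once,
    which is what makes the algorithm 'parallel'."""
    img = [row[:] for row in img]
    h, w = len(img), len(img[0])

    def neighbours(y, x):
        # P2..P9, clockwise from the north neighbor
        return [img[y-1][x], img[y-1][x+1], img[y][x+1], img[y+1][x+1],
                img[y+1][x], img[y+1][x-1], img[y][x-1], img[y-1][x-1]]

    changed = True
    while changed:
        changed = False
        for phase in (0, 1):
            to_delete = []
            for y in range(1, h - 1):
                for x in range(1, w - 1):
                    if not img[y][x]:
                        continue
                    n = neighbours(y, x)
                    b = sum(n)  # number of object neighbors
                    # a = number of 0 -> 1 transitions around the pixel
                    a = sum(n[i] == 0 and n[(i + 1) % 8] == 1
                            for i in range(8))
                    if phase == 0:
                        c1 = n[0] * n[2] * n[4] == 0  # P2*P4*P6
                        c2 = n[2] * n[4] * n[6] == 0  # P4*P6*P8
                    else:
                        c1 = n[0] * n[2] * n[6] == 0  # P2*P4*P8
                        c2 = n[0] * n[4] * n[6] == 0  # P2*P6*P8
                    if 2 <= b <= 6 and a == 1 and c1 and c2:
                        to_delete.append((y, x))
            for y, x in to_delete:
                img[y][x] = 0
                changed = True
    return img
```

A one-pixel-wide stroke is already a skeleton and passes through unchanged, while a filled blob is eroded toward a thin skeleton; differences in exactly how each algorithm erodes are what the paper's statistics measure.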


Aims & Scope

The IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI) is published monthly. Its editorial board strives to present the most important research results in areas within TPAMI's scope.

Meet Our Editors

Editor-in-Chief
David A. Forsyth
University of Illinois