Pattern Analysis and Machine Intelligence, IEEE Transactions on

Issue 11 • Date Nov 2002

  • Lexicon-driven segmentation and recognition of handwritten character strings for Japanese address reading

    Publication Year: 2002 , Page(s): 1425 - 1437
    Cited by:  Papers (44)  |  Patents (1)

    This paper describes a handwritten character string recognition system for Japanese mail address reading on a very large vocabulary. The address phrases are recognized as a whole because there is no extra space between words. The lexicon contains 111,349 address phrases, which are stored in a trie structure. In recognition, the text line image is matched with the lexicon entries (phrases) to obtain reliable segmentation and retrieve valid address phrases. The paper first introduces some effective techniques for text line image preprocessing and presegmentation. In presegmentation, the text line image is separated into primitive segments by connected component analysis and touching pattern splitting based on contour shape analysis. In lexicon matching, consecutive segments are dynamically combined into candidate character patterns. An accurate character classifier is embedded in lexicon matching to select characters matched with a candidate pattern from a dynamic category set. A beam search strategy is used to control the lexicon matching so as to achieve real-time recognition. In experiments on 3,589 live mail images, the proposed method achieved a correct rate of 83.68 percent while the error rate was less than 1 percent.
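
The trie-structured lexicon mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the class names, the three sample phrases, and the reading of the "dynamic category set" as "the characters that may follow the prefix matched so far" are all assumptions made for illustration.

```python
class TrieNode:
    """One node of the lexicon trie (hypothetical helper, not from the paper)."""

    def __init__(self):
        self.children = {}          # character -> TrieNode
        self.is_phrase_end = False  # True if a full phrase ends at this node


class PhraseTrie:
    """Stores whole address phrases character by character."""

    def __init__(self, phrases=()):
        self.root = TrieNode()
        for phrase in phrases:
            self.insert(phrase)

    def insert(self, phrase):
        node = self.root
        for ch in phrase:
            node = node.children.setdefault(ch, TrieNode())
        node.is_phrase_end = True

    def _walk(self, text):
        """Follow text from the root; return the node reached, or None."""
        node = self.root
        for ch in text:
            node = node.children.get(ch)
            if node is None:
                return None
        return node

    def next_characters(self, prefix):
        """Characters that can extend prefix toward some stored phrase."""
        node = self._walk(prefix)
        return set(node.children) if node is not None else set()

    def contains(self, phrase):
        node = self._walk(phrase)
        return node is not None and node.is_phrase_end


# Three made-up sample phrases; the real lexicon holds 111,349 entries.
lexicon = PhraseTrie(["東京都港区", "東京都北区", "京都市北区"])
candidates = lexicon.next_characters("東京都")
```

Restricting a classifier to `candidates` at each step keeps the number of categories small, which is one plausible reason a trie suits lexicon-driven matching.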

  • Automatic recognition of handwritten numerical strings: a recognition and verification strategy

    Publication Year: 2002 , Page(s): 1438 - 1454
    Cited by:  Papers (62)

    A modular system to recognize handwritten numerical strings is proposed. It uses a segmentation-based recognition approach and a recognition and verification strategy. The approach combines the outputs from different levels such as segmentation, recognition, and postprocessing in a probabilistic model. A new verification scheme which contains two verifiers to deal with the problems of oversegmentation and undersegmentation is presented. A new feature set is also introduced to feed the oversegmentation verifier. A postprocessor based on a deterministic automaton is used and the global decision module makes an accept/reject decision. Finally, experimental results on two databases are presented: numerical amounts on Brazilian bank checks and NIST SD19. The latter aims at validating the concept of a modular system and showing the robustness of the system using a well-known database.

  • Matching free trees, maximal cliques, and monotone game dynamics

    Publication Year: 2002 , Page(s): 1535 - 1541
    Cited by:  Papers (9)

    Motivated by our recent work on rooted tree matching, in this paper we provide a solution to the problem of matching two free (i.e., unrooted) trees by constructing an association graph whose maximal cliques are in one-to-one correspondence with maximal common subtrees. We then solve the problem using simple payoff-monotonic dynamics from evolutionary game theory. We illustrate the power of the approach by matching articulated and deformed shapes described by shape-axis trees. Experiments on hundreds of larger, uniformly random trees are also presented. The results are impressive: despite the inherent inability of these simple dynamics to escape from local optima, they always returned a globally optimal solution.
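
As a rough illustration of how game dynamics can locate a clique, the sketch below runs discrete-time replicator dynamics on a small graph. It assumes the replicator equation as one example of a payoff-monotonic dynamic and uses a made-up four-vertex graph rather than an association graph built from two trees; the function name and parameters are hypothetical.

```python
def replicator_clique(adjacency, iterations=200, support_threshold=1e-3):
    """Run discrete-time replicator dynamics from the uniform distribution.

    adjacency is a symmetric 0/1 matrix with a zero diagonal. The vertices
    whose weight stays above support_threshold are returned; for this kind
    of dynamic they are expected to form a clique, though only a locally
    optimal one in general.
    """
    n = len(adjacency)
    x = [1.0 / n] * n
    for _ in range(iterations):
        # Payoff of vertex i is the total weight on its neighbours.
        payoff = [sum(adjacency[i][j] * x[j] for j in range(n)) for i in range(n)]
        mean_payoff = sum(x[i] * payoff[i] for i in range(n))
        if mean_payoff == 0:
            break
        # Vertices with above-average payoff gain weight, others lose it.
        x = [x[i] * payoff[i] / mean_payoff for i in range(n)]
    return {i for i in range(n) if x[i] > support_threshold}


# Toy graph: vertices 0, 1, 2 form a triangle; vertex 3 hangs off vertex 2.
toy_graph = [
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
]
clique = replicator_clique(toy_graph)
```

On this toy graph the weight of the pendant vertex shrinks at every step, so the surviving support is the triangle.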

  • Disclaimer: "Relative fuzzy connectedness and object definition: theory, algorithms, and applications in image segmentation"

    Publication Year: 2002 , Page(s): I - 1500
    Cited by:  Papers (45)  |  Patents (1)

    The notion of fuzzy connectedness captures the idea of "hanging-togetherness" of image elements in an object by assigning a strength of connectedness to every possible path between every possible pair of image elements. This concept leads to powerful image segmentation algorithms based on dynamic programming whose effectiveness has been demonstrated on 1,000s of images in a variety of applications. In the previous framework, a fuzzy connected object is defined with a threshold on the strength of connectedness. We introduce the notion of relative connectedness that overcomes the need for a threshold and that leads to more effective segmentations. The central idea is that an object gets defined in an image because of the presence of other co-objects. Each object is initialized by a seed element. An image element c is considered to belong to that object with respect to whose reference image element c has the highest strength of connectedness. In this fashion, objects compete among each other utilizing fuzzy connectedness to grab membership of image elements. We present a theoretical and algorithmic framework for defining objects via relative connectedness and demonstrate utilizing the theory that the objects defined are independent of reference elements chosen as long as they are not in the fuzzy boundary between objects. An iterative strategy is also introduced wherein the strongest relative connected core parts are first defined and iteratively relaxed to conservatively capture the more fuzzy parts subsequently. Examples from medical imaging are presented to illustrate visually the effectiveness of relative fuzzy connectedness. A quantitative mathematical phantom study involving 160 images is conducted to demonstrate objectively the effectiveness of relative fuzzy connectedness.

    Disclaimer

    A claim of priority in research and publication appeared on page 1486 in the paper "Relative Fuzzy Connectedness and Object Definition: Theory, Algorithms, and Applications in Image Segmentation" by J.K. Udupa, P.K. Saha, and R.A. Lotufo (IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 24, no. 11, pp. 1485-1500, Nov. 2002) with respect to the paper "Multiseeded Segmentation Using Fuzzy Connectedness" by G.T. Herman and B.M. Carvalho (IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 23, no. 5, pp. 460-474, May 2001). Furthermore, the wording of this claim suggests professional misconduct on the part of G.T. Herman and B.M. Carvalho.

    Responsibility for the content of published papers rests with the authors. The peer review process is intended to determine the overall significance of the technical contribution of a manuscript. The peer review process does not provide a way to validate every statement in a manuscript. In particular, the IEEE has not validated the claim referred to in the first paragraph above. The IEEE regrets publishing this unauthenticated statement and the pain that such publication may have caused to G.T. Herman and B.M. Carvalho.

  • A frequency domain technique for range data registration

    Publication Year: 2002 , Page(s): 1468 - 1484
    Cited by:  Papers (39)  |  Patents (2)

    This work introduces an original method for registering pairs of 3D views consisting of range data sets which operates in the frequency domain. The Fourier transform allows the decoupling of the estimate of the rotation parameters from the estimate of the translation parameters; our algorithm exploits this well-known property by suggesting a three-step procedure. The rotation parameters are estimated by the first two steps through convenient representations and projections of the Fourier transforms' magnitudes, and the translational displacement is recovered by the third step by means of a standard phase correlation technique after compensating one of the two views for rotation. The performance of the algorithm, which is well-suited for unsupervised registration, is assessed through extensive testing with several objects and shows that good and robust estimates of 3D rigid motion are achievable. Our algorithm can be used as a prealignment tool for more accurate space-domain registration techniques, like the ICP algorithm.
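
The phase correlation step named in the abstract can be shown in one dimension. The sketch below is a toy under stated assumptions: it recovers a circular shift between two short 1D signals with a naive DFT, whereas the paper works on 3D range data after rotation compensation. The function names and the sample signal are made up.

```python
import cmath


def dft(signal):
    """Naive O(n^2) discrete Fourier transform."""
    n = len(signal)
    return [
        sum(signal[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def inverse_dft(spectrum):
    """Naive O(n^2) inverse discrete Fourier transform."""
    n = len(spectrum)
    return [
        sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]


def phase_correlation_shift(reference, shifted):
    """Estimate s such that shifted[t] == reference[(t - s) % n]."""
    ref_spectrum = dft(reference)
    shifted_spectrum = dft(shifted)
    normalized = []
    for f, g in zip(ref_spectrum, shifted_spectrum):
        product = g * f.conjugate()
        magnitude = abs(product)
        # Keep only the phase; drop frequencies that carry no energy.
        normalized.append(product / magnitude if magnitude > 1e-12 else 0)
    correlation = inverse_dft(normalized)
    # A pure shift turns the correlation into a single peak at index s.
    return max(range(len(correlation)), key=lambda t: correlation[t].real)


reference = [0, 1, 3, 7, 2, 0, 0, 1]
shifted = reference[-3:] + reference[:-3]  # circular shift by 3 samples
estimated_shift = phase_correlation_shift(reference, shifted)
```

Because only the phase of the cross-spectrum is kept, the estimate does not depend on the signal magnitudes, which is why translation can be handled separately from the magnitude-based rotation steps.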

  • On the use of SDF-type filters for distortion parameter estimation

    Publication Year: 2002 , Page(s): 1521 - 1528

    Synthetic discriminant functions have been used to locate objects irrespective of distortions and to estimate the extent of the distortion. It was recognized from the beginning that accurate estimates are only possible provided the training set is constructed carefully. We obtain conditions that will ensure the accuracy of the estimates. The conditions also suggest efficient ways of constructing the training sets and the results are extended to a wide class of SDF-type filters. The theoretical results are illustrated with (idealized) examples and are also applied to the more realistic problem of accurate facial location.

  • Support vector machines for texture classification

    Publication Year: 2002 , Page(s): 1542 - 1550
    Cited by:  Papers (97)  |  Patents (1)

    This paper investigates the application of support vector machines (SVMs) in texture classification. Instead of relying on an external feature extractor, the SVM receives the gray-level values of the raw pixels, as SVMs can generalize well even in high-dimensional spaces. Furthermore, it is shown that SVMs can incorporate conventional texture feature extraction methods within their own architecture, while also providing solutions to problems inherent in these methods. One-against-others decomposition is adopted to apply binary SVMs to multitexture classification, and a neural network is used as an arbitrator to make final classifications from several one-against-others SVM outputs. Experimental results demonstrate the effectiveness of SVMs in texture classification.

  • Matching and retrieval of distorted and occluded shapes using dynamic programming

    Publication Year: 2002 , Page(s): 1501 - 1516
    Cited by:  Papers (70)

    We propose an approach for matching distorted and possibly occluded shapes using dynamic programming (DP). We distinguish among various cases of matching such as cases where the shapes are scaled with respect to each other and cases where an open shape matches the whole or only a part of another open or closed shape. Our algorithm treats noise and shape distortions by allowing matching of merged sequences of consecutive small segments in a shape with larger segments of another shape, while being invariant to translation, scale, orientation, and starting point selection. We illustrate the effectiveness of our algorithm in retrieval of shapes on two data sets of two-dimensional open and closed shapes of marine life species. We demonstrate the superiority of our approach over traditional approaches to shape matching and retrieval based on Fourier descriptors and moments. We also compare our method with SQUID, a well-known method which is available on the Internet. Our evaluation is based on human relevance judgments following a well-established methodology from the information retrieval field.

  • Direct recovery of planar-parallax from multiple frames

    Publication Year: 2002 , Page(s): 1528 - 1534
    Cited by:  Papers (18)  |  Patents (1)

    We present an algorithm that estimates dense planar-parallax motion from multiple uncalibrated views of a 3D scene. This generalizes the "plane+parallax" recovery methods to more than two frames. The parallax motion of pixels across multiple frames (relative to a planar surface) is related to the 3D scene structure and the camera epipoles. The parallax field, the epipoles, and the 3D scene structure are estimated directly from image brightness variations across multiple frames, without precomputing correspondences.

  • Spatio-temporal alignment of sequences

    Publication Year: 2002 , Page(s): 1409 - 1424
    Cited by:  Papers (43)  |  Patents (2)

    This paper studies the problem of sequence-to-sequence alignment, namely, establishing correspondences in time and in space between two different video sequences of the same dynamic scene. The sequences are recorded by uncalibrated video cameras which are either stationary or jointly moving, with fixed (but unknown) internal parameters and relative intercamera external parameters. Temporal variations between image frames (such as moving objects or changes in scene illumination) are powerful cues for alignment, which cannot be exploited by standard image-to-image alignment techniques. We show that, by folding spatial and temporal cues into a single alignment framework, situations which are inherently ambiguous for traditional image-to-image alignment methods are often uniquely resolved by sequence-to-sequence alignment. Furthermore, the ability to align and integrate information across multiple video sequences both in time and in space gives rise to new video applications that are not possible when only image-to-image alignment is used.

  • Recognizing mathematical expressions using tree transformation

    Publication Year: 2002 , Page(s): 1455 - 1467
    Cited by:  Papers (51)  |  Patents (3)

    We describe a robust and efficient system for recognizing typeset and handwritten mathematical notation. From a list of symbols with bounding boxes, the system analyzes an expression in three successive passes. The Layout Pass constructs a Baseline Structure Tree (BST) describing the two-dimensional arrangement of input symbols. Reading order and operator dominance are used to allow efficient recognition of symbol layout even when symbols deviate greatly from their ideal positions. Next, the Lexical Pass produces a Lexed BST from the initial BST by grouping tokens composed of multiple input symbols; these include decimal numbers, function names, and symbols composed of nonoverlapping primitives such as "=". The Lexical Pass also labels vertical structures such as fractions and accents. The Lexed BST is translated into LaTeX. Additional processing, necessary for producing output for symbolic algebra systems, is carried out in the Expression Analysis Pass. The Lexed BST is translated into an Operator Tree, which describes the order and scope of operations in the input expression. The tree manipulations used in each pass are represented compactly using tree transformations. The compiler-like architecture of the system allows robust handling of unexpected input, increases the scalability of the system, and provides the groundwork for handling dialects of mathematical notation.

  • Approximate Bayes factors for image segmentation: the Pseudolikelihood Information Criterion (PLIC)

    Publication Year: 2002 , Page(s): 1517 - 1520
    Cited by:  Papers (18)

    We propose a method for choosing the number of colors or true gray levels in an image; this allows fully automatic segmentation of images. Our underlying probability model is a hidden Markov random field. Each number of colors considered is viewed as corresponding to a statistical model for the image, and the resulting models are compared via approximate Bayes factors. The Bayes factors are approximated using BIC (Bayesian Information Criterion), where the required maximized likelihood is approximated by the Qian-Titterington (1991) pseudolikelihood. We call the resulting criterion PLIC (Pseudolikelihood Information Criterion). We also discuss a simpler approximation, MMIC (Marginal Mixture Information Criterion), which is based only on the marginal distribution of pixel values. This turns out to be useful for initialization and it also has moderately good performance by itself when the amount of spatial dependence in an image is low. We apply PLIC and MMIC to a medical image segmentation problem.
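
The model comparison described above can be sketched as a BIC-style score computed for each candidate number of colors. The sketch assumes the convention in which the score is twice the log-likelihood minus a penalty of (number of parameters) times log(number of pixels), with larger scores preferred; the abstract does not state the formula. For a PLIC-like criterion the log-likelihood argument would be a log pseudolikelihood. The function names and all numeric values are made up.

```python
import math


def information_criterion(log_likelihood, num_parameters, num_observations):
    """BIC-style score under the assumed 'larger is better' convention."""
    return 2.0 * log_likelihood - num_parameters * math.log(num_observations)


def select_model(candidates, num_observations):
    """Pick the candidate with the highest score.

    candidates maps a model label (here, a number of colors) to a pair
    (maximized log-likelihood or log pseudolikelihood, parameter count).
    """
    scores = {
        label: information_criterion(log_lik, num_params, num_observations)
        for label, (log_lik, num_params) in candidates.items()
    }
    best_label = max(scores, key=scores.get)
    return best_label, scores


# Hypothetical fits for 2, 3, and 4 colors on a 100 x 100 pixel image.
hypothetical_fits = {
    2: (-5200.0, 5),
    3: (-5000.0, 8),
    4: (-4995.0, 11),
}
best_k, scores = select_model(hypothetical_fits, num_observations=100 * 100)
```

With these made-up numbers the move from 3 to 4 colors improves the fit too little to pay for its extra parameters, so 3 is selected.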


Aims & Scope

The IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI) is published monthly. Its editorial board strives to present the most important research results in areas within TPAMI's scope.

Meet Our Editors

Editor-in-Chief
David A. Forsyth
University of Illinois