Using n-grams for the definition of a training set for cursive handwriting recognition


1 Author(s)
V. Pflug; Siemens AG, Germany

The use of n-grams to select a minimal set of words from a lexicon as training words for a handwriting recognizer is presented. The test words selected should cover all, or at least most, of the graphemes in the section of the language considered. When quadgrams are considered, the algorithm reduces the number of test words by up to 70% of the original lexicon size. A further reduction is achieved by neglecting rare n-grams: the reduction then reaches up to 80% for quadgrams, so only 20% of the words in the original lexicon have to be trained. Another aspect that may be considered when building the n-grams is that, in natural handwriting, a word ending is usually less carefully written than the beginning of a word. Therefore, n-grams should be longer at the end than at the beginning of a word.
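The abstract does not spell out the selection algorithm, so the following is only a minimal sketch of the general idea, assuming a greedy covering strategy: repeatedly pick the word that covers the most still-uncovered n-grams, and ignore n-grams that occur in fewer than a threshold number of lexicon words. The names `ngrams`, `select_training_words`, and `min_count` are hypothetical and do not come from the paper. The sketch also uses fixed-length n-grams and does not model the longer word-end n-grams mentioned above.

```python
from collections import Counter


def ngrams(word, n):
    """Return the set of character n-grams of a word (the whole word if it is shorter than n)."""
    if len(word) < n:
        return {word}
    return {word[i:i + n] for i in range(len(word) - n + 1)}


def select_training_words(lexicon, n=4, min_count=1):
    """Greedily choose words until every sufficiently frequent n-gram is covered.

    Assumption: greedy set cover stands in for the paper's (unspecified) algorithm.
    min_count is the minimum number of lexicon words an n-gram must occur in
    to be required; rarer n-grams are neglected.
    """
    words = sorted(set(lexicon))  # deduplicate; sorting makes tie-breaking deterministic
    candidates = {w: ngrams(w, n) for w in words}

    # Count in how many distinct words each n-gram occurs.
    counts = Counter(g for grams in candidates.values() for g in grams)
    uncovered = {g for g, c in counts.items() if c >= min_count}

    selected = []
    while uncovered:
        # Pick the word covering the most uncovered n-grams (first in sorted order on ties).
        best = max(sorted(candidates), key=lambda w: len(candidates[w] & uncovered))
        gain = candidates[best] & uncovered
        if not gain:
            break  # safety guard; should not happen while uncovered is non-empty
        selected.append(best)
        uncovered -= gain
        del candidates[best]
    return selected


if __name__ == "__main__":
    toy_lexicon = ["hand", "handle", "candle", "and"]
    print(select_training_words(toy_lexicon, n=3))
    print(select_training_words(toy_lexicon, n=3, min_count=2))
```

On this toy lexicon with trigrams, full coverage needs two of the four words, while neglecting trigrams that occur in only one word leaves a single word. This illustrates the kind of reduction the abstract reports, although the percentages there come from the author's own lexicon and method.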

Published in:

Proceedings of the Second International Conference on Document Analysis and Recognition, 1993

Date of Conference:

20-22 Oct 1993