A system for recognizing Vietnamese document images based on HMM and linguistics

6 Author(s)
Vu Hai Quan ; Fac. of Inf. Technol, Univ. of Natural Sci, Ho Chi Minh, Viet Nam ; Hoang Kiem ; Pham Nam Trung ; Lam Tri Tin

The authors present a system for recognizing Vietnamese document images and propose a method to increase its accuracy. Based on features of the Vietnamese language, the method minimizes the number of characters and integrates spell-checking into the recognition process. The authors also explain how to combine hidden Markov models (HMMs) with this method in recognition systems. Finally, based on statistical models of word frequency, a Vietnamese word-frequency dictionary was built to predict the next words to be recognized and to aid in post-processing. The performance of the proposed approach was evaluated on Vietnamese literature from 1990 to 1997, with a total of 3,469,518 words (about 16,866,511 characters). Experimental results show that the method was effective.
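
The abstract does not specify how the word-frequency dictionary is used, so the following is only a minimal sketch of the general idea, assuming a simple bigram count table with a unigram fallback. The function names, the toy corpus, and the scoring rule are all hypothetical illustrations and are not taken from the paper.

```python
from collections import Counter, defaultdict


def build_frequency_dictionary(sentences):
    """Count single words and (previous word -> next word) pairs.

    `sentences` is an iterable of token lists. This is an assumed,
    simplified stand-in for the paper's word-frequency dictionary.
    """
    unigrams = Counter()
    bigrams = defaultdict(Counter)
    for words in sentences:
        unigrams.update(words)
        for prev, nxt in zip(words, words[1:]):
            bigrams[prev][nxt] += 1
    return unigrams, bigrams


def pick_candidate(prev_word, candidates, unigrams, bigrams):
    """Pick the recognizer candidate seen most often after `prev_word`.

    Ties on the bigram count are broken by overall word frequency.
    """
    following = bigrams.get(prev_word, Counter())

    def score(word):
        return (following[word], unigrams[word])

    return max(candidates, key=score)


# Toy corpus (hypothetical); a real dictionary would be built from a large text collection.
corpus = [
    ["tôi", "đi", "học"],
    ["tôi", "đi", "làm"],
    ["anh", "đi", "học"],
]
unigrams, bigrams = build_frequency_dictionary(corpus)

# Suppose the recognizer is unsure about the diacritics of the word after "đi".
best = pick_candidate("đi", ["hoc", "hộc", "học"], unigrams, bigrams)
print(best)
```

In this sketch the dictionary only reranks candidates that the recognizer has already produced, which matches the post-processing role mentioned in the abstract; how the authors integrate prediction into HMM decoding is not stated here.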

Published in:

Document Analysis and Recognition, 2001. Proceedings. Sixth International Conference on

Date of Conference: