Abstract:
The objective of OCR post-processing is to correct errors in the OCR output. A spell checker is an important tool for detecting and correcting misspelled words. This paper proposes a statistical method that finds unexpectedly frequent character sequences without relying on a dictionary, which makes it a flexible way to detect out-of-vocabulary words. The corpus used to create the 3-grams belongs to NECTEC (National Electronics and Computer Technology Center). The resulting 3-grams are selected for use as a spelling checker for Thai documents. ArnThai, an OCR software package, is used to evaluate the proposed technique.
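The general idea of dictionary-free checking with character 3-grams can be illustrated with a minimal Python sketch. It is not the paper's implementation: the function names, the `min_count` threshold, and the rule that a 3-gram is suspicious when it is absent from the selected set are all assumptions made for illustration, and the abstract does not state how the 3-grams are actually selected. Because the sketch works on raw characters, it needs no word segmentation, which suits a script such as Thai that is written without spaces between words.

```python
from collections import Counter


def char_trigrams(text):
    """Return every overlapping 3-character sequence in text."""
    return [text[i:i + 3] for i in range(len(text) - 2)]


def build_trigram_set(corpus_lines, min_count=2):
    """Keep the 3-grams seen at least min_count times in the corpus.

    min_count is a hypothetical selection threshold, not a value
    taken from the paper.
    """
    counts = Counter()
    for line in corpus_lines:
        counts.update(char_trigrams(line))
    return {gram for gram, n in counts.items() if n >= min_count}


def suspicious_spans(text, known):
    """Return (offset, trigram) pairs for 3-grams not in the known set.

    Assumed decision rule: an unseen 3-gram marks a likely OCR error.
    """
    return [(i, gram) for i, gram in enumerate(char_trigrams(text))
            if gram not in known]
```

For example, after building the set from a training corpus, `suspicious_spans(ocr_output, known)` returns the character offsets where the OCR output contains a 3-gram that the corpus does not support. A post-processor could then propose corrections around those offsets.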
Published in: TENCON 2005 - 2005 IEEE Region 10 Conference
Date of Conference: 21-24 November 2005
Date Added to IEEE Xplore: 05 February 2007