In this paper, we present an approach for separating digits from non-digits in order to extract numerical strings from Farsi/Arabic handwritten or machine-printed document images. Each connected component is labeled according to whether or not it belongs to a numerical string. For this purpose, we introduce a set of features that, first, are based on the maximum difference between digits and non-digits in Farsi and, second, are much less complex and much faster to extract than the features used for connected component recognition. A fuzzy rule-based classifier is used to classify the features. Experimental results show an acceptable detection rate with a low false positive rate.
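As a rough illustration of how a fuzzy rule-based classifier can label a connected component as digit or non-digit, the sketch below may help. It is not the authors' method: the abstract does not name the features or the rules, so the two features (bounding-box aspect ratio and height relative to the text line), the triangular membership functions, their breakpoints, and every name in the code are hypothetical choices made only for this example.

```python
from dataclasses import dataclass


@dataclass
class Component:
    """Bounding box of one connected component (hypothetical structure)."""
    width: int   # assumed > 0
    height: int  # assumed > 0


def triangular(x, a, b, c):
    """Triangular membership: 0 outside (a, c), rising to 1 at b."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


def fuzzy_scores(comp, line_height):
    """Return (digit, non_digit) rule strengths for one component.

    The features and breakpoints are illustrative assumptions,
    not values taken from the paper.
    """
    aspect = comp.width / comp.height
    rel_height = comp.height / line_height

    narrow = triangular(aspect, 0.2, 0.6, 1.2)
    wide = triangular(aspect, 1.0, 2.5, 6.0)
    medium_tall = triangular(rel_height, 0.4, 0.7, 1.0)

    # Rule 1: IF shape is narrow AND height is medium THEN digit.
    # The fuzzy AND is taken as the minimum of the memberships.
    digit = min(narrow, medium_tall)
    # Rule 2: IF shape is wide THEN non-digit.
    non_digit = wide
    return digit, non_digit


def is_digit(comp, line_height):
    """Label the component by the stronger of the two rule outputs."""
    digit, non_digit = fuzzy_scores(comp, line_height)
    return digit > non_digit


if __name__ == "__main__":
    print(is_digit(Component(width=12, height=20), line_height=30))
    print(is_digit(Component(width=60, height=20), line_height=30))
```

Calling `is_digit` on each component of a text line would give the per-component labels; runs of adjacent components labeled as digits could then be grouped into candidate numerical strings. In practice the membership functions would need to be tuned on labeled data.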
Computational Intelligence and Industrial Applications, 2009. PACIIA 2009. Asia-Pacific Conference on (Volume: 1)
Date of Conference: 28-29 Nov. 2009