India is a multi-lingual and multi-script country, with eighteen official scripts and over a hundred regional languages. In this paper we propose a zone-based hybrid feature extraction system. The character centroid is computed, and the image (character/numeral) is divided into n equal zones. For each zone, the average angle from the character centroid to the pixels present in the zone is computed (one feature). The zone centroid is also computed (two features), as is the average angle from the zone centroid to the pixels present in the zone (one feature). This procedure is repeated sequentially for all the zones (grids/boxes) in the numeral image. If a zone is empty, its values in the feature vector are set to zero. In total, 4*n features are extracted. Nearest neighbor and support vector machine classifiers are used for subsequent classification and recognition. Using the support vector machine, we obtained recognition rates of 97.85%, 96.8%, 95.1% and 95% for Kannada, Telugu, Tamil and Malayalam numerals, respectively.
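The per-zone procedure described above can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several details are assumptions the abstract does not specify: the image is a binary grid in which nonzero pixels are foreground, the n zones form a rectangular grid of `grid_rows` by `grid_cols` cells, the two zone-centroid features are its row and column coordinates, angles are measured with `atan2` in radians, and "average angle" is a plain arithmetic mean. The function name `zone_features` and its parameters are hypothetical.

```python
import math


def zone_features(img, grid_rows, grid_cols):
    """Return 4 * grid_rows * grid_cols features for a binary image.

    img is a list of equal-length rows; nonzero entries are treated as
    foreground (an assumption, not stated in the abstract).
    """
    h, w = len(img), len(img[0])
    pixels = [(y, x) for y in range(h) for x in range(w) if img[y][x]]
    n_zones = grid_rows * grid_cols
    if not pixels:
        return [0.0] * (4 * n_zones)

    # Character centroid: mean position of all foreground pixels.
    cy = sum(y for y, _ in pixels) / len(pixels)
    cx = sum(x for _, x in pixels) / len(pixels)

    def mean_angle(points, oy, ox):
        # Arithmetic mean of atan2 angles from origin (oy, ox) to each point.
        return sum(math.atan2(y - oy, x - ox) for y, x in points) / len(points)

    feats = []
    for r in range(grid_rows):
        for c in range(grid_cols):
            # Zone bounds: integer split of the image into a regular grid.
            r0, r1 = r * h // grid_rows, (r + 1) * h // grid_rows
            c0, c1 = c * w // grid_cols, (c + 1) * w // grid_cols
            zone = [(y, x) for y, x in pixels if r0 <= y < r1 and c0 <= x < c1]
            if not zone:
                feats.extend([0.0] * 4)  # empty zone contributes zeros
                continue
            # Zone centroid: mean position of the zone's foreground pixels.
            zcy = sum(y for y, _ in zone) / len(zone)
            zcx = sum(x for _, x in zone) / len(zone)
            feats.extend([
                mean_angle(zone, cy, cx),    # angle from character centroid
                zcy, zcx,                    # zone centroid (two features)
                mean_angle(zone, zcy, zcx),  # angle from zone centroid
            ])
    return feats


# Usage: an 8x8 image split into 2x2 zones gives 4 * 4 = 16 features.
img = [[0] * 8 for _ in range(8)]
img[1][1] = img[1][2] = 1
features = zone_features(img, 2, 2)
```

The resulting vector would then be passed to a classifier such as nearest neighbor or a support vector machine; that step is omitted here because the abstract gives no classifier settings.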