Skip to Main Content
In this paper, we present a methodology for off-line character recognition that mainly focuses on handling the difficult cases of historical fonts and styles. The proposed methodology relies on a new feature extraction technique based on recursive subdivisions of the image as well as on calculation of the centre of masses of each sub-image with sub-pixel accuracy. Feature extraction is followed by a hierarchical classification scheme based on the level of granularity of the feature extraction method. Pairs of classes with high values in the confusion matrix are merged at a certain level and higher level granularity features are employed for distinguishing them. Several historical documents were used in order to demonstrate the efficiency of the proposed technique.
Date of Conference: 26-29 July 2009