This paper describes a complete system for recognizing unconstrained handwritten Arabic words using over-segmentation of characters and a variable duration hidden Markov model (VDHMM). First, a segmentation algorithm translates the 2-D word image into a 1-D sequence of sub-character symbols, which is then modeled by the VDHMM. The shape information of character and sub-character symbols is compactly represented by forty-five features. The feature vector is modeled as an independently distributed multivariate discrete distribution. Linguistic knowledge about character transitions is modeled as a Markov chain in which each character of the alphabet is a state and the bigram probabilities are the state transition probabilities. In this context, the variable duration state is used to resolve the segmentation ambiguity among consecutive characters. Detailed experimental results on handwritten Arabic data from two different sources are described to demonstrate the effectiveness of the proposed scheme.
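To illustrate how a variable duration state can absorb segmentation ambiguity, the following is a minimal sketch of Viterbi decoding in which each character state may span one or more consecutive over-segmented pieces. It is not the system described in this paper: the function name `vdhmm_viterbi`, the callable parameters, and the two-letter toy alphabet with its probabilities are all hypothetical, and the observation score is assumed to be supplied directly for each candidate span rather than computed from the forty-five features.

```python
import math

def vdhmm_viterbi(n_segments, states, log_init, log_trans, log_dur, log_obs, max_dur):
    """Best character sequence over n_segments over-segmented pieces.

    Each state (character) may cover 1..max_dur consecutive pieces.
    All log_* arguments are hypothetical callables returning log-probabilities:
      log_init(s), log_trans(prev, s), log_dur(s, d), log_obs(s, start, end).
    Returns (best_log_score, [(state, start, end), ...]) with end exclusive.
    """
    neg_inf = -math.inf
    # best[end][s]: best log score of a path whose last character s ends at `end`.
    best = [{s: neg_inf for s in states} for _ in range(n_segments + 1)]
    back = [{s: None for s in states} for _ in range(n_segments + 1)]

    for end in range(1, n_segments + 1):
        for s in states:
            for d in range(1, min(max_dur, end) + 1):
                start = end - d
                # Duration term plus score of pieces start..end-1 merged into s.
                local = log_dur(s, d) + log_obs(s, start, end)
                if start == 0:
                    candidates = [(log_init(s) + local, None)]
                else:
                    # Bigram transition from every possible previous character.
                    candidates = [(best[start][p] + log_trans(p, s) + local, p)
                                  for p in states]
                for score, prev in candidates:
                    if score > best[end][s]:
                        best[end][s] = score
                        back[end][s] = (prev, start)

    last = max(states, key=lambda s: best[n_segments][s])
    score = best[n_segments][last]
    if score == neg_inf:
        return score, []

    # Backtrack through the stored (previous state, span start) pointers.
    path, end, s = [], n_segments, last
    while end > 0:
        prev, start = back[end][s]
        path.append((s, start, end))
        end, s = start, prev
    path.reverse()
    return score, path

# Toy example (all numbers are made up for illustration).
STATES = ["a", "b"]
INIT = {"a": 0.5, "b": 0.5}
TRANS = {("a", "a"): 0.1, ("a", "b"): 0.9, ("b", "a"): 0.9, ("b", "b"): 0.1}
DUR = {("a", 1): 0.2, ("a", 2): 0.8, ("b", 1): 0.8, ("b", 2): 0.2}
# Spans that look like a real character get a high score; all others get 0.05.
OBS = {("a", 0, 2): 0.9, ("b", 2, 3): 0.9}

score, path = vdhmm_viterbi(
    3, STATES,
    log_init=lambda s: math.log(INIT[s]),
    log_trans=lambda p, s: math.log(TRANS[(p, s)]),
    log_dur=lambda s, d: math.log(DUR[(s, d)]),
    log_obs=lambda s, start, end: math.log(OBS.get((s, start, end), 0.05)),
    max_dur=2,
)
print(path)
```

In this toy setting the decoder chooses jointly where the character boundaries fall and which characters are present, so a character that the segmenter split into two pieces can still be recovered as a single state with duration two.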