The authors describe a communication theory approach to document image reconstruction, patterned after the use of hidden Markov models in speech recognition. A document recognition problem is viewed as consisting of three elements: an image generator, a noisy channel, and an image decoder. A document image generator is a Markov source which combines a message source with an imager. The message source produces a string of symbols which contains the information to be transmitted. The imager is modeled as a finite-state transducer, which converts the message into an ideal bitmap. The channel transforms the ideal image into a noisy observed image. The decoder estimates the message from the observed image by finding the a posteriori most probable path through the combined source and channel models using a Viterbi-like algorithm. Application of the proposed method to decoding telephone yellow pages is described.
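As an illustration of the decoding step only, the following is a minimal sketch of the standard Viterbi algorithm on a toy discrete hidden Markov model. It is not the authors' image model: the two states (`ink`, `blank`), the binary observations, and all probabilities are hypothetical, chosen to show how a most probable path is recovered from noisy observations.

```python
import math


def viterbi(observations, states, log_start, log_trans, log_emit):
    """Return (best_log_prob, best_path) for a discrete HMM, in the log domain."""
    # best[s] is the log-probability of the best path that ends in state s.
    best = {s: log_start[s] + log_emit[s][observations[0]] for s in states}
    backpointers = []
    for obs in observations[1:]:
        new_best, pointers = {}, {}
        for s in states:
            # Pick the predecessor that maximizes the path score into s.
            prev = max(states, key=lambda p: best[p] + log_trans[p][s])
            new_best[s] = best[prev] + log_trans[prev][s] + log_emit[s][obs]
            pointers[s] = prev
        best = new_best
        backpointers.append(pointers)
    # Trace back from the best final state.
    last = max(states, key=lambda s: best[s])
    path = [last]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return best[last], path


def logs(table):
    """Convert a dict of probabilities to a dict of log-probabilities."""
    return {k: math.log(v) for k, v in table.items()}


# Hypothetical toy model: the source tends to stay in its current state,
# and the channel flips an observed bit with probability 0.1.
states = ["ink", "blank"]
log_start = logs({"ink": 0.5, "blank": 0.5})
log_trans = {
    "ink": logs({"ink": 0.8, "blank": 0.2}),
    "blank": logs({"ink": 0.2, "blank": 0.8}),
}
log_emit = {
    "ink": logs({1: 0.9, 0: 0.1}),
    "blank": logs({1: 0.1, 0: 0.9}),
}

observed = [1, 1, 0, 1, 1]  # the middle 0 plays the role of channel noise
log_prob, path = viterbi(observed, states, log_start, log_trans, log_emit)
print(path)
```

In this toy setting the isolated 0 is explained as noise rather than as a state change, because two extra transitions cost more than one unlikely emission. Working in the log domain avoids numerical underflow on long observation sequences.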
Published in:
Acoustics, Speech, and Signal Processing, 1993. ICASSP-93., 1993 IEEE International Conference on
(Volume: 5)
Date of Conference: 27-30 April 1993