When a page from a thick bound volume is scanned, the curvature of the page near the binding produces two kinds of distortion in the scanned image: (i) shading along the spine of the book, and (ii) warping within the shaded area. In this paper, we propose an efficient restoration method that recovers the 3D shape of the book surface from the shading information in a scanned document image. We first build two practical models, a 3D geometric model and a 3D optical model, of the scanning conditions and use them to reconstruct the 3D shape of the book surface. We then restore the scanned image from this shape using de-shading and de-warping models. Finally, we evaluate the restoration by comparing optical character recognition (OCR) performance on the original and restored document images. The experiments show that the geometric and photometric distortions are mostly removed and that the OCR results improve markedly.
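To make the de-shading idea concrete, the following is a minimal sketch and not the method of the paper: it does not use the 3D geometric or optical model, and instead normalizes each image column by an estimate of its paper brightness. It assumes a grayscale scan in which the spine runs vertically, so that shading varies mainly from column to column. The function name `deshade_columns` and its parameters `percentile` and `white` are hypothetical, introduced only for illustration.

```python
import numpy as np


def deshade_columns(image, percentile=90.0, white=255.0):
    """Flatten column-wise shading in a grayscale page scan.

    Assumption (not from the paper): the spine is vertical, so the
    illumination falloff is roughly constant within each column.
    """
    img = np.asarray(image, dtype=np.float64)

    # Estimate the paper (background) brightness of each column as a high
    # percentile, since most pixels in a text page are background.
    background = np.percentile(img, percentile, axis=0)

    # Guard against division by zero in completely dark columns.
    background = np.maximum(background, 1.0)

    # Scale each column so that its background maps to the white level.
    gain = white / background
    restored = img * gain[np.newaxis, :]

    return np.clip(restored, 0.0, white)


if __name__ == "__main__":
    # Synthetic page: white paper with one dark band of "ink".
    page = np.full((20, 10), 255.0)
    page[5:7, :] = 50.0

    # Simulated shading that darkens the columns toward the spine.
    shade = np.linspace(0.4, 1.0, 10)
    scanned = page * shade[np.newaxis, :]

    restored = deshade_columns(scanned)
    print(np.abs(restored - page).max())
```

A shape-based approach such as the one described above would additionally correct the warping, which this brightness-only sketch leaves untouched.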
Published in:
Computer Vision and Pattern Recognition, 2005. CVPR 2005. IEEE Computer Society Conference on
(Volume: 1)
Date of Conference: 20-25 June 2005