As paper-based information is converted to electronic form, traditional books, magazines, newspapers, and similar materials are scanned into images and converted into electronic documents through OCR (optical character recognition) technology. Layout analysis, an important part of OCR, plays an increasingly important role in this process. This paper presents a Chinese document layout analysis method based on non-text images. The method addresses the problem of extracting text from deformed images and has considerable practical value.
Published in:
Computer Science-Technology and Applications, 2009. IFCSTA '09. International Forum on
(Volume: 1)
Date of Conference: 25-27 Dec. 2009