In this paper we present a page segmentation algorithm based on mathematical morphology. Text and picture regions in a document image usually differ in size and continuity. Exploiting these differences, a set of morphological operations is applied to segment document images. The algorithm consists of two main stages: coarse text extraction and post-processing. Experimental results show that the algorithm performs effectively on regular document images. It is also robust to irregular images, such as degraded images and images in which pictures and text are intertwined or skewed.
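The two-stage idea can be illustrated with a minimal sketch. It is not the paper's actual procedure: the abstract does not state the structuring elements or the post-processing steps, so the function names (`dilate`, `erode`, `segment_page`), the parameters `picture_size` and `gap`, and the choice of an opening followed by a horizontal closing are all assumptions made for illustration. The sketch relies only on the stated observation that picture regions are large and solid while text strokes are small and separated by gaps.

```python
import numpy as np


def _windows(img, h, w, pad_value):
    """Return all shifted views of img covered by an h x w rectangle (odd sizes)."""
    ph, pw = h // 2, w // 2
    padded = np.pad(img, ((ph, ph), (pw, pw)), constant_values=pad_value)
    rows, cols = img.shape
    return [padded[dy:dy + rows, dx:dx + cols]
            for dy in range(h) for dx in range(w)]


def dilate(img, h, w):
    """Binary dilation with an h x w rectangular structuring element."""
    return np.logical_or.reduce(_windows(img, h, w, False))


def erode(img, h, w, pad_value=False):
    """Binary erosion with an h x w rectangular structuring element."""
    return np.logical_and.reduce(_windows(img, h, w, pad_value))


def segment_page(binary, picture_size=9, gap=5):
    """Split a binary page (True = ink) into text and picture masks.

    picture_size and gap are hypothetical parameters, not values from the paper.
    """
    binary = np.asarray(binary, dtype=bool)

    # Stage 1 (coarse text extraction, assumed form): an opening with a large
    # square keeps only regions that contain a solid picture_size x picture_size
    # block, i.e. pictures. Whatever ink remains is treated as coarse text.
    picture = dilate(erode(binary, picture_size, picture_size),
                     picture_size, picture_size)
    coarse_text = binary & ~picture

    # Stage 2 (post-processing, assumed form): a horizontal closing bridges the
    # small gaps between characters so that each text line becomes one region.
    closed = erode(dilate(coarse_text, 1, gap), 1, gap, pad_value=True)
    text = closed & ~picture
    return text, picture
```

Calling `segment_page(page)` on a boolean array returns two masks of the same shape. Both sizes would need tuning to the scan resolution, and skewed or degraded pages, which the paper reports handling, are beyond what this sketch attempts.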