This paper proposes a fast algorithm for labeling connected components in binary document images. Runs are extracted from the image row by row. The positional relations between the runs of each row and the runs of the preceding row are represented as trees, where each tree corresponds to one connected component. A single scan of the image is sufficient to obtain the characteristics of the connected components, such as bounding rectangle, area, and number of pixels, which makes the approach both fast and effective. Experimental results show that the proposed algorithm is computationally faster than conventional algorithms.
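
The run-and-tree idea can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the trees can be modeled as a union-find forest over runs, that runs in adjacent rows are linked under 8-connectivity, and that the image is a list of rows of 0/1 values. The names `extract_runs` and `label_components` are hypothetical.

```python
from typing import List, Sequence, Tuple

Stats = Tuple[int, int, int, int, int]  # (min_x, min_y, max_x, max_y, pixel_count)


def extract_runs(row: Sequence[int]) -> List[Tuple[int, int]]:
    """Return inclusive (start, end) column pairs of the foreground runs in one row."""
    runs = []
    start = None
    for x, value in enumerate(row):
        if value and start is None:
            start = x
        elif not value and start is not None:
            runs.append((start, x - 1))
            start = None
    if start is not None:
        runs.append((start, len(row) - 1))
    return runs


def label_components(image: Sequence[Sequence[int]]) -> List[Stats]:
    """Single pass over the rows; returns one stats tuple per connected component."""
    parent: List[int] = []        # parent[i] is the parent of run i in its tree
    stats: List[List[int]] = []   # per-run stats, only meaningful at tree roots

    def find(i: int) -> int:
        # Follow parent links to the root, halving the path along the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    def union(a: int, b: int) -> None:
        # Merge the tree containing b into the tree containing a.
        ra, rb = find(a), find(b)
        if ra == rb:
            return
        parent[rb] = ra
        sa, sb = stats[ra], stats[rb]
        sa[0] = min(sa[0], sb[0])
        sa[1] = min(sa[1], sb[1])
        sa[2] = max(sa[2], sb[2])
        sa[3] = max(sa[3], sb[3])
        sa[4] += sb[4]

    prev: List[Tuple[int, int, int]] = []  # (start, end, run_id) of the preceding row
    for y, row in enumerate(image):
        cur = []
        for s, e in extract_runs(row):
            rid = len(parent)
            parent.append(rid)
            stats.append([s, y, e, y, e - s + 1])
            for ps, pe, pid in prev:
                # Assumption: 8-connectivity, so diagonal neighbours also connect.
                if ps <= e + 1 and pe >= s - 1:
                    union(pid, rid)
            cur.append((s, e, rid))
        prev = cur

    return [tuple(stats[i]) for i in range(len(parent)) if parent[i] == i]
```

Because the bounding rectangle and pixel count are merged whenever two trees are joined, no second pass over the pixels is needed to report them. For brevity the sketch compares each run against every run of the preceding row; since runs within a row are sorted by column, a two-pointer sweep would avoid the redundant comparisons.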