Skip to Main Content
Robust extraction of text from scene images is essential for successful scene text recognition. Scene images usually have nonuniform illumination, complex background, and text-like objects. In this paper, we propose a text extraction algorithm by combining the adaptive binarization and perceptual color clustering method. Adaptive binarization method can handle gradual illumination changes on character regions, so it can extract whole character regions even though shadows and/or light variations affect the image quality. However, image binarization on gray-scale images cannot distinguish different color components having the same luminance. Perceptual color clustering method complementary can extract text regions which have similar color distances, so that it can prevent the problem of the binarization method. Text verification based on local information of a single component and global relationship between multiple components is used to determine the true text components. It is demonstrated that the proposed method achieved reasonabe accuracy of the text extraction for the moderately difficult examples from the ICDAR 2003 database.