Recognizing text in camera images is a known hard problem because text is difficult to segment from varied and complicated backgrounds. In this paper, we propose an algorithm that combines two novel filters with a basic component-based text detection framework. The framework uses the Niblack algorithm to threshold images and groups components into regions using commonly used geometric features. The intensity filter measures the overlap between the intensity histogram of a component and that of its adjoining area. We have found that this overlap is large for non-text regions, so components with a large overlap can be pruned. The shape filter, on the other hand, deletes regions whose constituent components come from the same object, because most words consist of different characters. The proposed method is evaluated on the text locating database of 249 images used in the ICDAR 2003 robust reading competition. The results show that the algorithm is robust on both indoor and outdoor images, even on images with complex backgrounds, which are usually hard for traditional component-based algorithms to handle. On the ICDAR 2003 challenge experiment, the algorithm achieves a precision rate (p) of 66%, a recall rate (r) of 46%, and a combined rate (f) of 54%, which is the best reported in the literature on this dataset.
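The idea behind the intensity filter can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify how the overlap is computed, so the sketch assumes histogram intersection as the overlap measure, and the bin count (32), the threshold (0.5), and all function names are hypothetical choices made only for illustration.

```python
from typing import List, Sequence


def normalized_histogram(pixels: Sequence[int], bins: int = 32,
                         levels: int = 256) -> List[float]:
    """Return a grey-level histogram whose entries sum to 1."""
    if not pixels:
        raise ValueError("need at least one pixel")
    counts = [0] * bins
    for value in pixels:
        # Map a grey level in [0, levels) to a bin index in [0, bins).
        counts[min(value * bins // levels, bins - 1)] += 1
    total = len(pixels)
    return [c / total for c in counts]


def histogram_overlap(component: Sequence[int], surrounding: Sequence[int],
                      bins: int = 32) -> float:
    """Histogram intersection in [0, 1] (assumed overlap measure)."""
    h_comp = normalized_histogram(component, bins)
    h_surr = normalized_histogram(surrounding, bins)
    return sum(min(a, b) for a, b in zip(h_comp, h_surr))


def passes_intensity_filter(component: Sequence[int],
                            surrounding: Sequence[int],
                            max_overlap: float = 0.5) -> bool:
    """Keep a component only if its overlap is small (hypothetical threshold)."""
    return histogram_overlap(component, surrounding) <= max_overlap


# Dark strokes on a bright background: histograms barely overlap, so keep.
text_like = passes_intensity_filter([20, 25, 30, 22], [200, 210, 220, 205])
# Component indistinguishable from its surroundings: large overlap, so prune.
texture_like = passes_intensity_filter([100, 104, 120, 130],
                                       [100, 104, 120, 130])
```

Under these assumptions, a component whose grey levels differ clearly from its adjoining area is retained, while one that blends into its surroundings is discarded, which matches the pruning rule stated above.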