Image mining deals with the extraction of implicit knowledge, image data relationships, and other patterns not explicitly stored in images. This paper proposes an enhanced image classifier that extracts patterns from images containing text using a combination of features. Images containing text can be divided into three types: scene text images, caption text images, and document images. A total of eight features, comprising intensity histogram features and gray-level co-occurrence matrix (GLCM) texture features, are used to classify the images. In the first stage of classification, histogram features are extracted from grayscale images to separate document images from the other two types. In the second stage, GLCM features are extracted from binary images to distinguish scene text images from caption text images. A decision tree classifier (DTC) performs the classification in both stages. Experimental results were obtained on a dataset of about 60 images of the different types. This classification technique has not been attempted before; its applications include preprocessing for image indexing, simplifying and speeding up content-based image retrieval (CBIR) techniques, and machine vision.
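The two-stage scheme described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not list the eight features, so the four histogram statistics and four GLCM statistics below are an assumed split, and the function names (`histogram_features`, `glcm`, `glcm_features`, `classify`) are hypothetical. The two predicates passed to `classify` stand in for the trained decision tree classifiers of each stage.

```python
import math


def histogram_features(gray, levels=256):
    """First-order statistics of the intensity histogram of a grayscale image.

    `gray` is a list of rows of integer intensities in [0, levels).
    The choice of these four statistics is an assumption.
    """
    pixels = [p for row in gray for p in row]
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    prob = [h / len(pixels) for h in hist]
    mean = sum(i * p for i, p in enumerate(prob))
    variance = sum((i - mean) ** 2 * p for i, p in enumerate(prob))
    energy = sum(p * p for p in prob)
    entropy = -sum(p * math.log2(p) for p in prob if p > 0)
    return {"mean": mean, "variance": variance,
            "energy": energy, "entropy": entropy}


def glcm(binary, dx=1, dy=0, levels=2):
    """Normalized, symmetric co-occurrence matrix for a single pixel offset."""
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(binary), len(binary[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                a, b = binary[r][c], binary[r2][c2]
                counts[a][b] += 1
                counts[b][a] += 1  # count the pair in both directions
    total = sum(map(sum, counts)) or 1  # avoid dividing by zero on tiny images
    return [[v / total for v in row] for row in counts]


def glcm_features(p):
    """Common GLCM texture statistics (an assumed selection)."""
    n = len(p)
    cells = [(i, j, p[i][j]) for i in range(n) for j in range(n)]
    return {
        "contrast": sum((i - j) ** 2 * v for i, j, v in cells),
        "energy": sum(v * v for _, _, v in cells),
        "homogeneity": sum(v / (1 + (i - j) ** 2) for i, j, v in cells),
        "entropy": -sum(v * math.log2(v) for _, _, v in cells if v > 0),
    }


def classify(gray, binary, is_document, is_scene_text):
    """Two-stage cascade: histogram features first, GLCM features second.

    `is_document` and `is_scene_text` are placeholders for the two trained
    decision trees; each takes a feature dict and returns True or False.
    """
    if is_document(histogram_features(gray)):
        return "document"
    if is_scene_text(glcm_features(glcm(binary))):
        return "scene text"
    return "caption text"
```

In practice each predicate would be a classifier trained on labeled examples, for instance scikit-learn's `DecisionTreeClassifier` fitted on the feature vectors of one stage; any fixed thresholds used in their place would be purely illustrative.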
Date of Conference: 6-7 March 2009