The mining of images from several categories is a problem that arises naturally in a wide range of circumstances. Image mining is concerned with extracting image data relationships and other patterns that are not explicitly stored in the images. Image classification is a large and growing field within image processing, and it is useful in Content-Based Image Retrieval (CBIR). Images can be classified according to their nature, content, or domain. In this paper, we present a novel unsupervised method for image classification based on the distribution of various features of textual images. From these features, differences between images can be computed and used to classify textual images into three types: document images, caption text images, and scene text images. The classification relies on low-level features such as mean, skewness, energy, contrast, and homogeneity. At the first level of classification, the image is converted to gray scale, the histogram features mean, variance, and skewness are extracted, and the Weka J48 decision tree classifier labels each image as Doc or Non-Doc. At the second level, the gray-scale image is sliced into binary form, and GLCM (Gray Level Co-occurrence Matrix) features are extracted from it. The GLCM features energy, entropy, contrast, and homogeneity are used to classify the Non-Doc images. We experimented on 60 images of different types.
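
The two feature-extraction stages described above can be illustrated with a minimal NumPy sketch. It is not the implementation used in the paper and rests on several assumptions: the function names are hypothetical, the binary form is obtained with a fixed threshold of 128, the GLCM uses a single horizontal offset of one pixel, energy is taken as the sum of squared GLCM entries, and homogeneity uses the 1 / (1 + (i - j)^2) weighting. The Weka J48 classification step is not reproduced here.

```python
import numpy as np

def histogram_features(gray):
    """Mean, variance, and skewness of the gray-level values."""
    x = np.asarray(gray, dtype=np.float64).ravel()
    mean = x.mean()
    var = x.var()
    std = np.sqrt(var)
    # Skewness is the third central moment divided by std^3 (0 for a flat image).
    skew = 0.0 if std == 0 else np.mean((x - mean) ** 3) / std ** 3
    return mean, var, skew

def to_binary(gray, threshold=128):
    """Assumed binarization: 1 where the pixel is >= threshold, else 0."""
    return (np.asarray(gray) >= threshold).astype(np.intp)

def glcm(img, levels=2, dx=1, dy=0):
    """Normalized co-occurrence matrix for a non-negative offset (dy, dx)."""
    a = np.asarray(img, dtype=np.intp)
    h, w = a.shape
    ref = a[:h - dy, :w - dx]   # reference pixels
    nbr = a[dy:, dx:]           # their neighbors at the given offset
    m = np.zeros((levels, levels), dtype=np.float64)
    np.add.at(m, (ref.ravel(), nbr.ravel()), 1.0)  # count each (ref, nbr) pair
    return m / m.sum()

def glcm_features(p):
    """Energy, entropy, contrast, and homogeneity of a normalized GLCM."""
    i, j = np.indices(p.shape)
    nonzero = p[p > 0]
    energy = np.sum(p ** 2)
    entropy = -np.sum(nonzero * np.log2(nonzero))
    contrast = np.sum((i - j) ** 2 * p)
    homogeneity = np.sum(p / (1.0 + (i - j) ** 2))
    return energy, entropy, contrast, homogeneity

# Example on a tiny synthetic gray-scale image (values are illustrative only).
gray = np.array([[10, 200, 10, 200],
                 [200, 10, 200, 10]])
first_level = histogram_features(gray)                    # input to Doc / Non-Doc stage
second_level = glcm_features(glcm(to_binary(gray)))       # input to Non-Doc stage
```

In this sketch the first-level feature vector would be passed to a decision tree classifier and the second-level vector would only be computed for images labelled Non-Doc; other offsets, thresholds, or feature definitions are equally plausible readings of the description.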