Children's exposure to pornographic Web sites on the Internet has raised concerns about their safety. To protect children from such inappropriate material, an effective Web filtering system is essential. Content-based Web filtering is one of the important techniques for handling and filtering inappropriate information on the Web. In this paper, we examine content-based analysis techniques for filtering pornographic Web sites. Our system consists of two primary content-based filtering techniques: text analysis and image analysis. For text analysis, the support vector machine (SVM) algorithm and an N-gram model based on Bayes' theorem are applied and evaluated for filtering pornographic text on both Thai and English Web sites. For image analysis, we build and examine an image filtering system that uses a hierarchical image filtering method. It consists of two main processes: a normalized R/G ratio, which uses the ratio of pixel values in the red and green color channels, and a human composition matrix (HCM) based on skin detection. The empirical results show that our text and image analysis methods are effective for pornographic Web filtering. Finally, we have integrated a pornographic Web filter based on content-based analysis into our Anti-X system.
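
As an illustration of the N-gram model based on Bayes' theorem mentioned above, the following is a minimal sketch of a character N-gram naive Bayes text classifier. It is not the implementation used in the Anti-X system: the class name `NgramNaiveBayes`, the helper `char_ngrams`, the choice of trigrams, the Laplace smoothing, the labels, and the toy training strings are all assumptions made for illustration. Character N-grams are used here because they need no word segmentation, which may be convenient for Thai text, where words are not separated by spaces.

```python
import math
from collections import Counter


def char_ngrams(text, n=3):
    """Return overlapping character n-grams (hypothetical helper)."""
    text = text.lower()
    return [text[i:i + n] for i in range(len(text) - n + 1)]


class NgramNaiveBayes:
    """Multinomial naive Bayes over character n-grams with Laplace smoothing.

    Illustrative only; not the classifier described by the authors.
    """

    def __init__(self, n=3):
        self.n = n
        self.counts = {}             # label -> Counter of n-gram frequencies
        self.doc_counts = Counter()  # label -> number of training documents
        self.vocab = set()           # all n-grams seen during training

    def fit(self, texts, labels):
        for text, label in zip(texts, labels):
            grams = char_ngrams(text, self.n)
            self.counts.setdefault(label, Counter()).update(grams)
            self.doc_counts[label] += 1
            self.vocab.update(grams)
        return self

    def log_scores(self, text):
        """Return log P(label) + sum of log P(ngram | label) for each label."""
        total_docs = sum(self.doc_counts.values())
        vocab_size = len(self.vocab)
        scores = {}
        for label, grams in self.counts.items():
            total = sum(grams.values())
            score = math.log(self.doc_counts[label] / total_docs)
            for g in char_ngrams(text, self.n):
                # Add-one smoothing so unseen n-grams do not zero the product.
                score += math.log((grams[g] + 1) / (total + vocab_size))
            scores[label] = score
        return scores

    def predict(self, text):
        scores = self.log_scores(text)
        return max(scores, key=scores.get)


# Toy, made-up training data; a real filter would need a labeled corpus.
train_texts = [
    "explicit adult content xxx",
    "adult xxx videos explicit",
    "school science homework lesson",
    "weather news and sports lesson",
]
train_labels = ["blocked", "blocked", "allowed", "allowed"]

model = NgramNaiveBayes(n=3).fit(train_texts, train_labels)
```

Calling `model.predict(page_text)` returns the label with the highest posterior log score. In a fuller pipeline this decision could be combined with an SVM classifier and with the image-based checks, as the abstract describes, but how those signals are combined is not specified here.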