This paper proposes an incremental spam image filtering (ISIF) approach based on visual similarity to address two practical problems that existing spam image filtering techniques do not handle well: how to update a model efficiently, and how to cope with the lack of normal email images. The basic idea of the ISIF approach is to incrementally learn what spam images look like by clustering spam images and selecting representative images (RIs) from the clusters, and then to use the RIs to classify unknown images. An ISIF filter is updated by adding new RIs, which is efficient because retraining considers only the missed spam images rather than the whole expanded training set. Because the ISIF approach relies only on spam images, it avoids the difficulty of collecting enough normal email images. Experimental results on a real spam image dataset show that the incremental filter based on the ISIF approach detects spam images with high accuracy and a low false positive rate.
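The update loop described above can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not specify the visual features, the clustering algorithm, or the similarity measure, so the sketch assumes each image is already reduced to a numeric feature vector, substitutes a simple greedy "leader" clustering for the clustering step, and uses Euclidean distance with a hypothetical `radius` threshold. The class and method names are likewise invented for illustration.

```python
import math


def distance(a, b):
    """Euclidean distance between two equal-length feature vectors (assumed measure)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


class IncrementalSpamImageFilter:
    """Hypothetical one-class filter that stores only spam representatives."""

    def __init__(self, radius):
        # radius: assumed similarity threshold; an image closer than this
        # to any representative is treated as visually similar spam.
        self.radius = radius
        self.representatives = []

    def is_spam(self, features):
        """Classify an unknown image by its nearness to any representative."""
        return any(distance(features, r) <= self.radius
                   for r in self.representatives)

    def update(self, spam_features):
        """Add representatives only for spam images the filter currently misses.

        Greedy leader clustering (an assumption): a missed image becomes the
        representative of a new cluster; images already covered are skipped,
        so earlier training data is never revisited.
        """
        added = 0
        for f in spam_features:
            if not self.is_spam(f):
                self.representatives.append(f)
                added += 1
        return added


if __name__ == "__main__":
    filt = IncrementalSpamImageFilter(radius=0.5)
    filt.update([(0.0, 0.0), (0.1, 0.1), (5.0, 5.0)])  # initial training
    print(filt.is_spam((0.2, 0.0)))   # near a known representative
    print(filt.is_spam((2.0, 2.0)))   # an unseen spam pattern, missed
    filt.update([(2.0, 2.0)])         # incremental update with the miss
    print(filt.is_spam((2.0, 2.0)))   # now covered
```

Because the filter holds only spam representatives, no normal email images are needed at any point, and the cost of an update depends on the number of missed images rather than on the size of the accumulated training set.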