Most kernels used in SVMs are designed for continuous data and neglect the structure of text. In contrast to these classical kernels, we propose the use of various string kernels for spam filtering. Data preprocessing is also a vital part of text classification (TC), where the objective is to generate feature vectors usable by SVM kernels. We detail a feature mapping variant for TC that yields improved performance for the standard SVM in the filtering task. Furthermore, we propose an online active framework for spam filtering.
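To illustrate what a string kernel computes, here is a minimal sketch of one common choice, the character n-gram (spectrum) kernel. The abstract does not say which string kernels are used, so this is an assumed example rather than the authors' method. The function names and the default n-gram length of 3 are hypothetical.

```python
from collections import Counter
import math


def ngram_counts(text, n=3):
    """Count every contiguous character n-gram in text (assumed n=3)."""
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def spectrum_kernel(s, t, n=3):
    """Inner product of the n-gram count vectors of s and t."""
    cs, ct = ngram_counts(s, n), ngram_counts(t, n)
    # Iterate over the smaller profile; missing n-grams count as zero.
    if len(cs) > len(ct):
        cs, ct = ct, cs
    return sum(count * ct[gram] for gram, count in cs.items())


def normalized_spectrum_kernel(s, t, n=3):
    """Cosine-normalized variant, so message length does not dominate."""
    denom = math.sqrt(spectrum_kernel(s, s, n) * spectrum_kernel(t, t, n))
    return spectrum_kernel(s, t, n) / denom if denom else 0.0
```

A kernel of this kind compares raw message text directly, without first mapping it to a fixed-length vector of continuous features. The Gram matrix of pairwise kernel values over a set of messages could then be passed to an SVM implementation that accepts precomputed kernels.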