Spam filtering is a classic online learning problem. As the training set grows, an Online SVM becomes progressively slower. We therefore relax the constraints of the Online SVM to obtain the Relaxed Online SVM (ROSVM) model, which improves speed while preserving filtering performance. In this paper, we apply this model to Chinese spam filtering. Our model outperforms the best system of the TREC 2006 Chinese spam filter track. Our filter also participated in the SEWM 2010 spam filter track, where it achieved the best 1-ROCA% in both the delayed feedback task and the active learning task.
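The abstract does not say which constraints are relaxed, so the following is only a minimal sketch of the general idea, not the model from the paper. It assumes two relaxations often used for online SVMs: the model is updated only when a new message falls inside the margin, and retraining touches only a sliding window of recent messages rather than the full history. The class name `RelaxedOnlineSVM`, its parameters (`margin`, `window`, `epochs`, `lr`, `reg`), and the use of plain subgradient steps in place of a full SVM solver are all illustrative assumptions.

```python
from collections import deque


class RelaxedOnlineSVM:
    """Illustrative linear online SVM with a relaxed update rule.

    Hypothetical sketch: not the exact ROSVM formulation of the paper.
    """

    def __init__(self, n_features, margin=1.0, window=100,
                 epochs=2, lr=0.1, reg=0.001):
        self.w = [0.0] * n_features
        self.margin = margin                  # update only if y * score < margin
        self.window = deque(maxlen=window)    # keep only recent examples
        self.epochs = epochs                  # bounded number of passes per update
        self.lr = lr
        self.reg = reg

    def score(self, x):
        """Signed distance-like score; positive means spam."""
        return sum(wi * xi for wi, xi in zip(self.w, x))

    def _sgd_step(self, x, y):
        # One subgradient step on the L2-regularized hinge loss.
        violated = y * self.score(x) < self.margin
        for i, xi in enumerate(x):
            grad = self.reg * self.w[i] - (y * xi if violated else 0.0)
            self.w[i] -= self.lr * grad

    def learn(self, x, y):
        """Consume one labeled message (y is +1 for spam, -1 for ham).

        Returns True if the model was updated, False if the update was skipped.
        """
        self.window.append((x, y))
        if y * self.score(x) >= self.margin:
            return False  # relaxation 1: skip well-classified examples
        for _ in range(self.epochs):
            for xs, ys in self.window:  # relaxation 2: recent examples only
                self._sgd_step(xs, ys)
        return True
```

Under these assumptions, the cost of each update depends on the window size and the number of passes rather than on the total number of messages seen, which is the kind of speed-up the abstract describes.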
Published in:
Asian Language Processing (IALP), 2010 International Conference on
Date of Conference: 28-30 Dec. 2010