SpamAssassin is a widely used open-source, heuristic-based spam filter that applies a large number of weighted tests to a message, sums the results of the tests, and labels the message as spam if the sum exceeds a user-defined threshold. Due to the large number of tests and the interactions between them, defining good weights for SpamAssassin is difficult; moreover, users with different needs may want different sets of weights. We have built a multi-objective evolutionary algorithm, MOSF, that evolves weights for the tests in SpamAssassin according to two independent objectives: minimising the number of false positives (legitimate messages mislabeled as spam), and minimising the number of false negatives (spam messages mislabeled as legitimate). We show that MOSF returns a set of solutions offering a range of setups for SpamAssassin that satisfy different users' needs, and also that MOSF can derive solutions which beat the existing SpamAssassin weights in both objectives simultaneously. Applying these ideas could substantially increase the usefulness of SpamAssassin and similar systems.
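
The scoring rule and the two objectives described above can be illustrated with a minimal Python sketch. It is not the MOSF implementation and does not call SpamAssassin: the data layout (one 0/1 flag per test plus a true label), the function names, and the toy corpus are all assumptions made for illustration. The sketch shows how one weight vector maps to a (false positives, false negatives) pair, and how a set of such pairs can be reduced to its non-dominated members, which is the kind of trade-off set a multi-objective algorithm returns.

```python
from typing import List, Sequence, Tuple

# Hypothetical layout: (test hits as 0/1 flags, true label), True meaning spam.
Message = Tuple[Sequence[int], bool]
Objectives = Tuple[int, int]  # (false positives, false negatives)


def is_spam(hits: Sequence[int], weights: Sequence[float], threshold: float) -> bool:
    """Sum the weights of the tests that fired; spam if the sum exceeds the threshold."""
    score = sum(w for h, w in zip(hits, weights) if h)
    return score > threshold


def evaluate(weights: Sequence[float], threshold: float,
             corpus: Sequence[Message]) -> Objectives:
    """Count both error types for one candidate weight vector."""
    false_pos = false_neg = 0
    for hits, truly_spam in corpus:
        predicted = is_spam(hits, weights, threshold)
        if predicted and not truly_spam:
            false_pos += 1  # legitimate message labeled as spam
        elif truly_spam and not predicted:
            false_neg += 1  # spam message labeled as legitimate
    return false_pos, false_neg


def dominates(a: Objectives, b: Objectives) -> bool:
    """a dominates b if it is no worse in both objectives and differs from b."""
    return a[0] <= b[0] and a[1] <= b[1] and a != b


def pareto_front(points: Sequence[Objectives]) -> List[Objectives]:
    """Keep only the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Toy corpus with three tests (assumed data, not from the paper).
corpus: List[Message] = [
    ((1, 0, 0), True),
    ((0, 1, 0), False),
    ((1, 1, 0), True),
    ((0, 0, 1), False),
]

good = evaluate((3.0, 1.0, 0.5), 2.0, corpus)
bad = evaluate((1.0, 3.0, 0.5), 2.0, corpus)
front = pareto_front([(0, 5), (2, 2), (5, 0), (3, 3)])
```

In this sketch a user who cannot tolerate lost legitimate mail would pick the front member with the fewest false positives, while a user who mainly wants a clean inbox would pick the one with the fewest false negatives. An evolutionary search over weight vectors would call something like `evaluate` as its fitness function.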