This paper evaluates the potential usefulness of the reject option for text categorisation (TC) tasks. The reject option is a technique used in statistical pattern recognition to improve classification reliability. Our work is motivated by the fact that, although the reject option has proved useful in several pattern recognition problems, it has not yet been considered for TC tasks. Because TC tasks differ from typical pattern recognition problems both in the performance measures used and in the fact that a document can belong to more than one category, we developed a rejection technique specific to TC problems. The performance improvement achievable with the reject option was evaluated experimentally on the Reuters dataset, a standard benchmark for TC systems.