This paper describes an application of support vector machines (SVMs) to interactive document retrieval using active learning. Previous work has applied classification learning methods such as SVMs to relevance feedback and obtained successful results. However, that work did not fully exploit the characteristics of the example distribution in document retrieval. We propose a heuristic that biases which documents are shown to the user according to the distribution of examples in document retrieval. The heuristic selects the examples to show the user from the neighbors of positive support vectors, and it improves learning efficiency. We implemented an SVM-based interactive document retrieval system using the proposed heuristic and compared it with conventional systems such as a Rocchio-based system and an SVM-based system without the heuristic. We conducted systematic experiments on large data sets including over 500,000 paper articles and confirmed that our system outperformed the others.
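The selection step of the heuristic can be illustrated with a minimal sketch. It assumes that an SVM has already been trained on the user's feedback and that its positive support vectors are available as plain feature vectors. The abstract does not say how "neighbors" are defined, so the use of cosine similarity, the function names `cosine` and `select_near_positive_svs`, and the parameter `k` are all assumptions made for illustration, not the authors' implementation.

```python
import math
from typing import Dict, List, Sequence


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def select_near_positive_svs(
    positive_svs: Sequence[Sequence[float]],
    unlabeled: Dict[str, Sequence[float]],
    k: int,
) -> List[str]:
    """Return the ids of the k unlabeled documents closest to any positive support vector.

    Assumption: closeness is the maximum cosine similarity to a positive
    support vector; the paper's actual neighborhood definition may differ.
    """
    scored = []
    for doc_id, vec in unlabeled.items():
        # Best similarity of this document to any positive support vector.
        best = max((cosine(vec, sv) for sv in positive_svs), default=0.0)
        scored.append((best, doc_id))
    # Highest similarity first; ties are broken by document id for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for _, doc_id in scored[:k]]


# Hypothetical toy data: one positive support vector and three unlabeled documents.
positive_svs = [[1.0, 0.0]]
unlabeled = {"a": [0.9, 0.1], "b": [0.0, 1.0], "c": [0.5, 0.5]}
to_show = select_near_positive_svs(positive_svs, unlabeled, k=2)
```

In an interactive loop, the documents in `to_show` would be presented to the user, their relevance judgments added to the training set, and the SVM retrained before the next round.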
Date of Conference: 18-22 Dec. 2006