Skip to Main Content
In this paper, we investigate the use of data mining, in particular the text classification and co-training techniques, to identify more relevant passages based on a small set of labeled passages obtained from the blind feedback of a retrieval system. The data mining results are used to expand query terms and to re-estimate some of the parameters used in a probabilistic weighting function. We evaluate the data mining based feedback method on the TREC HARD data set. The results show that data mining can be successfully applied to improve the text retrieval performance. We report our experimental findings in detail.