Data fusion in information retrieval has been investigated by many researchers, and quite a few data fusion methods have been proposed. However, their impact on retrieval effectiveness is not well understood. In this paper, we apply statistical principles to data fusion and present a statistical data fusion model that specifies both the fusion algorithm and the conditions that must be satisfied. The model can serve as a guideline for data fusion methods. Based on this analysis, we compare CombSum and CombMNZ, the two best-known data fusion methods. We explain why CombMNZ sometimes outperforms CombSum and what can be done to make CombSum more effective. Experimental results with TREC data support the conclusion that our enhancements to the algorithm improve effectiveness.
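For readers unfamiliar with the two methods being compared, the following is a minimal sketch of their commonly cited definitions. It is not the statistical model or the enhancements proposed in this paper. CombSum adds a document's normalized scores across the component systems, and CombMNZ multiplies that sum by the number of systems that retrieved the document. The function names, the min-max normalization, the toy scores, and the choice to count every system that returned a document (rather than only non-zero scores) are assumptions made for illustration.

```python
from collections import defaultdict


def min_max_normalize(run):
    """Scale one system's raw scores to [0, 1] (an assumed normalization choice)."""
    lo, hi = min(run.values()), max(run.values())
    if hi == lo:
        # All scores equal: treat every document as equally good.
        return {doc: 1.0 for doc in run}
    return {doc: (score - lo) / (hi - lo) for doc, score in run.items()}


def comb_sum(runs):
    """CombSum: sum each document's scores over all component systems."""
    fused = defaultdict(float)
    for run in runs:
        for doc, score in run.items():
            fused[doc] += score
    return dict(fused)


def comb_mnz(runs):
    """CombMNZ: CombSum score times the number of systems returning the document."""
    sums = comb_sum(runs)
    hits = defaultdict(int)
    for run in runs:
        for doc in run:
            hits[doc] += 1
    return {doc: sums[doc] * hits[doc] for doc in sums}


def rank(fused):
    """Return document ids ordered by descending fused score."""
    return sorted(fused, key=fused.get, reverse=True)


# Hypothetical, already-normalized scores from three component systems.
runs = [
    {"d1": 1.0, "d2": 0.5},
    {"d2": 0.25, "d3": 1.0},
    {"d1": 0.5, "d2": 0.5},
]

print(rank(comb_sum(runs)))  # ['d1', 'd2', 'd3']
print(rank(comb_mnz(runs)))  # ['d2', 'd1', 'd3']
```

In this toy input, d2 is returned by all three systems with moderate scores, so CombMNZ promotes it above d1, while CombSum keeps d1 first. This is the kind of disagreement between the two methods that the comparison above concerns.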