An evaluation metric is an essential and integral part of a ranking system. Many evaluation metrics have been proposed in information retrieval and web search, among which Discounted Cumulative Gain (DCG) has emerged as one of the most widely adopted for evaluating the performance of ranking functions used in web search. However, the two sets of parameters used in DCG, the gain values and the discount factors, are usually determined in a rather ad hoc way, and their impact has not been carefully analyzed. In this paper, we first show that DCG is generally not coherent: which of two ranking functions scores higher under DCG can depend on the particular gain values and discount factors used. We then propose a novel methodology that learns the gain values and discount factors from user preferences over rankings, modeled as a special case of learning linear utility functions. We also discuss how to extend our methods to handle tied preference pairs and how active learning can reduce the preference-labeling effort. Numerical simulations illustrate the effectiveness of the proposed methods. Moreover, experiments are conducted on a side-by-side comparison data set from a commercial search engine to validate the proposed methods on real-world data.
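To make the incoherence claim concrete, the following is a minimal sketch (not the paper's own example) of DCG with pluggable gain and discount functions; the default exponential gain 2^r − 1 and logarithmic discount log2(i + 1) are one common parameter choice, and the relevance grades below are hypothetical:

```python
import math

def dcg(relevances, gain=lambda r: 2 ** r - 1,
        discount=lambda i: math.log2(i + 1)):
    """Discounted Cumulative Gain of a ranked list of relevance grades.

    gain maps a grade to a gain value; discount penalizes lower positions.
    Both are exactly the parameters the paper argues are set ad hoc.
    """
    return sum(gain(r) / discount(i)
               for i, r in enumerate(relevances, start=1))

# Two hypothetical rankings (grades of the top-3 results):
ranking_a = [2, 2, 0]
ranking_b = [3, 0, 0]

# Linear gain prefers ranking_a; the exponential default prefers ranking_b,
# so the comparison flips with the choice of gain function.
linear = lambda r: r
print(dcg(ranking_a, gain=linear), dcg(ranking_b, gain=linear))
print(dcg(ranking_a), dcg(ranking_b))
```

Here the same pair of rankings is ordered differently by the two gain choices, which is exactly the sense in which DCG comparisons depend on the parameters.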