Online evaluation is one of the most common approaches to measuring the effectiveness of an information retrieval system. It involves fielding the information retrieval system to real users and observing these users' interactions in situ while they engage with the system. This allows actual users with real-world information needs to play an important part in assessing retrieval quality. Online Evaluation for Information Retrieval provides the reader with a comprehensive overview of the topic. It shows how online evaluation is used for controlled experiments, segmenting them into experiment designs that allow absolute or relative quality assessments. The presentation of different metrics further partitions online evaluation based on the differently sized experimental units commonly of interest: documents, lists, and sessions. It also includes an extensive discussion of recent work on data re-use and on experiment estimation based on historical data. Online Evaluation for Information Retrieval pays particular attention to practical issues: how to run evaluations in practice, how to select experimental parameters, how to take into account the ethical considerations inherent in online evaluations, and which limitations experimenters should be aware of. While most published work on online experimentation today is on a large scale, in systems with millions of users, this monograph also emphasizes that the same techniques can be applied on a small scale. To this end, it highlights recent work that makes online evaluation easier to use at smaller scales and encourages studying real-world information seeking in a wide range of scenarios. The monograph concludes with a summary of the most recent work in the area, outlines some open problems, and postulates future directions.