Skip to Main Content
Utility-based learning is a key technique for addressing many real world data mining applications, where the costs/benefits are not uniform across the domain of the target variable. Still, most of the existing research has been focused on classification problems. In this paper we address a related problem. There are many relevant domains (e.g. ecological, meteorological, finance) where decisions are based on the forecast of a numeric quantity (i.e. the result of a regression model). The goal of the work on this paper is to present an evaluation framework for applications where the numeric outcome of a regression model may lead to different costs/benefits as a consequence of the actions it entails. The new metric provides a more informed estimate of the utility of any regression model, given the application-specific preference biases, and hence makes more reliable the comparison and selection between alternative regression models. We illustrate the objective of our evaluation methodology on a real-life application and also carry a set of experiments over a subset of our target regression tasks: the prediction of rare and extreme values. Results show the effectiveness of our proposed utility metric for identifying the models that perform better on this type of applications.