1. Introduction
With the growth of social media, people are increasingly willing to provide and make use of recommendations on the Internet. Image search is one setting in which images are ranked: similarity to a query is based on the output of a classifier, and the more similar an image is to the query, the higher it ranks. We would often like computer-generated rankings to reflect human preferences, and many ranking tasks have compared their outputs to human results [1], [2]. Collecting human rankings, however, is difficult. To obtain a full ranking of a set of images, a user must consider all pairs of images [3], [4], which is extremely time-consuming and tedious. Beyond the large number of comparisons, there are often cases where humans are unable or unwilling to assert a preference between two images. In past crowdsourced experiments, researchers have sometimes provided an “I don't know” option, which allows users to mark a pair of images as equal or ambiguous.
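To give a sense of the scale involved, the number of pairwise comparisons implied by a full ranking can be written out. This is a worked count rather than a result from the cited studies; the symbol $n$ (the number of images in the set) and the value $n = 100$ are illustrative assumptions.

```latex
\[
  \binom{n}{2} = \frac{n(n-1)}{2},
  \qquad \text{e.g. } n = 100 \;\Rightarrow\; \frac{100 \cdot 99}{2} = 4950 \text{ comparisons.}
\]
```

The count grows quadratically in $n$, so exhaustive pairwise collection becomes burdensome even for modest image sets.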