Automatic organization of large, unordered image collections is an extremely challenging problem with many potential applications. Often, what is required is that images taken in the same place, of the same thing, or of the same person be conceptually grouped together. This work focuses on grouping images that contain the same object, despite significant changes in scale and viewpoint and despite partial occlusion, in very large (1M+) image collections automatically gathered from Flickr. The scale of the data and the extreme variation in imaging conditions make the problem very challenging. We describe a scalable method that first computes a matching graph over all the images. Image groups can then be mined from this graph using standard clustering techniques. The novelty we bring is that both the matching graph and the clustering methods are able to use the spatial consistency between images that arises from the common object (if there is one). We demonstrate our methods on a publicly available dataset of 5K images of Oxford, a 37K image dataset containing images of the Statue of Liberty, and a much larger 1M image dataset of Rome. This is, to our knowledge, the largest dataset to which image-based data mining has been applied.
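The two-stage idea described above (build a matching graph, then mine groups from it) can be illustrated with a minimal sketch. This is not the method of the paper: it assumes that the matching graph is already available as a list of image pairs with a spatially verified inlier count, and it uses connected components as one example of a standard clustering technique. The function name `group_images`, the threshold `min_inliers`, and the sample image names are hypothetical.

```python
from collections import defaultdict


def group_images(matches, min_inliers=20):
    """Group images by connected components of a matching graph.

    matches: iterable of (image_a, image_b, inlier_count) tuples, where
    inlier_count is assumed to come from some spatial verification step.
    min_inliers: hypothetical threshold below which an edge is ignored.
    """
    parent = {}

    def find(x):
        # Union-find lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    for a, b, inliers in matches:
        # Register both images even if the edge is too weak to keep.
        find(a)
        find(b)
        if inliers >= min_inliers:
            union(a, b)

    groups = defaultdict(set)
    for image in list(parent):
        groups[find(image)].add(image)

    # Largest groups first; ties broken alphabetically for stable output.
    return sorted((sorted(g) for g in groups.values()),
                  key=lambda g: (-len(g), g))


if __name__ == "__main__":
    # Hypothetical matching graph: (image, image, verified inlier count).
    sample = [
        ("a.jpg", "b.jpg", 35),
        ("b.jpg", "c.jpg", 28),
        ("c.jpg", "d.jpg", 4),   # weak match, dropped by the threshold
        ("d.jpg", "e.jpg", 40),
    ]
    for group in group_images(sample):
        print(group)
```

Connected components is the simplest choice here and tends to merge groups through a single spurious edge, which is one reason a threshold on spatially verified matches matters; other standard clustering techniques could be substituted at the same point.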