1 Introduction
The goal of image geotagging is to assign GPS coordinates (i.e., latitude and longitude) to a given image using its visual content. It is a very challenging task, even for humans. Consider the 20 example images in Fig. 1: can a human easily identify where they were taken? Some of them are extremely easy, for instance, the four landmark images in the fourth row; we may easily identify that the image containing the temple was taken in Beijing. Others, however, are very difficult, for example, the non-landmark images in the last row. This raises a natural question: why are some images easy to locate while others are hard?