Deep Geo-Constrained Auto-Encoder for Non-Landmark GPS Estimation


Abstract:

This paper addresses the problem of geotagging images, i.e., assigning GPS coordinates (latitude, longitude) to images based on their contents. Due to the huge appearance variability of visual features across the world, the images' contents and their GPS coordinates may be inconsistent: images captured in geographically close areas may appear visually distinct, and images with visually similar contents may be taken in geographically distant areas. In this paper, we propose a deep Geo-constrained Auto-encoder (DGAE) to solve these inconsistency problems. Using clustered GPS data and visual data, our approach identifies inconsistent (image, GPS) data pairs. We then propose a novel deep learning framework that learns similar feature representations for geographically close images and distinct feature representations for geographically distant images. We introduce two new constraints into our feature learning networks: the same-area constraint and the easy-confusing constraint. The former penalizes images from the same area that have very distinct visual features, and the latter penalizes images from distant areas that have very similar visual features. A deep architecture is developed to further improve the learning of discriminative features, which can disambiguate different geographic locations. Our approach is extensively evaluated on a newly compiled large image geotagging dataset of 664,720 community-contributed images and outperforms the comparison approaches.
Published in: IEEE Transactions on Big Data (Volume: 5, Issue: 2, 01 June 2019)
Page(s): 120 - 133
Date of Publication: 21 November 2017
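
The two constraints named in the abstract can be illustrated with a small sketch. This is not the paper's formulation: it assumes a contrastive-style form in which the same-area term is the mean squared feature distance between images that share an area, and the easy-confusing term is a hinge that is active only when images from different areas lie closer than a margin in feature space. The function names, the `area_ids` array (standing in for GPS cluster labels), and the `margin` parameter are all hypothetical.

```python
import numpy as np


def pairwise_sq_dists(feats):
    """Squared Euclidean distance between every pair of feature rows."""
    diff = feats[:, None, :] - feats[None, :, :]
    return np.sum(diff ** 2, axis=-1)


def geo_constraint_penalties(feats, area_ids, margin=1.0):
    """Return (same_area_penalty, easy_confusing_penalty).

    Assumed forms, chosen for illustration only:
    - same-area: mean squared distance over pairs that share an area id,
      so visually distinct images from one area are penalized.
    - easy-confusing: mean of max(0, margin - squared distance) over pairs
      from different areas, so visually similar images from different
      areas are penalized.
    """
    feats = np.asarray(feats, dtype=float)
    area_ids = np.asarray(area_ids)
    d2 = pairwise_sq_dists(feats)

    same = area_ids[:, None] == area_ids[None, :]
    off_diag = ~np.eye(len(feats), dtype=bool)  # drop self-pairs
    same_pairs = same & off_diag
    diff_pairs = ~same

    same_area = d2[same_pairs].mean() if same_pairs.any() else 0.0
    confusing = (
        np.maximum(0.0, margin - d2[diff_pairs]).mean()
        if diff_pairs.any()
        else 0.0
    )
    return float(same_area), float(confusing)


if __name__ == "__main__":
    # Images 0 and 1 share an area but differ visually; image 2 is from
    # another area yet has the same features as image 0.
    feats = [[0.0, 0.0], [3.0, 4.0], [0.0, 0.0]]
    area_ids = [0, 0, 1]
    print(geo_constraint_penalties(feats, area_ids))
```

In a training setup of this kind, both terms would typically be added, with weights, to the auto-encoder's reconstruction loss; the weighting and the exact distance measure are design choices this sketch does not take from the source.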

1 Introduction

The goal of image geotagging is to assign GPS coordinates (i.e., latitude, longitude) to a given image using its visual content. It is a very challenging task even for humans. Consider the 20 example images in Fig. 1: can a human easily identify where they were taken? Some of them are extremely easy, for instance the four landmark images in the fourth row; we may easily identify that the image containing the temple was taken in Beijing. Others, however, are very difficult, for example the non-landmark images in the last row. Why are some images easy to locate while others are hard?
