This paper investigates local invariant features for geographic (overhead) image retrieval. Local features are particularly well suited to the newer generations of aerial and satellite imagery, whose increased spatial resolution, often just tens of centimeters per pixel, allows a greater range of objects and spatial patterns to be recognized than ever before. Local invariant features have been successfully applied to a broad range of computer vision problems and, as such, are receiving increased attention from the remote sensing community, particularly for challenging tasks such as detection and classification. We perform an extensive evaluation of local invariant features for image retrieval of land-use/land-cover (LULC) classes in high-resolution aerial imagery. We report on the effects of a number of design parameters on a bag-of-visual-words (BOVW) representation, including saliency- versus grid-based local feature extraction, the size of the visual codebook, the clustering algorithm used to create the codebook, and the dissimilarity measure used to compare the BOVW representations. We also perform comparisons with standard features such as color and texture. Performance is quantitatively evaluated using a first-of-its-kind LULC ground truth data set, which will be made publicly available to other researchers. In addition to reporting on the effects of the core design parameters, we describe interesting findings such as the performance-efficiency tradeoffs that are possible through appropriate pairings of different-sized codebooks and dissimilarity measures. While the focus is on image retrieval, we expect our insights to be informative for other applications such as detection and classification.
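To make the design parameters named above concrete, the following is a minimal sketch of a generic BOVW pipeline, not the implementation evaluated in this paper. It assumes local descriptors have already been extracted (by either a saliency- or grid-based detector) and are supplied as a NumPy array with one row per descriptor. All function names are hypothetical. Plain k-means stands in for the clustering algorithm, and the L1 distance and histogram intersection stand in for the dissimilarity measure; the paper may evaluate other choices.

```python
import numpy as np

def assign_words(descriptors, centres):
    """Label each descriptor with the index of its nearest codebook centre."""
    # Squared Euclidean distance between every descriptor and every centre.
    d2 = ((descriptors[:, None, :] - centres[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)

def build_codebook(descriptors, k, n_iter=20, seed=0):
    """Cluster training descriptors into k visual words (plain Lloyd's k-means)."""
    rng = np.random.default_rng(seed)
    # Initialise the centres with k distinct training descriptors.
    start = rng.choice(len(descriptors), size=k, replace=False)
    centres = descriptors[start].astype(float)
    for _ in range(n_iter):
        labels = assign_words(descriptors, centres)
        for j in range(k):
            members = descriptors[labels == j]
            if len(members):  # keep the old centre if a cluster empties
                centres[j] = members.mean(axis=0)
    return centres

def bovw_histogram(descriptors, centres):
    """Normalised histogram of visual-word counts for one image.

    Assumes the image has at least one descriptor.
    """
    labels = assign_words(descriptors, centres)
    hist = np.bincount(labels, minlength=len(centres)).astype(float)
    return hist / hist.sum()

def l1_dissimilarity(h1, h2):
    """L1 (city-block) distance between two BOVW histograms."""
    return np.abs(h1 - h2).sum()

def intersection_dissimilarity(h1, h2):
    """One minus the histogram intersection of two normalised histograms."""
    return 1.0 - np.minimum(h1, h2).sum()
```

In a retrieval setting, one would compute `bovw_histogram` for the query and for each database image, then rank the database images by increasing dissimilarity. The codebook size `k` sets the histogram length, so it directly affects both the storage per image and the cost of each comparison.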