FlipCAM: A Feature-Level Flipping Augmentation Method for Weakly Supervised Building Extraction From High-Resolution Remote Sensing Imagery | IEEE Journals & Magazine | IEEE Xplore

Abstract:

Collecting the large number of pixel-level annotations needed to extract buildings accurately with deep neural networks is time-consuming. Supported by class activation maps (CAMs), weakly supervised semantic segmentation (WSSS) methods that use only image-level annotations offer an efficient solution for building extraction. However, generating high-quality CAM heatmaps for buildings from high-resolution remote sensing images remains a great challenge. On the one hand, image-level labels lack spatial information, so extracted buildings are often incomplete or hollow. On the other hand, complex backgrounds in remote sensing images can lead to inaccurate extraction of building boundaries. In this study, we propose a novel weakly supervised building extraction method called FlipCAM to deal with these challenges. The Flip module, based on feature-level flipping augmentation, is designed to improve the integrity of CAM heatmaps by fusing the original and flipped feature maps. In addition, by combining the Flip module with the slice and merge (SAM) module, which is based on a consistency architecture, FlipCAM generates high-quality CAM heatmaps with both boundary fineness and internal integrity in an end-to-end manner. This also alleviates difficulties specific to building extraction, including adhesion between densely packed buildings and confusion with background and shadows, and provides reliable pixel-level pseudo masks for training a segmentation network to extract buildings. Extensive experiments on three high-resolution datasets show that FlipCAM achieves excellent performance and outperforms other weakly supervised methods in terms of effectiveness and robustness. Our code is publicly available at https://github.com/NJU-LHRS/FlipCAM-master.
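
The idea of fusing original and flipped feature maps before computing a CAM can be illustrated with a minimal NumPy sketch. This is not the authors' implementation (see the repository linked above for that). The abstract does not state the flip axis, the fusion rule, or the mask threshold, so the horizontal flip, the element-wise mean, and the 0.5 threshold below are assumptions, and the function names `flip_fuse`, `class_activation_map`, and `pseudo_mask` are hypothetical. The CAM itself follows the standard definition: a weighted sum of feature channels using the classifier weights of the target class.

```python
import numpy as np


def flip_fuse(features: np.ndarray, axis: int = 2) -> np.ndarray:
    """Fuse a (C, H, W) feature map with its flipped copy.

    Assumption: fusion is an element-wise mean and the flip is horizontal
    (axis=2). The paper's actual fusion rule may differ.
    """
    flipped = np.flip(features, axis=axis)
    return 0.5 * (features + flipped)


def class_activation_map(features: np.ndarray, class_weights: np.ndarray) -> np.ndarray:
    """Standard CAM: weighted sum over channels, ReLU, then scaling to [0, 1]."""
    cam = np.tensordot(class_weights, features, axes=([0], [0]))  # shape (H, W)
    cam = np.maximum(cam, 0.0)
    peak = cam.max()
    return cam / peak if peak > 0 else cam


def pseudo_mask(cam: np.ndarray, threshold: float = 0.5) -> np.ndarray:
    """Binarize a CAM heatmap into a building (1) / nonbuilding (0) mask.

    Assumption: a fixed threshold of 0.5 is used purely for illustration.
    """
    return (cam >= threshold).astype(np.uint8)


# Toy data standing in for backbone features and classifier weights.
rng = np.random.default_rng(0)
features = rng.random((8, 16, 16))   # (channels, height, width)
weights = rng.random(8)              # weights of the "building" class

fused = flip_fuse(features)
cam = class_activation_map(fused, weights)
mask = pseudo_mask(cam)
```

With a plain mean, the fused map is symmetric about the flip axis, which is a limitation of this toy fusion rule rather than a property claimed by the paper. The sketch is meant only to show where a feature-level flip sits relative to CAM computation and pseudo-mask generation.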
Article Sequence Number: 4402917
Date of Publication: 30 January 2024


I. Introduction

With the rapid development of high-resolution satellites and remote sensing technology, building extraction from remote sensing imagery has become highly significant for geographic applications such as urban planning [1], [2], population estimation [3], and land cover mapping [4]. As a binary segmentation task, building extraction aims to assign each pixel in a remote sensing image a building or nonbuilding label. Given the increasing demand for building extraction and the growing number of high-resolution remote sensing images, it is crucial to find an efficient way to extract buildings accurately and automatically.
