SCViT: A Spatial-Channel Feature Preserving Vision Transformer for Remote Sensing Image Scene Classification | IEEE Journals & Magazine | IEEE Xplore

Abstract:

Convolutional neural network (CNN)-based methods are widely used in remote sensing image scene classification and achieve excellent performance. However, the stacked receptive fields of CNN-based methods are limited in modeling long-range dependencies among local features. The vision transformer (ViT) model offers a good solution, as it directly models the global interactions among local patches through the self-attention mechanism. However, the vanilla ViT model, which simply splits images into fixed-size patches treated as tokens, mainly considers global information in the spatial domain. In this article, a spatial-channel feature preserving ViT (SCViT) model is proposed, which considers both the detailed geometric information of high-spatial-resolution (HSR) imagery and the contribution of the different channels contained in the classification token. First, tokens are generated by progressively aggregating neighboring overlapping patches to extract the local structural features of the imagery. Second, a multihead self-attention (MSA) mechanism is used to model the global interactions of the tokens in the encoder. A lightweight channel attention (LCA) module is then introduced to weigh the importance of the different channels in the classification token. Finally, a multilayer perceptron (MLP) is used to produce the final classification results. Experimental comparisons with state-of-the-art scene classification methods confirm the potential of ViT models for remote sensing image scene classification.
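The four stages named in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation, and several parts are assumptions made only for illustration: the tokens come from a single overlapping split (stride smaller than patch size) rather than the progressive aggregation the abstract describes, the channel attention is a squeeze-and-excitation style sigmoid gate because the abstract does not give the LCA design, and all function names, dimensions, and random weights are hypothetical.

```python
import numpy as np

def overlapping_patches(img, patch=4, stride=2):
    """Flatten overlapping patches of an (H, W, C) image into token vectors."""
    H, W, _ = img.shape
    rows = range(0, H - patch + 1, stride)
    cols = range(0, W - patch + 1, stride)
    return np.stack([img[i:i + patch, j:j + patch].reshape(-1)
                     for i in rows for j in cols])

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def multihead_self_attention(x, p, heads):
    """Scaled dot-product self-attention over all tokens, split into heads."""
    n, dim = x.shape
    d = dim // heads
    def split(w):  # (n, dim) -> (heads, n, d)
        return (x @ w).reshape(n, heads, d).transpose(1, 0, 2)
    q, k, v = split(p["Wq"]), split(p["Wk"]), split(p["Wv"])
    attn = softmax(q @ k.transpose(0, 2, 1) / np.sqrt(d))  # (heads, n, n)
    merged = (attn @ v).transpose(1, 0, 2).reshape(n, dim)
    return merged @ p["Wo"]

def channel_gate(cls, w1, w2):
    """Assumed SE-style gate: rescale each channel of the class token."""
    hidden = np.maximum(cls @ w1, 0.0)
    gate = 1.0 / (1.0 + np.exp(-(hidden @ w2)))  # values in (0, 1)
    return cls * gate

def init_params(rng, in_dim, dim=16, reduction=4, classes=5):
    """Random weights for illustration only; a real model would be trained."""
    shapes = {
        "embed": (in_dim, dim), "cls": (1, dim),
        "Wq": (dim, dim), "Wk": (dim, dim), "Wv": (dim, dim), "Wo": (dim, dim),
        "W1": (dim, dim // reduction), "W2": (dim // reduction, dim),
        "H1": (dim, dim), "H2": (dim, classes),
    }
    return {name: 0.1 * rng.standard_normal(s) for name, s in shapes.items()}

def forward(img, p, heads=2):
    tokens = overlapping_patches(img) @ p["embed"]   # local tokens, (N, dim)
    x = np.vstack([p["cls"], tokens])                # prepend class token
    x = x + multihead_self_attention(x, p, heads)    # one encoder-like layer
    cls = channel_gate(x[0], p["W1"], p["W2"])       # reweight channels
    return np.maximum(cls @ p["H1"], 0.0) @ p["H2"]  # two-layer MLP head

rng = np.random.default_rng(0)
img = rng.random((8, 8, 3))
params = init_params(rng, in_dim=4 * 4 * 3)
logits = forward(img, params)
print(logits.shape)  # (5,)
```

With an 8 x 8 image, a 4 x 4 patch, and a stride of 2, adjacent patches share half their pixels and the split yields 9 tokens. The gate acts only on the classification token, which matches the abstract's description of where channel importance is considered.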
Article Sequence Number: 4409512
Date of Publication: 08 March 2022
