From Plane to Hierarchy: Deformable Transformer for Remote Sensing Image Captioning

Abstract:

With the growing volume of remote sensing images, automatically understanding image content has attracted considerable interest from researchers applying deep learning to remote sensing. Inspired by natural image captioning, models with a convolutional neural network (CNN)-recurrent neural network (RNN) backbone, supplemented by attention, have been widely used in remote sensing image captioning. However, current attention layers are inefficient at simultaneously mining the hidden foreground from the background of a remote sensing image and performing interactive feature learning. Meanwhile, newer mainstream language models have recently surpassed the traditional long short-term memory (LSTM) network in sentence generation. To address these problems, in this article we propose a novel idea: making flat remote sensing images stereoscopic by separating the foreground from the background. Based on this hierarchical image information, we design a novel Deformable Transformer equipped with deformable scaled dot-product attention to learn multiscale features from the foreground and background through its powerful interactive learning ability. Evaluations are conducted on four classic remote sensing image captioning datasets. Compared with state-of-the-art methods, our Transformer variant achieves higher captioning accuracy.
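
For readers unfamiliar with the attention mechanism named in the abstract, the following is a minimal sketch of the standard scaled dot-product attention, softmax(QK^T / sqrt(d_k)) V, written in plain Python. It is not the deformable variant proposed in the article, whose details are not given in the abstract; the function names `softmax` and `scaled_dot_product_attention` and the list-of-lists layout are assumptions made purely for illustration.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def scaled_dot_product_attention(queries, keys, values):
    """Standard (non-deformable) scaled dot-product attention.

    queries: list of n_q vectors of length d_k
    keys:    list of n_k vectors of length d_k
    values:  list of n_k vectors of length d_v
    Returns (outputs, weights): outputs has n_q vectors of length d_v,
    weights has n_q rows of n_k attention weights that each sum to 1.
    """
    d_k = len(keys[0])
    scale = 1.0 / math.sqrt(d_k)
    d_v = len(values[0])
    outputs, weights = [], []
    for q in queries:
        # Scaled similarity between this query and every key.
        scores = [scale * sum(qi * ki for qi, ki in zip(q, k)) for k in keys]
        w = softmax(scores)
        # Weighted sum of the value vectors.
        out = [sum(wj * v[c] for wj, v in zip(w, values)) for c in range(d_v)]
        weights.append(w)
        outputs.append(out)
    return outputs, weights
```

In an image captioning setting, the keys and values would typically be image region features and the queries would come from the partially generated sentence; how the article adapts this to separate foreground and background features is beyond what the abstract states.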

Published in: IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing (Volume: 16)

Page(s): 7704 - 7717

Date of Publication: 16 August 2023

DOI: 10.1109/JSTARS.2023.3305889
