Learning Self-Supervised Vision Transformers from Scratch for Aerial Person Re-Identification | IEEE Conference Publication | IEEE Xplore

Abstract:

In recent years, person re-identification (Re-ID), a widely studied computer vision task, has reached saturation under the closed-world setting, which encourages researchers to explore more realistic scenarios. Among them, person Re-ID in aerial imagery has been proposed and improved because of its practical importance in public security. However, because aerial person images are taken by unmanned aerial vehicles (UAVs) and are influenced by camera height and viewing angle, they suffer from more serious problems, such as weak appearance features and occlusion, than ground-level person images. Most current state-of-the-art person Re-ID methods on closed-world datasets are based on local convolutional neural networks and hardly work well when applied directly to aerial person Re-ID tasks. In this paper, we improve the emerging vision transformer (ViT) and apply it to person Re-ID in aerial imagery. It should be noted that ViTs require pre-training on a large amount of data to achieve competitive performance. Considering the limitations of data, computing power, and flexibility in practical scenarios, we improve the pre-training process based on self-supervised learning and train ViTs from scratch with limited data. Specifically, in the pre-training stage, a self-supervised paradigm based on parameter instance discrimination is applied to capture feature alignment and instance similarity, which alleviates the data hunger of ViTs caused by their lack of inductive bias. Extensive comparative experiments are conducted on the aerial Re-ID dataset. Our method achieves a Rank-1 accuracy of 65.29% and a mean average precision (mAP) of 57.31%, which demonstrates its effectiveness in aerial person Re-ID tasks.
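
For readers unfamiliar with the two reported metrics, the following is a minimal sketch of how Rank-1 accuracy and mAP are commonly computed from a query-to-gallery distance matrix. It is not the authors' evaluation code. The function name `rank1_and_map` and its arguments are hypothetical, and the sketch omits protocol details that Re-ID benchmarks often add, such as excluding gallery images taken by the same camera as the query.

```python
def rank1_and_map(dist, query_ids, gallery_ids):
    """Return (Rank-1 accuracy, mAP) for a retrieval run.

    dist[i][j] is the distance between query i and gallery item j
    (smaller means more similar). All names here are illustrative.
    """
    rank1_hits = 0
    average_precisions = []
    for i, qid in enumerate(query_ids):
        # Rank gallery items from most to least similar to this query.
        order = sorted(range(len(gallery_ids)), key=lambda j: dist[i][j])
        matches = [gallery_ids[j] == qid for j in order]
        if not any(matches):
            continue  # assumption: queries with no true match are skipped
        # Rank-1: is the single closest gallery item a correct identity?
        rank1_hits += int(matches[0])
        # Average precision: mean of precision@k over the ranks k of true matches.
        hits, precisions = 0, []
        for rank, is_match in enumerate(matches, start=1):
            if is_match:
                hits += 1
                precisions.append(hits / rank)
        average_precisions.append(sum(precisions) / len(precisions))
    n = len(average_precisions)
    if n == 0:
        raise ValueError("no query has a matching gallery identity")
    return rank1_hits / n, sum(average_precisions) / n
```

Rank-1 only rewards a correct top result, whereas mAP also penalizes true matches that appear far down the ranking, which is why the two figures in the abstract differ.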
Date of Conference: 24-26 November 2023
Date Added to IEEE Xplore: 25 April 2024
Conference Location: Wuyishan, China

