
Rotated and Masked Image Modeling: A Superior Self-Supervised Method for Classification


Abstract:

Masked image modeling (MIM) has performed excellently as a transformer-based self-supervised method via random masking and reconstruction. However, because the unmasked image patches do not participate in the loss computation, MIM cannot effectively utilize the data and wastes much computation. This drawback usually limits the learning ability of the pre-trained model when pre-training on small-scale datasets. To solve this problem, we propose a novel self-supervised learning method for small-scale datasets called RotMIM. Unlike MIM, RotMIM has a different pretext task: recognizing the rotation angle applied to the unmasked patches. RotMIM can thus fully utilize the data and provides a stronger self-supervised signal. Moreover, to fit RotMIM, we propose a data augmentation method called FeaMix. It ensures that the mixed regions are consistent with RotMIM's assumption that each basic unit of semantic information in an image has the same size. This consistency guarantees clean tokenization during fine-tuning after pre-training. Our proposals outperform state-of-the-art self-supervised methods on three popular datasets: Mini-ImageNet, Caltech-256, and CIFAR-100.
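As a concrete illustration of the pretext task described in the abstract, the sketch below builds a RotMIM-style corrupted input: a random subset of patches is masked out, the remaining unmasked patches are rotated, and the rotation index serves as the self-supervised label. All details here (the function name `rotmim_inputs`, a single shared rotation drawn from multiples of 90 degrees, zero-filled mask tokens) are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def rotmim_inputs(image, patch_size=4, mask_ratio=0.6, seed=0):
    """Hypothetical RotMIM-style input construction.

    Masks a random subset of patches (as in MIM) and rotates the
    remaining, unmasked patches by a shared random multiple of 90
    degrees. Returns the corrupted image and the rotation label
    k in {0, 1, 2, 3} (i.e., a rotation of k * 90 degrees).
    """
    rng = np.random.default_rng(seed)
    h, w = image.shape[:2]
    ph, pw = h // patch_size, w // patch_size
    n = ph * pw

    # Choose which patches to mask (zeroed out, like mask tokens).
    mask = np.zeros(n, dtype=bool)
    mask[rng.choice(n, size=int(mask_ratio * n), replace=False)] = True

    # One shared rotation label for all unmasked patches.
    k = int(rng.integers(0, 4))

    out = image.copy()
    for idx in range(n):
        r, c = divmod(idx, pw)
        ys, xs = r * patch_size, c * patch_size
        patch = out[ys:ys + patch_size, xs:xs + patch_size]
        if mask[idx]:
            out[ys:ys + patch_size, xs:xs + patch_size] = 0
        else:
            out[ys:ys + patch_size, xs:xs + patch_size] = np.rot90(patch, k)
    return out, k
```

A pre-training step would then feed `out` to the encoder and train a classification head to predict `k`, so every unmasked patch contributes to the loss rather than being discarded.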
Published in: IEEE Signal Processing Letters ( Volume: 30)
Page(s): 1477 - 1481
Date of Publication: 12 October 2023

