MViTv2: Improved Multiscale Vision Transformers for Classification and Detection

Abstract:

In this paper, we study Multiscale Vision Transformers (MViTv2) as a unified architecture for image and video classification, as well as object detection. We present an improved version of MViT that incorporates decomposed relative positional embeddings and residual pooling connections. We instantiate this architecture in five sizes and evaluate it for ImageNet classification, COCO detection and Kinetics video recognition where it outperforms prior work. We further compare MViTv2s' pooling attention to window attention mechanisms where it outperforms the latter in accuracy/compute. Without bells-and-whistles, MViTv2 has state-of-the-art performance in 3 domains: 88.8% accuracy on ImageNet classification, 58.7 AP^box on COCO object detection as well as 86.1% on Kinetics-400 video classification. Code and models are available at https://github.com/facebookresearch/mvit.

Published in: 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Date of Conference: 18-24 June 2022

Date Added to IEEE Xplore: 27 September 2022

DOI: 10.1109/CVPR52688.2022.00476

Conference Location: New Orleans, LA, USA
