A free lunch from ViT: adaptive attention multi-scale fusion Transformer for fine-grained visual recognition | IEEE Conference Publication | IEEE Xplore