Masked Autoencoders for Spatial–Temporal Relationship in Video-Based Group Activity Recognition | IEEE Journals & Magazine | IEEE Xplore