SMILE: Infusing Spatial and Motion Semantics in Masked Video Learning | IEEE Conference Publication | IEEE Xplore