STIT: Spatio-Temporal Interaction Transformers for Human-Object Interaction Recognition in Videos (IEEE Conference Publication)