Two-Stage Spatio-Temporal Vision Transformer for the Detection of Violent Scenes | IEEE Conference Publication | IEEE Xplore