FCC-MF: Detecting Violence in Audio-Visual Context with Frame-Wise Cluster Contrast and Modality-Stage Flooding | IEEE Conference Publication | IEEE Xplore