LAVViT: Latent Audio-Visual Vision Transformers for Speaker Verification | IEEE Conference Publication | IEEE Xplore