Multimodal Framework for Deep Fake Detection and Content Moderation Using CNN, ViT, and Audio-Visual Analysis | IEEE Conference Publication | IEEE Xplore