Fooling the Forgers: A Multi-Stage Framework for Audio Deepfake Detection | IEEE Conference Publication | IEEE Xplore

Abstract:

Audio deepfakes pose a risk to society because they can erode trust in any audio recording. In this paper, we present a novel approach to audio deepfake detection that uses Generative Adversarial Networks (GANs) and contrastive learning in a multi-stage detection framework. In our process, we apply pre-trained models (PTMs) to extract phonetic content, speaker identity, and other spatial prosodic features that are crucial for the model. We enhance the model’s performance with a GAN-based data augmentation strategy in combination with HiFi-GAN. Contrastive learning is then used to improve the model’s ability to discriminate real speech from fake speech. Our experiments demonstrate that this method is superior to existing methods in detection performance and robustness.
Date of Conference: 06-11 April 2025
Date Added to IEEE Xplore: 07 March 2025
Conference Location: Hyderabad, India
