Abstract:
Partially deepfake audio localization is important in audio forensics. However, existing localization models face two major challenges: distribution shifts between training and testing data, and insufficient use of the information in both manipulated regions and their boundaries. To address these challenges, we propose Adversarial training and Gradient Optimization (AGO) to improve partially deepfake audio localization. Specifically, we apply a gradient reversal layer to reduce the model’s dependence on domain-specific features, which enhances its generalization ability. Additionally, we introduce an alternating update strategy that learns from both manipulated regions and boundaries, with orthogonal gradient updates to minimize conflicts between the two tasks. We evaluated AGO on the ADD2023 Track 2 and PartialSpoof datasets. AGO achieved a 22.82% relative improvement over the first-ranked method in ADD2023 Track 2 and state-of-the-art results on the PartialSpoof dataset. Our code is available at https://github.com/Little-dingding/ATGO.
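The two gradient operations named in the abstract can be illustrated with a minimal, dependency-free sketch. This is not the authors' implementation (see the linked repository for that); the function names `grl_backward` and `orthogonalize`, the scaling factor `lam`, and the choice to always project out the component along the other task's gradient are assumptions made here for illustration only.

```python
from typing import List

Vector = List[float]


def dot(a: Vector, b: Vector) -> float:
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def grl_backward(upstream_grad: Vector, lam: float = 1.0) -> Vector:
    """Backward pass of a gradient reversal layer (hypothetical helper).

    The forward pass of such a layer is the identity; on the backward
    pass the gradient is negated and scaled by lam, so the feature
    extractor is pushed away from features that help a domain classifier.
    """
    return [-lam * g for g in upstream_grad]


def orthogonalize(grad: Vector, reference: Vector, eps: float = 1e-12) -> Vector:
    """Remove from grad its component along reference (assumed projection).

    The result is orthogonal to reference, so a step along it does not
    change the reference task's loss to first order.
    """
    denom = dot(reference, reference)
    if denom < eps:
        # Reference gradient is (near) zero: nothing to project out.
        return list(grad)
    scale = dot(grad, reference) / denom
    return [g - scale * r for g, r in zip(grad, reference)]


# Toy gradients for the two tasks (made-up numbers, not from the paper).
g_region = [1.0, 0.0]
g_boundary = [-1.0, 1.0]

# Boundary-task update with the conflicting component removed.
g_boundary_orth = orthogonalize(g_boundary, g_region)

# Reversed gradient flowing back from a domain classifier.
g_reversed = grl_backward([0.5, -2.0], lam=0.5)
```

In an alternating scheme, one step would update on the region gradient and the next on the orthogonalized boundary gradient; how the paper schedules these steps is not stated in the abstract.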
Published in: ICASSP 2025 - 2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Date of Conference: 06-11 April 2025
Date Added to IEEE Xplore: 07 March 2025