PET: High-Frequency Temporal Self-Consistency Learning for Partially Deepfake Audio Localization | IEEE Conference Publication | IEEE Xplore

Abstract:

Partially deepfake audio attacks have recently attracted attention, and with them a demand for locating the manipulated regions of partially deepfake audio. However, existing methods are usually based on frame-level authenticity detection or splicing-boundary detection and neglect the temporal self-consistency of audio. In this paper, we propose a novel method for partially deepfake audio localization based on temporal self-consistency learning via high-frequency components, named PET. The results demonstrate that, on the ADD 2023 Track 2 evaluation set, PET achieves a segment F1-score of 0.7397 without any data augmentation strategies, which is 21.94% higher than that of the system ranked 1st on the leaderboard. The results also confirm the effectiveness and good generalization ability of PET.
Date of Conference: 06-11 April 2025
Date Added to IEEE Xplore: 07 March 2025
Conference Location: Hyderabad, India