PersonaTalk: Preserving Personalized Dynamic Speech Style in Talking Face Generation


Abstract:

Recent visual speaker authentication methods have claimed effectiveness against deepfake attacks. However, this success is largely attributable to the inability of existing talking face generation methods to preserve the speaker's dynamic speech style, which serves as the key cue for authentication methods during verification. To address this, we propose PersonaTalk, a speaker-specific method that utilizes the speaker's video data to improve the fidelity of the speaker's dynamic speech style in generated videos. Our approach introduces a visual context block that integrates lip motion information into the audio features. Additionally, to enhance lip-reading intelligibility in dubbed videos, a cross-dubbing phase is incorporated during training. Experiments on the GRID dataset demonstrate the superiority of PersonaTalk over existing state-of-the-art (SOTA) methods. These findings underscore the need for stronger defense measures in existing lip-based speaker authentication methods.
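The abstract does not spell out how the visual context block fuses the two modalities. A minimal sketch, assuming a standard cross-attention design in PyTorch in which audio features query lip-motion features, might look as follows; the class name VisualContextBlock, the feature dimension, and the residual layout are illustrative assumptions, not the paper's actual implementation.

```python
import torch
import torch.nn as nn

class VisualContextBlock(nn.Module):
    """Hypothetical visual context block: audio features attend to
    lip-motion features via multi-head cross-attention, so the audio
    stream picks up speaker-specific lip dynamics."""
    def __init__(self, dim: int = 256, n_heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, n_heads, batch_first=True)
        self.norm1 = nn.LayerNorm(dim)
        self.norm2 = nn.LayerNorm(dim)
        self.ffn = nn.Sequential(
            nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim)
        )

    def forward(self, audio_feats: torch.Tensor,
                lip_feats: torch.Tensor) -> torch.Tensor:
        # audio_feats: (B, T_audio, dim); lip_feats: (B, T_video, dim)
        ctx, _ = self.attn(query=self.norm1(audio_feats),
                           key=lip_feats, value=lip_feats)
        x = audio_feats + ctx            # residual: inject lip-motion context
        x = x + self.ffn(self.norm2(x))  # position-wise refinement
        return x

# Toy usage: 80 audio frames attending over 25 video frames per clip
block = VisualContextBlock()
audio = torch.randn(2, 80, 256)
lips = torch.randn(2, 25, 256)
out = block(audio, lips)  # shape: (2, 80, 256)
```

The residual connection keeps the original audio features intact while adding lip-motion context on top, one plausible way to bias generated lip shapes toward the speaker's own articulation style without discarding the phonetic content of the audio.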
Date of Conference: 27-30 October 2024
Date Added to IEEE Xplore: 27 September 2024
Conference Location: Abu Dhabi, United Arab Emirates

