
Deepfake Detection and Localization Using Multi-View Inconsistency Measurement



Abstract:

As deepfake technology advances, forgery detection techniques have evolved beyond simple classification to include fine-grained localization. However, existing deepfake localization methods struggle with real-world deepfake videos, which often contain multiple faces of which only some are manipulated. To address these problems, we propose a Multi-View Inconsistency Measurement (MVIM) network that simultaneously measures inconsistencies from the noise and temporal views to detect and locate tampered regions. Specifically, since fake faces in multi-face scenarios exhibit noise patterns inconsistent with those of real faces and backgrounds, we design a Noise Inconsistency Measurement (Noise-IM) module that measures noise similarity among faces and between faces and backgrounds using a masked attention mechanism to identify suspected tampered regions in the noise domain. Since the facial jitter of tampered regions in deepfake videos is observed to be more intense than that of real regions, we design a Temporal Inconsistency Measurement (Temporal-IM) module that adopts a self-attention mechanism and fine-grained bi-directional convolutions to capture tampering traces between frames in the temporal domain. The inconsistency features obtained by the two modules are fused to detect and locate tampered regions. The superiority of our MVIM network is verified by extensive experiments against many state-of-the-art methods on different benchmark datasets.
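To make the two-view design concrete, the following is a minimal PyTorch sketch of the pipeline the abstract describes. It is an illustration under our own assumptions, not the authors' released code: the token shapes, feature dimension, noise-feature source, and head layouts are all hypothetical, and the forward/backward Conv1d pair merely stands in for the paper's fine-grained bi-directional convolutions. The attention mask in the noise branch is what restricts comparison to face-face and face-background token pairs.

```python
# Hypothetical sketch of the MVIM two-branch design; all names, shapes,
# and hyper-parameters are illustrative assumptions.
import torch
import torch.nn as nn


class NoiseIM(nn.Module):
    """Masked attention over noise-domain tokens: each face token attends to
    other faces and to background tokens, so a swapped face whose noise
    pattern disagrees with its neighbours yields a distinctive feature."""

    def __init__(self, dim=256, heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, noise_tokens, region_mask=None):
        # noise_tokens: (B, N, C) features from an assumed noise extractor;
        # region_mask: (N, N) bool, True = attention between that pair blocked.
        out, _ = self.attn(noise_tokens, noise_tokens, noise_tokens,
                           attn_mask=region_mask)
        return self.norm(noise_tokens + out)


class TemporalIM(nn.Module):
    """Self-attention across frames plus forward/backward 1-D convolutions,
    standing in for the paper's fine-grained bi-directional convolutions."""

    def __init__(self, dim=256, heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.conv_f = nn.Conv1d(dim, dim, kernel_size=3, padding=1)
        self.conv_b = nn.Conv1d(dim, dim, kernel_size=3, padding=1)
        self.norm = nn.LayerNorm(dim)

    def forward(self, frame_tokens):
        # frame_tokens: (B, T, C) per-face features over T frames.
        x, _ = self.attn(frame_tokens, frame_tokens, frame_tokens)
        x = self.norm(frame_tokens + x).transpose(1, 2)  # (B, C, T)
        fwd = self.conv_f(x)                             # forward pass in time
        bwd = self.conv_b(x.flip(-1)).flip(-1)           # backward pass in time
        return (fwd + bwd).transpose(1, 2)               # (B, T, C)


class MVIM(nn.Module):
    """Fuses the two inconsistency views for detection and localization."""

    def __init__(self, dim=256):
        super().__init__()
        self.noise_im = NoiseIM(dim)
        self.temporal_im = TemporalIM(dim)
        self.cls_head = nn.Linear(2 * dim, 2)  # video-level real/fake logits
        self.loc_head = nn.Linear(2 * dim, 1)  # per-region tampering score

    def forward(self, noise_tokens, frame_tokens, region_mask=None):
        n = self.noise_im(noise_tokens, region_mask)       # (B, N, C)
        t = self.temporal_im(frame_tokens)                 # (B, T, C)
        fused = torch.cat([n.mean(1), t.mean(1)], dim=-1)  # (B, 2C)
        per_region = torch.cat(
            [n, t.mean(1, keepdim=True).expand(-1, n.size(1), -1)], dim=-1)
        return self.cls_head(fused), self.loc_head(per_region).squeeze(-1)


model = MVIM()
noise = torch.randn(2, 16, 256)   # 16 face/background noise tokens per clip
frames = torch.randn(2, 8, 256)   # 8-frame per-face temporal features
logits, tamper_scores = model(noise, frames)
print(logits.shape, tamper_scores.shape)  # (2, 2) and (2, 16)
```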
Published in: IEEE Transactions on Dependable and Secure Computing ( Volume: 22, Issue: 2, March-April 2025)
Page(s): 1796 - 1809
Date of Publication: 01 October 2024



I. Introduction

The continuous development of deep learning techniques and the widespread dissemination of multimedia have brought about a situation where seeing is no longer believing. With advancements in generative models, represented by Variational Autoencoders (VAEs) [1], Generative Adversarial Networks (GANs) [2], [3], and Diffusion Models (DMs) [4], [5], it has become relatively straightforward to swap one person's face for another's while preserving the original facial expression and head pose. However, these forgery techniques [6], [7], [8] are likely to be misused for malicious purposes, resulting in serious security and ethical issues (e.g., the promotion of celebrity pornography and political persecution). Therefore, to mitigate the negative impact on public safety and personal privacy, it is crucial to develop effective solutions to counteract these face forgery attacks.

