MesoNet-ViT: Meso Network and Vision Transformer for Deepfake Detection


Abstract:

Social media tools built on deep learning are mostly used for entertainment, but they can also be misused to create forgeries. New detection methods are still needed to counter such forgery. This study proposes MesoNet-ViT, a method for classifying images as fake or real that combines the Meso network and the Vision Transformer (ViT), both of which are used in deepfake detection. In this hybrid architecture, the face region is extracted from the input file with the BlazeFace method and passed to the classification module. Feature maps extracted from the face image with the Meso method are fed to the ViT model, which determines whether the video is fake or real. In our experiments, the method is competitive with other methods, achieving an ACC value of 96.2 on the DeepFake Detection Challenge (DFDC) dataset and an ACC value of 98.1 on the DF-TIMIT dataset.
Date of Conference: 19-22 February 2025
Date Added to IEEE Xplore: 21 March 2025
Conference Location: Zabljak, Montenegro

I. Introduction

Deep learning technologies such as AutoEncoders and GANs have advanced considerably in recent years. These advances have played an important role in the creation and spread of deepfake technologies. Various open-source tools used to produce deepfake videos and images, such as FaceSwap [1], DeepFaceLab [2], Reface [3], Reflect, and FakeApp, are easy for users to operate [4]. Although such tools are generally used to produce entertaining content, they can also be abused by malicious actors [5]. This poses a serious threat on social media platforms and can contribute to the spread of false information. When users apply these tools without awareness of the consequences, the risks to society increase and trust in reliable sources of information is undermined. The ethical use of technologies such as deepfakes therefore calls for more discussion and regulation. Detection methods have been proposed to keep deepfakes under control. These methods can focus on pixel features, biological features, artifacts [6], or inconsistencies in images. In addition, deep learning technologies can be used for forgery classification.
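
The abstract states that Meso feature maps of the face image are fed to the ViT model, but this preview does not say how the hand-off is done. The following is a minimal sketch of one plausible hand-off, not the authors' implementation. It assumes a Meso-4-style stack (four 'same'-padded convolution blocks with 8, 8, 16, and 16 filters, followed by max pooling with factors 2, 2, 2, and 4) and assumes that each spatial location of the final feature map becomes one token. The function names meso_output_shape and feature_map_to_tokens are hypothetical.

```python
from typing import List, Tuple

# A feature map indexed as [channel][row][col].
FeatureMap = List[List[List[float]]]

# ASSUMPTION: Meso-4-style blocks as (output filters, pooling factor).
# The paper's actual configuration may differ.
ASSUMED_MESO_BLOCKS = [(8, 2), (8, 2), (16, 2), (16, 4)]


def meso_output_shape(height: int, width: int) -> Tuple[int, int, int]:
    """Trace the (channels, height, width) shape through the assumed stack.

    'Same'-padded convolutions keep the spatial size, so only the
    pooling factors shrink height and width.
    """
    channels = 3  # RGB face crop
    for out_channels, pool in ASSUMED_MESO_BLOCKS:
        channels = out_channels
        height //= pool
        width //= pool
    return channels, height, width


def feature_map_to_tokens(fmap: FeatureMap) -> List[List[float]]:
    """Flatten a [C][H][W] feature map into H*W tokens of dimension C.

    ASSUMPTION: one token per spatial location, in row-major order.
    """
    channels = len(fmap)
    height, width = len(fmap[0]), len(fmap[0][0])
    return [
        [fmap[c][y][x] for c in range(channels)]
        for y in range(height)
        for x in range(width)
    ]


if __name__ == "__main__":
    c, h, w = meso_output_shape(256, 256)
    dummy = [[[0.0] * w for _ in range(h)] for _ in range(c)]
    tokens = feature_map_to_tokens(dummy)
    print(c, h, w, len(tokens), len(tokens[0]))
```

Under these assumptions, a 256 x 256 face crop yields a 16 x 8 x 8 feature map, which corresponds to 64 tokens of dimension 16. In a full model, these tokens would typically be linearly projected and given position embeddings before entering the transformer encoder.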
