
Self-Supervised Distilled Learning for Multi-modal Misinformation Identification


Abstract:

Rapid dissemination of misinformation is a major societal problem that is receiving increasing attention. Unlike deep-fakes, Out-of-Context misinformation, in which the unaltered uni-modal contents (e.g., image, text) of a multi-modal news sample are combined in an out-of-context manner to create deception, requires limited technical expertise to produce. It is therefore a more prevalent means of confusing readers. Most existing approaches extract features from the uni-modal components, concatenate them, and train a model for the misinformation classification task. In this paper, we design a self-supervised feature representation learning strategy that pursues two objectives: (1) a task-agnostic objective, which evaluates intra- and inter-modal representational consistencies for improved alignment across related modalities; and (2) a task-specific objective, which estimates category-specific multi-modal knowledge so that the classifier can derive more discriminative predictive distributions. To compensate for the dearth of annotated data covering the varied types of misinformation, the proposed Self-Supervised Distilled Learner (SSDL) uses a Teacher network to weakly guide a Student network toward a similar decision pattern. The two-phase learning of SSDL can be summarized as follows: first, the Student model is pretrained with a contrastive self-supervised task-agnostic objective combined in parallel with a supervised task-specific adjustment; second, the Student model is finetuned via self-supervised knowledge distillation blended with a supervised decision-alignment objective. In addition to consistent improvements over existing baselines that demonstrate the feasibility of our approach, the explainability capacity of SSDL helps users visualize the reasoning behind a specific prediction made by the model.
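This page does not include implementation details, but the finetuning objective described in the abstract — self-supervised knowledge distillation blended with a supervised decision-alignment term — is commonly realized as a weighted sum of a softened teacher-matching loss and a hard-label cross-entropy. The sketch below is a minimal NumPy illustration of that standard formulation; the function names, temperature `T`, and mixing weight `alpha` are illustrative assumptions, not the authors' actual implementation:

```python
import numpy as np

def softmax(logits, T=1.0):
    """Temperature-scaled softmax; higher T yields a softer distribution."""
    z = np.asarray(logits, dtype=float) / T
    z = z - z.max(axis=-1, keepdims=True)  # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    """Blend of a soft teacher-mimicking term and a hard supervised term.

    soft: cross-entropy between the softened teacher and student
          distributions (scaled by T^2, as is conventional for KD).
    hard: standard cross-entropy against the ground-truth labels.
    """
    p_teacher = softmax(teacher_logits, T)
    log_p_student = np.log(softmax(student_logits, T) + 1e-12)
    soft = -(p_teacher * log_p_student).sum(axis=-1).mean() * (T ** 2)

    log_p = np.log(softmax(student_logits) + 1e-12)
    hard = -log_p[np.arange(len(labels)), labels].mean()

    return alpha * soft + (1 - alpha) * hard
```

A Student whose logits agree with both the Teacher and the labels incurs a lower loss than one that contradicts them, which is the "decision alignment" effect the abstract refers to.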
Date of Conference: 02-07 January 2023
Date Added to IEEE Xplore: 06 February 2023
Conference Location: Waikoloa, HI, USA


1. Introduction

The spread of misinformation, whether in the form of a full-fledged news article or just a short tweet, has raised significant concern in domains such as politics, finance, and society [1], [2]. According to Weibo's 2020 annual report [42], 76,107 news items shared on the Weibo social media platform were identified as false by the authorities over the course of the year. As an emerging field of research, evaluating misinformation has attracted the attention of researchers across multiple disciplines (social science, communication, journalism, computer science). To maximize the impact on their audience, creators of such misleading news articles frequently use multi-modal information, e.g., text and images, to describe topics. One specific type of malicious multi-modal manipulation, deep fakes [27], [39], [6], [12], has received significant attention from researchers, who attempt to develop automated methods for detecting such distortions. Nevertheless, a phenomenon that has become common in recent years, popularly known as Out-of-Context images [15], [36], is a far more prevalent means of spreading misinformation: it leverages an existing, unaltered image as is, but conveys an irrelevant and misleading claim through newly coupled text.

