Visual Language Model based Cross-modal Semantic Communication Systems


Abstract:

Semantic Communication (SC) has emerged as a novel communication paradigm in recent years. Nevertheless, extant Image Semantic Communication (ISC) systems face several challenges in dynamic environments, including low information density, catastrophic forgetting, and uncertain Signal-to-Noise Ratio (SNR). To address these challenges, we propose a novel Vision-Language Model-based Cross-modal Semantic Communication (VLM-CSC) system. The VLM-CSC comprises three novel components: (1) A Cross-modal Knowledge Base (CKB) extracts high-density textual semantics from the semantically sparse image at the transmitter and reconstructs the original image from those textual semantics at the receiver. Transmitting high-density semantics helps alleviate bandwidth pressure. (2) A Memory-assisted Encoder and Decoder (MED) employs a hybrid long/short-term memory mechanism, enabling the semantic encoder and decoder to overcome catastrophic forgetting in dynamic environments when the distribution of semantic features drifts. (3) A Noise Attention Module (NAM) employs attention mechanisms to adaptively adjust the semantic coding and the channel coding based on SNR, ensuring the robustness of the CSC system. Experimental simulations validate the effectiveness, adaptability, and robustness of the CSC system.
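
The abstract does not say how the NAM is built, so the following is only an illustrative sketch of the general idea of SNR-conditioned attention, not the authors' implementation. It assumes a squeeze-and-excitation style gate: the per-channel means of a feature block are concatenated with the SNR in dB, passed through a small two-layer network, and turned into per-channel weights in (0, 1) that rescale the features before they cross an additive white Gaussian noise channel. The names `awgn`, `snr_gate`, `w1`, and `w2`, the layer sizes, and the random (untrained) weights are all hypothetical.

```python
import numpy as np


def awgn(signal, snr_db, rng):
    """Add white Gaussian noise so the output has the requested SNR in dB."""
    signal_power = np.mean(signal ** 2)
    noise_power = signal_power / (10 ** (snr_db / 10))
    noise = rng.normal(0.0, np.sqrt(noise_power), size=signal.shape)
    return signal + noise


def snr_gate(features, snr_db, w1, w2):
    """Rescale each feature channel by a weight in (0, 1) that depends on SNR.

    features: array of shape (num_symbols, num_channels)
    w1: array of shape (num_channels + 1, hidden)
    w2: array of shape (hidden, num_channels)
    """
    # Squeeze: one summary value per channel, plus the SNR as extra context.
    context = np.append(features.mean(axis=0), snr_db)
    hidden = np.maximum(0.0, context @ w1)            # ReLU
    weights = 1.0 / (1.0 + np.exp(-(hidden @ w2)))    # sigmoid
    # Excite: broadcast the per-channel weights over all symbols.
    return features * weights, weights


rng = np.random.default_rng(0)
features = rng.normal(size=(16, 8))          # 16 symbols, 8 channels
w1 = rng.normal(scale=0.1, size=(9, 4))      # untrained placeholder weights
w2 = rng.normal(scale=0.1, size=(4, 8))

gated, weights = snr_gate(features, snr_db=5.0, w1=w1, w2=w2)
received = awgn(gated, snr_db=5.0, rng=rng)
```

In a trained system the gate's weights would be learned jointly with the encoder and decoder, so that different SNR values emphasize different channels; here they are random and serve only to show the data flow.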
Published in: IEEE Transactions on Wireless Communications ( Early Access )
Page(s): 1 - 1
Date of Publication: 04 March 2025
