Channel Exchanging Networks for Multimodal and Multitask Dense Image Prediction

Abstract:

Multimodal fusion and multitask learning are two vital topics in machine learning. Despite fruitful progress, existing methods for both problems remain brittle to the same challenge: it is difficult to integrate the common information across modalities (resp. tasks) while preserving the specific patterns of each modality (resp. task). Moreover, although the two problems are closely related, multimodal fusion and multitask learning have rarely been explored within the same methodological framework. In this paper, we propose the Channel-Exchanging-Network (CEN), which is self-adaptive, parameter-free, and, more importantly, applicable to multimodal and multitask dense image prediction. At its core, CEN adaptively exchanges channels between subnetworks of different modalities. Specifically, the channel exchanging process is self-guided by individual channel importance, which is measured by the magnitude of the Batch-Normalization (BN) scaling factor during training. For dense image prediction, the validity of CEN is tested in four scenarios: multimodal fusion, cycle multimodal fusion, multitask learning, and multimodal multitask learning. Extensive experiments on semantic segmentation via RGB-D data and image translation through multi-domain input verify the effectiveness of CEN compared to state-of-the-art methods. Detailed ablation studies have also been carried out, which demonstrate the advantage of each component we propose. Our code is available at https://github.com/yikaiw/CEN.
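
The exchange step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation (see the repository linked above for that). It assumes two modalities with the same number of channels, and it assumes that a channel whose BN scaling factor has a magnitude below a fixed threshold is replaced by the corresponding channel of the other modality. The function name `exchange_channels`, the threshold value, and the toy data are all hypothetical.

```python
from typing import List, Tuple

Channel = List[float]  # one channel's activations, flattened over spatial positions


def exchange_channels(
    feats_a: List[Channel],
    feats_b: List[Channel],
    gamma_a: List[float],
    gamma_b: List[float],
    threshold: float = 0.02,  # assumed cut-off, chosen only for illustration
) -> Tuple[List[Channel], List[Channel]]:
    """Replace low-importance channels with the other modality's channel.

    Importance is taken to be the magnitude of the BN scaling factor.
    """
    if not (len(feats_a) == len(feats_b) == len(gamma_a) == len(gamma_b)):
        raise ValueError("both modalities need the same number of channels")
    out_a: List[Channel] = []
    out_b: List[Channel] = []
    for ch_a, ch_b, g_a, g_b in zip(feats_a, feats_b, gamma_a, gamma_b):
        # Both decisions read the original features, so the result does not
        # depend on which modality is processed first.
        out_a.append(list(ch_b) if abs(g_a) < threshold else list(ch_a))
        out_b.append(list(ch_a) if abs(g_b) < threshold else list(ch_b))
    return out_a, out_b


# Toy data: 3 channels per modality, 2 spatial positions per channel.
rgb = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
depth = [[10.0, 20.0], [30.0, 40.0], [50.0, 60.0]]
gamma_rgb = [0.9, 0.001, 0.5]    # RGB channel 1 is below the threshold
gamma_depth = [0.005, 0.7, 0.3]  # depth channel 0 is below the threshold

new_rgb, new_depth = exchange_channels(rgb, depth, gamma_rgb, gamma_depth)
```

In this toy run, RGB channel 1 takes the depth values and depth channel 0 takes the RGB values, while every other channel is left unchanged. The step adds no learnable weights, which is consistent with the parameter-free property claimed above.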

Published in: IEEE Transactions on Pattern Analysis and Machine Intelligence ( Volume: 45, Issue: 5, 01 May 2023)

Page(s): 5481 - 5496

Date of Publication: 30 September 2022

PubMed ID: 36178992

DOI: 10.1109/TPAMI.2022.3211086

1 Introduction

Encouraged by the growing availability of low-cost sensors, multimodal fusion, which takes advantage of multiple data sources for classification or regression, has become one of the central problems in machine learning [1]. Building on the success of deep learning, multimodal fusion has recently been specialized as deep multimodal fusion by introducing end-to-end neural integration of multiple modalities [2], and it has exhibited remarkable benefits over the unimodal paradigm in semantic segmentation [3], [4], action recognition [5], [6], [7], visual question answering [8], [9], and many others [10], [11], [12]. Multitask learning [13] is another crucial topic in machine learning. It seeks models that solve multiple tasks simultaneously, and it enjoys the benefits of model generalization and data efficiency over methods that learn each task independently. Similar to multimodal fusion, multitask learning has also developed from earlier shallow methods [14] to deep variants [15], [16], [17], [18], [19] by taking advantage of deep learning. Successful applications of multitask learning include navigation [20], robot manipulation [21], etc.
