I. Introduction
Image-to-Image (I2I) translation aims to learn the mapping between a source and a target domain, and began to emerge with the proposal of Generative Adversarial Networks (GANs) [2]. Since then, increasing attention has been paid to this task because several visual tasks can be formulated as I2I translation, such as style transfer [3], [4], super-resolution [5], portrait synthesis [6], [7], [8], label-to-image generation [9], [10] and image inpainting [11]. Moreover, great progress has been made in recent years. For example, CycleGAN [12] enforces cycle consistency on the generators during training. Furthermore, UNIT [3] extends the Coupled GAN [13] based on the assumption of a shared latent space. To meet the demand for diverse and multi-modal outputs, methods such as MUNIT [14] and DRIT [15] recombine disentangled image representations. It is noteworthy that the methods above only transfer styles over the whole image without considering the characteristics of individual instances.
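For concreteness, the cycle-consistency constraint of CycleGAN [12] can be sketched as follows, where $G: X \rightarrow Y$ and $F: Y \rightarrow X$ denote the two generators (notation here is illustrative and follows the original paper rather than this work):
\begin{equation}
\mathcal{L}_{cyc}(G, F) = \mathbb{E}_{x \sim p_{data}(x)}\left[ \left\lVert F(G(x)) - x \right\rVert_1 \right] + \mathbb{E}_{y \sim p_{data}(y)}\left[ \left\lVert G(F(y)) - y \right\rVert_1 \right].
\end{equation}
Intuitively, an image translated to the other domain and back should reconstruct the original, which regularizes the mapping in the absence of paired supervision.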