Abstract:
Unsupervised domain adaptation (UDA) aims to adapt models learned from a well-annotated source domain to a target domain where only unlabeled samples are available. To this end, adversarial training is widely used in conventional UDA methods to reduce the discrepancy between the source and target domains. Recently, prompt tuning has emerged as an efficient way to adapt large pre-trained vision-language models like CLIP to a variety of downstream tasks. In this paper, we present a novel method named Adversarial DuAl Prompt Tuning (ADAPT) for UDA, which employs text prompts and visual prompts to guide CLIP simultaneously. Rather than simply optimizing text prompts and visual prompts jointly, we integrate text prompt tuning and visual prompt tuning into a collaborative framework where they engage in an adversarial game: text prompt tuning focuses on distinguishing between source and target images, whereas visual prompt tuning seeks to align the source and target domains. Unlike most existing adversarial training-based UDA approaches, ADAPT does not require explicit domain discriminators for domain alignment. Instead, the objective is effectively achieved at both the global and category levels by modeling the joint probability distribution of images over domains and categories. Extensive experiments on four benchmark datasets demonstrate the effectiveness of our ADAPT method for UDA. We have released our code at https://github.com/Liuziyi1999/ADAPT.
Published in: IEEE Transactions on Image Processing (Volume: 34)
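To make the abstract's adversarial game concrete, below is a minimal, self-contained sketch of how such a dual-prompt objective could be wired up. Everything here is an illustrative assumption rather than the paper's actual implementation: the shapes, the use of one text prompt per (domain, category) pair to form the joint distribution over domains and categories, and the gradient-reversal trick that lets a single loss update the text prompts to distinguish domains while the visual prompt learns to erase that distinction. Real CLIP features and target pseudo-labels are replaced by synthetic tensors so the snippet runs on its own.

```python
# Hedged sketch of an adversarial dual-prompt objective (NOT the ADAPT code;
# see https://github.com/Liuziyi1999/ADAPT for the authors' implementation).
import torch
import torch.nn.functional as F

class GradReverse(torch.autograd.Function):
    """Identity on the forward pass; negates gradients on the backward pass.
    This makes the visual prompt move to *confuse* the very domain
    prediction that the text prompts are trained to sharpen."""
    @staticmethod
    def forward(ctx, x):
        return x
    @staticmethod
    def backward(ctx, grad_output):
        return -grad_output

K, D, B = 10, 512, 8  # classes, embedding dim, batch size (assumed values)

# Learnable prompts: one text embedding per (domain, class) pair -> 2K rows,
# plus a single visual prompt vector added to every image feature.
text_prompts = torch.randn(2 * K, D, requires_grad=True)  # [source; target] x K
visual_prompt = torch.zeros(D, requires_grad=True)

def joint_logits(img_feat):
    """Scores over (domain, category) pairs: a softmax over these 2K logits
    models the joint distribution of images over domains and categories."""
    z = F.normalize(img_feat + GradReverse.apply(visual_prompt), dim=-1)
    t = F.normalize(text_prompts, dim=-1)
    return z @ t.t()  # shape (B, 2K)

# Stand-ins for frozen CLIP image features and labels. In real UDA the
# target samples are unlabeled, so pseudo-labels would be needed; synthetic
# labels are used here only to keep the sketch runnable.
img_feat = torch.randn(B, D)
labels = torch.randint(0, K, (B,))
is_target = torch.randint(0, 2, (B,))   # 0 = source, 1 = target
joint_label = is_target * K + labels    # index into the 2K joint prompts

opt = torch.optim.SGD([text_prompts, visual_prompt], lr=1e-2)
loss = F.cross_entropy(joint_logits(img_feat), joint_label)
loss.backward()  # text prompts learn to separate domains; the visual
opt.step()       # prompt, through GradReverse, learns to align them
```

One design note on the sketch: because the domain label is folded into the joint (domain, category) index, a single cross-entropy term simultaneously enforces category discrimination and, via the reversed gradient on the visual prompt, domain alignment at the category level, without any standalone domain discriminator, consistent with the abstract's claim.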