Abstract:
Visual Emotion Recognition (VER) aims to identify emotions from visual content and has garnered significant attention in recent years due to its wide-ranging applications. Although deep learning-based methods have shown success in VER, they require extensive labeled data, which is costly. Unsupervised Domain Adaptation (UDA) methods can reduce reliance on annotated data by transferring models trained on labeled datasets to unlabeled data. However, these methods assume that the source and target domains share the same label space. In practice, this assumption is often violated due to the inherent ambiguity and subjectivity in emotion labeling. To address this limitation, we propose a novel prompt learning paradigm for open-vocabulary visual emotion UDA, termed Domain-specific Ensemble Prompting (DSEP). DSEP leverages psychological emotion models to unify emotion labels into a common space in an ensemble manner, enhancing the open-vocabulary capabilities of UDA. It then combines ensemble label prompts with domain-specific content prompts to achieve open-vocabulary UDA. To our knowledge, we are the first to explore open-vocabulary adaptation for VER. Extensive experiments demonstrate that DSEP consistently outperforms state-of-the-art methods across four public benchmarks.
Published in: ICASSP 2025 - 2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Date of Conference: 06-11 April 2025
Date Added to IEEE Xplore: 07 March 2025