Conferences >ICASSP 2025 - 2025 IEEE Inter...

Test-time Alignment-Enhanced Adapter for Vision-Language Models

Download PDF
Download References
Request Permissions
Save to
Alerts

Abstract:

Test-time adaptation with pre-trained vision-language models (VLMs) has attracted increasing attention for tackling the issue of distribution shift during the test phase....Show More

Metadata

Abstract:

Test-time adaptation with pre-trained vision-language models (VLMs) has attracted increasing attention for tackling the issue of distribution shift during the test phase. While prior methods have shown effectiveness in addressing distribution shift by adjusting classification logits, they are not optimal due to keeping text features unchanged. To address this issue, we introduce a new approach called Test-time Alignment-Enhanced Adapter (TAEA), which trains an adapter with test samples to adjust text features during the test phase. We can enhance the text-to-image alignment prediction by utilizing an adapter to adapt text features. Furthermore, we also propose to adopt the negative cache from TDA as enhancement module, which further improves the performance of TAEA. Our approach outperforms the state-of-the-art TTA method of pre-trained VLMs by an average of 0.75% on the out-of-distribution benchmark and 2.5% on the cross-domain benchmark, with an acceptable training time. Code will be available at https://github.com/BaoshunWq/clip-TAEA.

Published in: ICASSP 2025 - 2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Date of Conference: 06-11 April 2025

Date Added to IEEE Xplore: 07 March 2025

ISBN Information:

ISSN Information:

DOI: 10.1109/ICASSP49660.2025.10888937

Conference Location: Hyderabad, India

Contents

References is not available for this document.

Test-time Alignment-Enhanced Adapter for Vision-Language Models

Abstract:

Metadata

Abstract:

ISSN Information:

References

IEEE Account

Purchase Details

Profile Information

Need Help?

Test-time Alignment-Enhanced Adapter for Vision-Language Models

Alerts

Abstract:

Metadata

Abstract:

ISSN Information:

Authors

Figures

References

Keywords

Metrics

References

IEEE Account

Purchase Details

Profile Information

Need Help?