
Frequency-Domain Refinement of Vision Transformers for Robust Medical Image Segmentation Under Degradation


Abstract:

Medical image segmentation is crucial for precise diagnosis, treatment planning, and disease monitoring in clinical settings. While convolutional neural networks (CNNs) have achieved remarkable success, they struggle to model long-range dependencies. Vision Transformers (ViTs) address this limitation by leveraging self-attention mechanisms to capture global contextual information. However, ViTs often fall short in describing local features, which is crucial for precise segmentation. To address this issue, we reformulate self-attention in the frequency domain to enhance both local and global feature representation. Our approach, the Enhanced Wave Vision Transformer (EW-ViT), incorporates wavelet decomposition within the self-attention block to adaptively refine feature representation in the low- and high-frequency components. We also introduce the Prompt-Guided High-Frequency Refiner (PGHFR) module to handle image degradation, which mainly affects high-frequency components. This module uses implicit prompts to encode degradation-specific information and adjust high-frequency representations accordingly. Additionally, we apply a contrastive learning strategy to maintain feature consistency and ensure robustness against noise, leading to state-of-the-art (SOTA) performance in medical image segmentation, especially under various degradation conditions. Source code is available on GitHub.
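To make the frequency split concrete: the abstract describes decomposing features into low- and high-frequency wavelet components that can be refined separately. The sketch below is not the paper's implementation; it is a minimal single-level 2D Haar transform (a common choice for such decompositions), showing how a feature map splits into one low-frequency band (LL) and three high-frequency detail bands (LH, HL, HH) and reconstructs exactly.

```python
import numpy as np

def haar_dwt2(x):
    """Single-level 2D Haar decomposition of a feature map x of even shape (H, W).

    Returns a low-frequency approximation (ll) and three high-frequency
    detail bands (lh, hl, hh), each of shape (H/2, W/2).
    """
    a = x[0::2, 0::2]  # top-left pixel of each 2x2 block
    b = x[0::2, 1::2]  # top-right
    c = x[1::2, 0::2]  # bottom-left
    d = x[1::2, 1::2]  # bottom-right
    ll = (a + b + c + d) / 2.0  # low-frequency approximation
    lh = (a - b + c - d) / 2.0  # horizontal detail
    hl = (a + b - c - d) / 2.0  # vertical detail
    hh = (a - b - c + d) / 2.0  # diagonal detail
    return ll, lh, hl, hh

def haar_idwt2(ll, lh, hl, hh):
    """Inverse transform: perfectly reconstructs the original feature map."""
    h, w = ll.shape
    x = np.empty((2 * h, 2 * w))
    x[0::2, 0::2] = (ll + lh + hl + hh) / 2.0
    x[0::2, 1::2] = (ll - lh + hl - hh) / 2.0
    x[1::2, 0::2] = (ll + lh - hl - hh) / 2.0
    x[1::2, 1::2] = (ll - lh - hl + hh) / 2.0
    return x
```

Because the transform is invertible, a module can modify the high-frequency bands (e.g., under degradation) and reassemble the feature map without losing the low-frequency content.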
Date of Conference: 26 February 2025 - 06 March 2025
Date Added to IEEE Xplore: 08 April 2025
Conference Location: Tucson, AZ, USA


1. Introduction

Medical image segmentation plays a pivotal role in disease diagnosis and quantitative assessment throughout the clinical workflow. Convolutional neural networks (CNNs) have emerged as pioneering approaches in this domain. The seminal U-Net [23], with its series of convolutional and down-sampling layers designed to gather contextual information through a symmetrical hierarchical architecture, has demonstrated remarkable segmentation capabilities. Despite their widespread use, CNNs struggle to effectively model long-range dependencies due to their reliance on stacks of convolutional blocks to increase receptive fields.
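The receptive-field limitation noted above can be quantified with the standard recurrence r_l = r_{l-1} + (k_l - 1) * j_{l-1}, where j is the cumulative stride. The helper below (an illustrative utility, not from the paper) shows that stacking stride-1 3x3 convolutions grows the receptive field only linearly, which is why deep stacks are needed for global context.

```python
def receptive_field(kernel_sizes, strides=None):
    """Effective receptive field (in input pixels) of stacked conv layers.

    Uses the recurrence r_l = r_{l-1} + (k_l - 1) * j_{l-1},
    where j_{l-1} is the product of strides of all earlier layers.
    """
    if strides is None:
        strides = [1] * len(kernel_sizes)
    r, jump = 1, 1
    for k, s in zip(kernel_sizes, strides):
        r += (k - 1) * jump
        jump *= s
    return r

# Ten stacked 3x3 convolutions at stride 1 cover only a 21x21 window,
# far short of global context on a typical medical image.
print(receptive_field([3] * 10))  # -> 21
```

Down-sampling layers (stride > 1), as in U-Net's encoder, multiply the per-layer growth and are the usual CNN workaround for this linear scaling.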
