Structural-Aware Disentangled Learning with CLIP for Hyperbolic Zero-Shot Sketch-Based Image Retrieval*


Abstract:

The zero-shot sketch-based image retrieval task faces two key challenges: domain gap and knowledge transfer. Our key observation is that directly aligning cross-domain features weakens the discriminative ability of the model, because it overlooks the asymmetry between sketches and images. Additionally, Euclidean space is inadequate for capturing hierarchical structure, which limits the performance of the model on complex data. To address these issues, we propose a Structural-Aware Disentangled Learning network (termed SADLnet) that incorporates CLIP and hyperbolic geometry. Specifically, we use CLIP to extract visual features from each domain to enhance the domain generalization of the model. Furthermore, we design a structure-guided disentanglement strategy that decomposes image representations into sketch-related and sketch-unrelated features, addressing the domain gap. Moreover, we project the retrieval features into hyperbolic space to capture hierarchical information, improving feature discrimination in retrieval tasks. Extensive experiments demonstrate that SADLnet establishes new state-of-the-art performance on three datasets.
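
The abstract does not say which model of hyperbolic space SADLnet uses or how the projection is parameterized. As a rough illustration of what "projecting retrieval features into hyperbolic space" can look like, the sketch below assumes the Poincaré ball model with curvature -c, maps Euclidean feature vectors onto the ball with the exponential map at the origin, and ranks a gallery by geodesic distance. The function names (`expmap0`, `poincare_distance`, `rank_gallery`), the curvature value, and the clipping constant are all hypothetical choices for this example, not details taken from the paper.

```python
import math

def expmap0(v, c=1.0, eps=1e-5):
    """Exponential map at the origin: Euclidean vector -> point in the
    Poincare ball of curvature -c (assumed model, not from the paper)."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        return [0.0 for _ in v]
    sqrt_c = math.sqrt(c)
    # Radius of the mapped point; clipped so it stays strictly inside the ball.
    radius = min(math.tanh(sqrt_c * norm) / sqrt_c, (1.0 - eps) / sqrt_c)
    return [radius * x / norm for x in v]

def poincare_distance(x, y, c=1.0):
    """Geodesic distance between two points inside the Poincare ball."""
    sq_diff = sum((a - b) ** 2 for a, b in zip(x, y))
    sq_x = sum(a * a for a in x)
    sq_y = sum(b * b for b in y)
    arg = 1.0 + 2.0 * c * sq_diff / ((1.0 - c * sq_x) * (1.0 - c * sq_y))
    return math.acosh(arg) / math.sqrt(c)

def rank_gallery(query_feat, gallery_feats, c=1.0):
    """Return gallery indices sorted from nearest to farthest in hyperbolic distance."""
    q = expmap0(query_feat, c)
    dists = [poincare_distance(q, expmap0(g, c), c) for g in gallery_feats]
    return sorted(range(len(gallery_feats)), key=lambda i: dists[i])

# Toy 2-D "features" standing in for a sketch query and three gallery images.
sketch = [0.3, 0.4]
images = [[-1.0, 0.0], [0.3, 0.5], [2.0, 2.0]]
ranking = rank_gallery(sketch, images)
```

Under this assumed model, distances grow rapidly near the boundary of the ball, which is the property usually cited for representing hierarchical structure; whether SADLnet relies on this exact formulation is not stated in the abstract.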
Date of Conference: 06-11 April 2025
Date Added to IEEE Xplore: 07 March 2025
Conference Location: Hyderabad, India
