Simple Primitives With Feasibility- and Contextuality-Dependence for Open-World Compositional Zero-Shot Learning

Abstract:

The task of Open-World Compositional Zero-Shot Learning (OW-CZSL) is to recognize novel state-object compositions in images from all possible compositions, where the novel compositions are absent during the training stage. The performance of conventional methods degrades significantly due to the large cardinality of possible compositions. Some recent works treat simple primitives (i.e., states and objects) as independent and predict them separately to reduce cardinality. However, this ignores the heavy dependence between states, objects, and compositions. In this paper, we model the dependence via feasibility and contextuality. Feasibility-dependence refers to the unequal feasibility of compositions, e.g., hairy is more feasible with cat than with building in the real world. Contextuality-dependence represents the contextual variance in images, e.g., cat shows diverse appearances when it is dry or wet. We design Semantic Attention (SA), driven by the visual similarity between simple primitives, to capture the feasibility semantics and alleviate impossible predictions. We also propose a generative Knowledge Disentanglement (KD) to disentangle images into unbiased representations, easing the contextual bias. Moreover, we complement the independent compositional probability model with the learned feasibility and contextuality in a compatible way. In the experiments, we demonstrate that our method, SA-and-kD-guided Simple Primitives (SAD-SP), achieves superior or competitive performance on three benchmark datasets.

Published in: IEEE Transactions on Pattern Analysis and Machine Intelligence ( Volume: 46, Issue: 1, January 2024)

Page(s): 543 - 560

Date of Publication: 09 October 2023

PubMed ID: 37812558

DOI: 10.1109/TPAMI.2023.3323012

I. Introduction

Many datasets exhibit a long-tailed distribution, i.e., a large number of classes have few or even no prior instances [1], [2], [3], [4], [5], [6], [7], [8]. Insufficient data becomes a bottleneck limiting the universality of deep learning [9], [10], [11], [12], [13], [14], [15]. In contrast, humans can intuitively identify non-existent concepts (e.g., canvas tree) once they understand the underlying primitives (e.g., canvas and tree). Inspired by this, recent works [16], [17], [18], [19], [20] propose a new learning paradigm named Compositional Zero-Shot Learning (CZSL). CZSL models images as compositions of primitive state and object concepts [21], [22], [23], [24]. It aims to extract states and objects from seen images and transfer that knowledge from seen to unseen compositions, thereby recognizing unseen state-object compositions without training on them. For example, given images of canvas shoe and brown tree, machines can learn the simple primitives shoe and brown, and thus directly recognize the unseen composition brown shoe in images.
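The idea of scoring compositions from separately predicted primitives can be illustrated with a minimal sketch. This is not the SAD-SP method: it assumes that state and object probabilities come from two independent classifiers and that a composition score is simply their product, optionally scaled by a feasibility weight. The function names, the probability values, and the feasibility table are all hypothetical and chosen only for illustration.

```python
from itertools import product


def compose_scores(state_probs, object_probs, feasibility=None):
    """Score every state-object pair as p(state) * p(object).

    `feasibility` is an optional, hypothetical table mapping
    (state, object) to a weight in [0, 1]; missing pairs default to 1.0.
    """
    scores = {}
    for (state, p_s), (obj, p_o) in product(state_probs.items(), object_probs.items()):
        weight = 1.0 if feasibility is None else feasibility.get((state, obj), 1.0)
        scores[(state, obj)] = p_s * p_o * weight
    return scores


def predict(state_probs, object_probs, feasibility=None):
    """Return the highest-scoring (state, object) composition."""
    scores = compose_scores(state_probs, object_probs, feasibility)
    return max(scores, key=scores.get)


# Made-up classifier outputs for an image of a brown shoe.
# Only "canvas shoe" and "brown tree" need to appear in training:
# the pair ("brown", "shoe") is scored without ever being seen as a whole.
state_probs = {"brown": 0.6, "canvas": 0.4}
object_probs = {"shoe": 0.7, "tree": 0.3}
brown_shoe = predict(state_probs, object_probs)

# Made-up outputs where independence alone prefers an implausible pair,
# and a hypothetical feasibility weight down-ranks it.
hairy_state_probs = {"hairy": 0.9, "dry": 0.1}
hairy_object_probs = {"building": 0.55, "cat": 0.45}
hairy_feasibility = {("hairy", "building"): 0.1}
without_feasibility = predict(hairy_state_probs, hairy_object_probs)
with_feasibility = predict(hairy_state_probs, hairy_object_probs, hairy_feasibility)
```

Two classifiers over |S| states and |O| objects need only |S| + |O| outputs, yet they score all |S| x |O| compositions. The second example shows why independence alone can be insufficient: the feasibility weight is what moves the prediction away from an implausible pair.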
