Abstract:
Contrastive Learning (CL) has emerged as a popular self-supervised representation learning paradigm that has been shown in many applications to perform comparably to traditional supervised learning methods. A key component of CL is mining the latent discriminative relationships between positive and negative samples and using them as self-supervised labels. We argue that this discriminative contrastive task is, in essence, similar to a classification task, and that the "either positive or negative" hard label sampling strategies are arbitrary. To address this problem, we explore ideas from data distillation, which treats probabilistic logit vectors as soft labels to transfer model knowledge. We abandon the classical hard sampling labels in CL and instead explore self-supervised soft labels. We adopt soft sampling labels that are extracted, without supervision, from the inherent relationships in data pairs to retain more information. We propose a new self-supervised graph learning method, Distill and Contrast (D&C), for learning representations that closely approximate natural data relationships. D&C extracts node similarities from node features and graph structure to derive soft sampling labels, which also suppress noise in the data and increase robustness. Extensive experimental results on real-world datasets demonstrate the effectiveness of the proposed method.
Published in: IEEE Transactions on Knowledge and Data Engineering (Volume: 37, Issue: 6, June 2025)
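To make the core idea concrete, the following is a minimal sketch, not the paper's actual implementation, of how soft sampling labels could be derived by blending feature-based and structure-based node similarities and then used in a soft-target contrastive loss. All names and parameters here (feature_similarity, structural_similarity, the mixing weight alpha, temperature tau) are illustrative assumptions.

```python
# Hypothetical sketch of soft-label contrastive learning on a graph;
# not D&C's official implementation.
import torch
import torch.nn.functional as F

def feature_similarity(x: torch.Tensor) -> torch.Tensor:
    """Pairwise cosine similarity of node features, rescaled to [0, 1]."""
    z = F.normalize(x, dim=1)
    return (z @ z.t() + 1.0) / 2.0

def structural_similarity(adj: torch.Tensor) -> torch.Tensor:
    """Jaccard similarity of node neighborhoods as a simple structural proxy."""
    adj = (adj > 0).float()
    inter = adj @ adj.t()                  # |N(i) ∩ N(j)| for binary adjacency
    deg = adj.sum(dim=1, keepdim=True)
    union = deg + deg.t() - inter          # |N(i) ∪ N(j)|
    return inter / union.clamp(min=1.0)

def soft_labels(x, adj, alpha=0.5):
    """Blend feature- and structure-based similarities into soft sampling labels."""
    s = alpha * feature_similarity(x) + (1 - alpha) * structural_similarity(adj)
    return s / s.sum(dim=1, keepdim=True)  # row-normalize into a distribution

def soft_contrastive_loss(z1, z2, target, tau=0.5):
    """Cross-entropy between the cross-view similarity distribution and soft labels."""
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    logits = z1 @ z2.t() / tau
    log_p = F.log_softmax(logits, dim=1)
    return -(target * log_p).sum(dim=1).mean()

# Toy usage: 8 nodes with 16-dim features and two lightly perturbed "views".
x = torch.randn(8, 16)
adj = (torch.rand(8, 8) > 0.7).float()
adj = ((adj + adj.t()) > 0).float()
target = soft_labels(x, adj)
z1 = x + 0.1 * torch.randn_like(x)
z2 = x + 0.1 * torch.randn_like(x)
print(soft_contrastive_loss(z1, z2, target).item())
```

The key contrast with standard CL is the target: instead of a one-hot "positive vs. negative" label per anchor, each row of the target is a distribution over all candidate nodes, so pairs that are similar in features or structure contribute gradient signal in proportion to that similarity.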