Abstract:
Existing self-supervised learning methods learn representations by means of pretext tasks that are either (1) discriminative, explicitly specifying which features should be separated, or (2) aligning, precisely indicating which features should be pulled close together, but they ignore how to jointly and in a principled way define which features should be repelled and which attracted. In this work, we combine the strengths of discriminative and aligning methods and design a hybrid method that addresses this issue. Our method explicitly specifies the repulsion and attraction mechanisms, respectively, through a discriminative predictive task and by concurrently maximizing mutual information between paired views that share redundant information. We show qualitatively and quantitatively that the proposed model learns better features, which are more effective for diverse downstream tasks ranging from classification to semantic segmentation. Our experiments on nine established benchmarks show that the proposed model consistently outperforms existing state-of-the-art results under self-supervised and transfer learning protocols. Code can be found at https://github.com/AnjanDutta/codial.
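The abstract describes a two-part objective: a discriminative predictive task supplies the repulsion signal, while mutual-information maximization between paired views supplies the attraction signal. Below is a minimal sketch of such a combined loss, assuming a rotation-prediction pretext task for the discriminative part and InfoNCE as the mutual-information lower bound; the names encoder, rot_head, proj_head, and augment are hypothetical placeholders, and this is not the implementation from the linked repository.

```python
import torch
import torch.nn.functional as F


def augment(x):
    # Placeholder stochastic augmentation (random horizontal flip + noise);
    # the paper's actual augmentation pipeline will differ.
    if torch.rand(1).item() < 0.5:
        x = torch.flip(x, dims=(3,))
    return x + 0.01 * torch.randn_like(x)


def hybrid_ssl_loss(encoder, rot_head, proj_head, x, temperature=0.1, lam=1.0):
    """Hypothetical combined objective: a discriminative rotation-prediction
    loss (explicit repulsion) plus an InfoNCE lower bound on the mutual
    information between two augmented views (explicit attraction).
    Assumes x is a batch of square images with shape (N, C, H, W)."""
    # Discriminative pretext task: predict which of 4 rotations was applied.
    rots = torch.cat([torch.rot90(x, k, dims=(2, 3)) for k in range(4)], dim=0)
    rot_labels = torch.arange(4, device=x.device).repeat_interleave(x.size(0))
    disc_loss = F.cross_entropy(rot_head(encoder(rots)), rot_labels)

    # Alignment: InfoNCE between two stochastic views of each image, so that
    # matched pairs are attracted and all other pairs in the batch repelled.
    z1 = F.normalize(proj_head(encoder(augment(x))), dim=1)
    z2 = F.normalize(proj_head(encoder(augment(x))), dim=1)
    logits = z1 @ z2.t() / temperature          # pairwise cosine similarities
    targets = torch.arange(x.size(0), device=x.device)  # diagonal = positives
    align_loss = F.cross_entropy(logits, targets)

    return disc_loss + lam * align_loss
```

In this reading, the weight lam balances how strongly the alignment term contributes relative to the discriminative term; the paper's actual formulation and weighting should be taken from the repository above.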
Date of Conference: 11-17 October 2021
Date Added to IEEE Xplore: 24 November 2021