Abstract:
Deep semi-supervised learning (SSL) methods aim to use abundant unlabeled data to improve seen-class classification. However, in the open-world scenario, collected unlabeled data tend to contain unseen-class data, which degrades generalization on seen-class classification. Formally, we define the problem as safe deep semi-supervised learning with unseen-class unlabeled data. An intuitive solution is to detect these unseen-class instances during the SSL process and remove them. Nevertheless, the performance of unseen-class identification is limited by the lack of a suitable score function, an uncalibrated model, and the small number of labeled data. To this end, we propose a safe SSL method called SAFER-STUDENT from a teacher-student perspective. First, to enhance the ability of the teacher model to distinguish seen from unseen classes, we propose a general scoring framework called Discrepancy with Raw (DR). Second, based on the unseen-class data mined by the teacher model from the unlabeled data, we calibrate the student model with a newly proposed Unseen-class Energy-bounded Calibration (UEC) loss. Third, based on the seen-class data mined by the teacher model from the unlabeled data, we propose a Weighted Confirmation Bias Elimination (WCBE) loss to boost the seen-class classification of the student model. Extensive studies show that SAFER-STUDENT remarkably outperforms state-of-the-art methods, verifying the effectiveness of our method on this under-explored problem.
Published in: IEEE Transactions on Knowledge and Data Engineering (Volume: 36, Issue: 1, January 2024)