I. Introduction
Many tasks in machine learning require robustness, i.e., that the learning process of a model is only mildly affected by noise [1]. Unlike noise in regression, where an attribute deviates from its expected distribution, noise in classification is more intricate and can be systematically divided into two categories: attribute noise and label noise [2], [3]. Attribute (or feature) noise denotes measurement errors arising from noisy sensors, recordings, communications, and data storage, while label noise denotes incorrect labeling. Label noise can stem from the same sources as attribute noise [4], such as communication errors, but it mainly arises from human factors [5]: 1) unreliable labeling due to insufficient information; 2) unreliable nonexpert annotators employed to reduce cost; and 3) subjective labeling. Moreover, classes are not always as clearly distinct as, say, survived versus deceased [6]. Outliers, a more adverse case of noise [7], usually cause serious performance degradation. We define attribute outliers as samples with large attribute values that lie on the wrong side of the expected decision boundary, and label outliers as otherwise recognizable samples that are nevertheless assigned wrong labels.
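The distinction between the two outlier types can be made concrete with a small synthetic sketch. The data, class means, and contamination rates below are illustrative assumptions, not taken from any referenced work: label outliers flip the labels of otherwise typical samples, while attribute outliers place large feature values deep inside the opposite class's region.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy two-class data: class 0 near the origin, class 1 shifted right.
n = 100
X = np.vstack([rng.normal(0.0, 1.0, (n, 2)),
               rng.normal(4.0, 1.0, (n, 2))])
y = np.array([0] * n + [1] * n)

# Label outliers: recognizable samples assigned the wrong label.
flip = rng.choice(2 * n, size=10, replace=False)
y_noisy = y.copy()
y_noisy[flip] = 1 - y_noisy[flip]

# Attribute outliers: large attribute values located on the wrong side
# of the expected decision boundary (here, roughly the line x = 2).
X_noisy = X.copy()
out = rng.choice(np.flatnonzero(y == 0), size=5, replace=False)
X_noisy[out] = rng.normal(10.0, 1.0, (len(out), 2))
```

A classifier trained on `(X_noisy, y_noisy)` faces both corruptions at once, which is the adverse setting motivating the robustness discussed above.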