In this paper, we propose a novel method that assists the ground-truth expert by automatically detecting potentially mislabeled learning samples. The method treats mislabeled-sample detection as an optimization problem: finding the subset of learning samples that is best in terms of statistical separability between classes. The problem is formulated within a genetic optimization framework, where each chromosome represents a candidate solution for validating or invalidating the learning samples collected by the ground-truth expert. The genetic optimization process is guided by the joint optimization of two criteria: the maximization of a between-class statistical distance and the minimization of the number of invalidated samples. Experiments conducted on both simulated and real data sets show that the proposed ground-truth validation method 1) detects the mislabeled samples with high accuracy, even when up to 30% of the learning samples are mislabeled, and 2) strongly limits the negative impact of mislabeling on the accuracy of the classification process.
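The chromosome encoding and the two criteria described above can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not specify the distance measure, the genetic operators, or how the two criteria are combined, so everything here is an assumption made for illustration. In particular, the sketch assumes one-dimensional features and two classes, uses a Fisher-like ratio as the between-class distance, folds both criteria into a single weighted score (the hypothetical `penalty` parameter) instead of treating them as separate objectives, and uses simple truncation selection with one-point crossover. All function and parameter names are hypothetical.

```python
import random
from statistics import mean, pvariance


def separability(xs, ys, mask):
    """Fisher-like distance between classes 0 and 1 over validated samples.

    Assumption: the paper's actual between-class distance is not given in
    the abstract; this ratio is a stand-in for illustration.
    """
    a = [x for x, y, m in zip(xs, ys, mask) if m and y == 0]
    b = [x for x, y, m in zip(xs, ys, mask) if m and y == 1]
    if len(a) < 2 or len(b) < 2:
        return 0.0  # too few validated samples to estimate a class
    return (mean(a) - mean(b)) ** 2 / (pvariance(a) + pvariance(b) + 1e-9)


def fitness(xs, ys, mask, penalty):
    """Reward class separability, penalize the fraction of invalidated samples.

    Assumption: a weighted sum replaces the paper's joint optimization.
    """
    invalid_fraction = 1 - sum(mask) / len(mask)
    return separability(xs, ys, mask) - penalty * invalid_fraction


def evolve(xs, ys, pop_size=40, generations=60, penalty=5.0, p_mut=0.02, seed=0):
    """Search for a validation mask: gene 1 keeps a sample, gene 0 invalidates it."""
    rng = random.Random(seed)
    n = len(xs)

    def score(mask):
        return fitness(xs, ys, mask, penalty)

    # Start from "validate everything" plus random masks that keep most samples.
    pop = [[1] * n]
    pop += [[int(rng.random() < 0.9) for _ in range(n)] for _ in range(pop_size - 1)]

    for _ in range(generations):
        pop.sort(key=score, reverse=True)
        parents = pop[: pop_size // 2]  # truncation selection; the best always survives
        children = []
        while len(parents) + len(children) < pop_size:
            p, q = rng.sample(parents, 2)
            cut = rng.randrange(1, n)  # one-point crossover
            child = p[:cut] + q[cut:]
            child = [1 - g if rng.random() < p_mut else g for g in child]  # bit-flip mutation
            children.append(child)
        pop = parents + children

    return max(pop, key=score)
```

Given feature values `xs` and labels `ys` (0 or 1), `evolve(xs, ys)` returns a list of 0/1 genes of the same length; samples with gene 0 are the ones flagged as potentially mislabeled. Because the best chromosome is carried over unchanged at every generation, the returned mask never scores worse than validating all samples. The `penalty` weight controls how reluctant the search is to invalidate samples and would need tuning for any real data set.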
IEEE Transactions on Geoscience and Remote Sensing (Volume: 47, Issue: 7)
Date of Publication: July 2009