Remote sensing image classification is a challenging problem because very few labeled pixels are typically available from the analyzed scene. In such situations, labeled data extracted from other images modeling similar problems might be used to improve the classification accuracy. However, when training and test samples follow even slightly different distributions, classification becomes very difficult. This problem is known as sample selection bias. In this paper, we propose a new method that combines labeled and unlabeled pixels to increase classification reliability and accuracy. The method is a semisupervised support vector machine classifier based on the combination of clustering and the mean map kernel. It reinforces the assumption that samples in the same cluster belong to the same class by combining sample and cluster similarities implicitly in the kernel space. A soft version of the method is also proposed, in which only the most reliable training samples, in terms of likelihood of the image data distribution, are used. The capabilities of the proposed method are illustrated in a cloud screening application using data from the MEdium Resolution Imaging Spectrometer (MERIS) instrument onboard the European Space Agency ENVISAT satellite. Cloud screening is a clear example of sample selection bias, since cloud features change to a great extent depending on the cloud type, thickness, transparency, height, and background. Good results are obtained, showing that the method is particularly well suited for situations where the available labeled information does not adequately describe the classes in the test data.
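As an illustration of how sample and cluster similarities might be combined in the kernel space, the following is a minimal sketch and not the formulation used in the paper, which the abstract does not spell out. It assumes a Gaussian RBF base kernel, a plain k-means clustering of all (labeled and unlabeled) pixels, and a convex combination with a hypothetical weight `mu`. The function names `rbf_kernel`, `kmeans`, and `composite_kernel` are likewise illustrative. The cluster similarity is taken to be the mean map kernel between two clusters, i.e., the average base-kernel value over all pairs of their members.

```python
import numpy as np


def rbf_kernel(A, B, gamma=1.0):
    """Gaussian RBF kernel matrix between the rows of A and the rows of B."""
    sq = (A ** 2).sum(1)[:, None] + (B ** 2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(sq, 0.0))


def kmeans(X, k, n_iter=20, seed=0):
    """Plain Lloyd's k-means; a stand-in for any clustering of the image."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        labels = dist.argmin(1)
        for c in range(k):
            if np.any(labels == c):
                centers[c] = X[labels == c].mean(0)
    return labels


def composite_kernel(X_all, cluster_ids, idx, mu=0.5, gamma=1.0):
    """Kernel between the samples X_all[idx] mixing two similarities.

    mu * k(x_i, x_j) + (1 - mu) * k_mm(cluster(x_i), cluster(x_j)),
    where k_mm is the mean map kernel between clusters (assumed form).
    """
    K = rbf_kernel(X_all, X_all, gamma)
    # Re-index cluster ids to 0..C-1 so that empty clusters are dropped.
    _, inv = np.unique(cluster_ids, return_inverse=True)
    # Z[i, c] = 1 / n_c if sample i belongs to cluster c, else 0.
    Z = np.zeros((len(X_all), inv.max() + 1))
    Z[np.arange(len(X_all)), inv] = 1.0
    Z /= Z.sum(0, keepdims=True)
    # M[c, d] = average of k(x, x') over x in cluster c and x' in cluster d.
    M = Z.T @ K @ Z
    K_sample = K[np.ix_(idx, idx)]
    K_cluster = M[np.ix_(inv[idx], inv[idx])]
    return mu * K_sample + (1.0 - mu) * K_cluster
```

Because both terms are positive semidefinite, the resulting matrix is a valid kernel and could be passed to any SVM solver that accepts a precomputed kernel. Setting `mu = 1` recovers the purely supervised sample kernel, while smaller values give more weight to the cluster structure estimated from the unlabeled pixels. Classifying test pixels would additionally require the analogous kernel between test and training samples, which is omitted here.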