I. Introduction
Over the past decades, the temporal, spatial, and spectral resolution of remote-sensing (RS) images has improved greatly, driven by advances in aerospace sensor technologies [1]. As RS data grow richer in information, especially data collected from multiple sensors, scene classification techniques are playing an increasingly vital role in earth observation (EO) missions [2]. Many high-level applications also rely heavily on RS scene classification products, such as urban development and planning [3], disaster response and management [4], and land management [5].