I. Introduction
Remote-sensing object detection (RSOD) plays an important role in many fields, such as national defense and security, resource management, and emergency rescue. With the development of deep learning, many deep neural network (DNN)-based detection methods [1], [2], [3], [4], [5], [6], [7] have been proposed and have achieved promising performance. In addition, a number of remote-sensing (RS) datasets (e.g., HRSC2016 [8], NWPU VHR-10 [9], and the DOTA series [10]) with accurate and rich annotations have been released to develop and benchmark RSOD methods. These datasets provide accurate location, scale, category, and quantity information of objects, which has greatly facilitated the development of RSOD. However, such rich annotation formats incur expensive labor costs when RSOD methods are transferred to new RS data (e.g., images captured by new satellites).