I. Introduction
As a fundamental task in computer vision and image analysis, object detection (OD) is widely used in many fields. Classic OD pipelines are mainly based on the Deformable Part Model (DPM) [1], which Cheng et al. [2] and Chen et al. [3] have also applied in the remote sensing (RS) field to detect vehicles in aerial images. The DPM detects objects using handcrafted features and sliding windows, but it is limited by the complexity of its feature design and the inefficiency of its object search.