Abstract:
Exploiting feature pyramids has become a crucial way to boost object detection performance. While various pyramid representations have been developed, previous works still integrate semantic information across different scales inefficiently. Moreover, recent object detectors struggle in applications that require accurate object localization, mainly due to the coarse definition of positive examples in the training and prediction phases. In this paper, we begin by analyzing current pyramid solutions and then propose a novel architecture that reconfigures the feature hierarchy in a flexible yet effective way. In particular, our architecture consists of two lightweight, trainable processes: global attention and local reconfiguration. Global attention emphasizes the global information of each feature scale, while local reconfiguration captures the local correlations across different scales. Both global attention and local reconfiguration are non-linear and thus offer greater expressive power. We then find that the loss function used to train object detectors is the central cause of the inaccurate localization problem, and we propose to address this issue by reshaping the standard cross-entropy loss so that it focuses more on accurate predictions. Both the feature reconfiguration and the consistent loss can be used in popular one-stage (SSD, RetinaNet) and two-stage (Faster R-CNN) detection frameworks. Extensive experimental evaluations on the PASCAL VOC 2007, PASCAL VOC 2012, and MS COCO datasets demonstrate that our models achieve consistent and significant boosts compared with other state-of-the-art methods.
Published in: IEEE Transactions on Image Processing (Volume: 28, Issue: 10, October 2019)