Abstract:
Over the past ten years, deep learning methods have made great progress in computer vision (CV), especially in object detection. Compared with traditional detection methods, deep learning methods significantly outperform them in both real-time performance and accuracy, without requiring any complicated hand-crafted feature extraction. The feature-extraction power of convolutional neural networks (CNNs) became evident when they enabled breakthroughs in image classification. Many researchers have therefore applied CNNs, which can learn high-level semantic features, to object detection, producing representative models such as the classical one-stage and two-stage detectors. In recent years, the Transformer, which has shown remarkable capability in natural language processing (NLP), has also been applied to object detection. A new class of detectors built on the transformer encoder-decoder architecture dispenses with anchor generation and non-maximum suppression (NMS) postprocessing; these detectors introduce a new detection mode, set prediction, and achieve strong performance. In this paper, we survey detectors across these detection modes. We first present the architectures of classical two-stage and one-stage detectors, and then introduce transformer-based detectors in detail. An experimental evaluation compares the performance of the various detectors. Finally, we draw conclusions and outline future prospects to serve as a guideline for future work.
Published in: 2022 5th International Conference on Pattern Recognition and Artificial Intelligence (PRAI)
Date of Conference: 19-21 August 2022
Date Added to IEEE Xplore: 04 October 2022