PACR-DETR: A Real-Time End-to-End Object Detector for Behavior Recognition in Various Classroom Scenarios | IEEE Journals & Magazine | IEEE Xplore


Abstract:

Recognizing behaviors in classroom settings is vital for gauging educational progress and optimizing teaching methodologies. The complexity of classroom environments often degrades detection performance. Although the detection transformer (DETR) offers a simplified structure for analyzing actions in classroom environments, it still encounters certain challenges. To overcome these obstacles, we propose a novel real-time end-to-end object detection model called position-aware channel residual (PACR)-DETR. Specifically, we introduce an efficient PACR network (PACRNet), which excels at extracting information from deep layers along the spatial dimension within complex backgrounds. Notably, it employs the transformer encoder exclusively to process the deepest layer of the feature map. Next, we introduce the efficient residual mixing (ERM) block, which applies convolution operations selectively to a subset of channels. This block underpins the multibranch perception cross-level fusion (MPCF) module, which facilitates hierarchical and cross-layer information fusion. Through a bidirectional chain, the MPCF network (MPCFNet) is designed to comprehensively integrate multiscale gradient information. Finally, we integrate the normalized Wasserstein distance (NWD) with the Inner-MPDIoU loss function to enhance the recognition performance for small-scale objects. The results show that PACR-DETR outperforms other methods in detection performance and real-time capability. On the SCB03-S dataset for classroom behavior detection against complex backgrounds, the proposed method attains a precision of 72.1%, outperforming YOLOv6 by 14.1% and YOLOv8m by 4.5%. Furthermore, it reaches 41.67 frames/s, meeting the requirements for real-time application.
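To illustrate why a normalized Wasserstein distance can help with small-scale objects, the following minimal sketch computes the commonly used NWD between two boxes modeled as 2-D Gaussians. It is not taken from the paper: the function name `nwd`, the `(cx, cy, w, h)` box layout, and the constant `c` (a dataset-dependent scale; 12.8 here is only a placeholder) are assumptions, and the abstract does not say how the authors combine NWD with the Inner-MPDIoU loss.

```python
import math


def nwd(box_a, box_b, c=12.8):
    """Normalized Wasserstein distance between two (cx, cy, w, h) boxes.

    Each box is treated as a 2-D Gaussian with mean (cx, cy) and
    covariance diag(w^2 / 4, h^2 / 4). The constant c is a hypothetical
    placeholder for a dataset-dependent scale.
    """
    cxa, cya, wa, ha = box_a
    cxb, cyb, wb, hb = box_b
    # Squared 2-Wasserstein distance between the two Gaussians.
    w2_squared = (
        (cxa - cxb) ** 2
        + (cya - cyb) ** 2
        + ((wa - wb) / 2) ** 2
        + ((ha - hb) / 2) ** 2
    )
    # Map the distance to a similarity in (0, 1]; 1 means identical boxes.
    return math.exp(-math.sqrt(w2_squared) / c)


# Two 4x4 boxes that do not overlap (their IoU is 0) still get a
# smoothly varying similarity that shrinks as they move apart.
reference = (10.0, 10.0, 4.0, 4.0)
near = nwd(reference, (15.0, 10.0, 4.0, 4.0))
far = nwd(reference, (30.0, 10.0, 4.0, 4.0))
```

Unlike IoU, which drops to zero as soon as two small boxes stop overlapping, this similarity still distinguishes `near` from `far`, which is the usual motivation for pairing NWD with an IoU-based loss.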
Article Sequence Number: 5016720
Date of Publication: 11 March 2025

