Abstract:
Temporal action detection aims to locate and classify action segments in untrimmed videos. Most existing methods consist of two components: snippet-level boundary segmentation and anchor-level action evaluation. These two components, however, are typically designed independently of each other, so detection accuracy is undermined by vague boundaries and complex video content. To tackle this problem, we design two complementary modules. One module, termed the Anchor Aware Module (AAM), uses temporally and semantically related anchors to enhance snippet features. The other module, named the Boundary Aware Module (BAM), endows anchor features with a structured representation using intermediate supervision. Moreover, a ConvLSTM is applied in BAM to establish temporal relations over the structured representation. These two modules are integrated as the Boundary-Anchor Complementary Network (BACNet), which achieves state-of-the-art performance on both the THUMOS-14 and ActivityNet-1.3 datasets.
Date of Conference: 18-22 July 2022
Date Added to IEEE Xplore: 26 August 2022