Abstract:
Temporal localization of actions in videos has been of increasing interest in recent years. However, most existing approaches rely on complex architectures that are either expensive to train, inefficient at inference time, or require thorough and careful architecture engineering. Classical action recognition on pre-segmented clips, on the other hand, benefits from sophisticated deep architectures that paved the way for highly reliable video clip classifiers. In this paper, we propose to use transfer learning to leverage the good results from action recognition for temporal localization. We apply a network that is inspired by the classical bag-of-words model for transfer learning and show that the resulting framewise class posteriors already provide good results without explicit temporal modeling. Further, we show that combining these features with a deep but simple convolutional network achieves state-of-the-art results on two challenging action localization datasets.
Date of Conference: 27-28 October 2019
Date Added to IEEE Xplore: 05 March 2020
- IEEE Keywords
- Index Terms
- Transfer Learning
- Action Recognition
- Temporal Localization
- Action Localization
- Temporal Action Localization
- Convolutional Network
- State Of The Art
- Video Clips
- Temporal Model
- Challenging Dataset
- Transfer Learning Model
- Posterior Probability
- Output Layer
- Object Detection
- Class Labels
- Intersection Over Union
- Speech Recognition
- Vanilla
- Pre-trained Network
- Action Classes
- Temporal Convolutional Network
- Action Instances
- Transfer Learning Strategy
- Proposal Network
- Proposal Generation
- Action Recognition Datasets
- Temporal Context
- Temporal Convolution
- Penultimate Layer
- Action Recognition Task