Fast Collective Activity Recognition Under Weak Supervision

Abstract:

Collective activity recognition, which identifies the activity a group of people is performing, is a cutting-edge research topic in computer vision. Unlike actions performed by individuals, collective activities require modeling the complex interactions among different people. However, most previous works require exhaustive annotations, such as accurate labels for individual actions, pairwise interactions, and poses, which are not easily available in practice. Moreover, most of them treat human detection as a decoupled task that precedes collective activity recognition, and they use all detected persons. This not only ignores the mutual relation between the two tasks, which makes it hard to filter out irrelevant people, but also likely increases the computational burden of reasoning about collective activities. In this paper, we propose a fast weakly supervised deep learning architecture for collective activity recognition. For fast inference, we make actor detection and weakly supervised collective activity reasoning collaborate in an end-to-end framework by sharing convolutional layers between them. The joint learning unites the two tasks so that they reinforce each other, which makes it more effective to filter out outliers who are not involved in the activity. For weakly supervised learning, we propose a latent embedding scheme that mines person-group interactive relationships, removing the need for pairwise relations between people and for individual action labels. The experimental results show that the proposed framework achieves performance comparable to or better than the state of the art on three datasets. Our joint model reasons about collective activities at 22.65 fps, which is the fastest speed known to date and brings collective activity recognition substantially closer to real-time applications.
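
The idea of weighting each detected person by their relevance to the group, using only a group-level label, can be illustrated with a small sketch. This is not the authors' latent embedding scheme, whose details the abstract does not give. It is a minimal illustration under stated assumptions: the function name `pool_group_feature`, the use of a mean vector as group context, the dot-product relevance score, and the toy feature values are all hypothetical.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def pool_group_feature(person_feats):
    """Weight each person by agreement with a group context, then pool.

    Assumption: the group context is the plain mean of the person features,
    and relevance is a dot product. Both are illustrative choices only.
    """
    n = len(person_feats)
    dim = len(person_feats[0])
    context = [sum(f[d] for f in person_feats) / n for d in range(dim)]
    # Person-group relevance, normalised so the weights sum to 1.
    weights = softmax([dot(f, context) for f in person_feats])
    # Weighted pooling: people who disagree with the group contribute less.
    group = [sum(w * f[d] for w, f in zip(weights, person_feats))
             for d in range(dim)]
    return group, weights


# Hypothetical per-person features, imagined as crops from one shared
# convolutional feature map: three people acting alike and one bystander.
feats = [[1.0, 0.1], [0.9, 0.2], [1.1, 0.0], [-0.8, 1.0]]
group_feat, weights = pool_group_feature(feats)
```

In this toy input the bystander (the last row) receives the smallest weight, so the pooled group feature is dominated by the three people who act alike. No per-person action label or pairwise relation is used at any point; in a trained model, a classifier on `group_feat` would be supervised by the group activity label alone.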

Published in: IEEE Transactions on Image Processing (Volume: 29)

Page(s): 29 - 43

Date of Publication: 30 May 2019

PubMed ID: 31170069

DOI: 10.1109/TIP.2019.2918725

Peizhen Zhang

School of Data and Computer Science, Sun Yat-sen University, Guangzhou, China

Peizhen Zhang received the B.S. and M.S. degrees from the School of Data and Computer Science, Sun Yat-sen University, Guangzhou, China, in 2016 and 2019, respectively. His research interests include object detection, collective activity recognition, and autoML.

Yongyi Tang

Key Laboratory of Machine Intelligence and Advanced Computing (Sun Yat-sen University), Ministry of Education, China

Yongyi Tang received the B.S. degree from the South China University of Technology in 2015 and the M.S. degree from Sun Yat-sen University, Guangzhou, China, in 2018. His research interests include collective activity recognition and video analysis.

Jian-Fang Hu

School of Data and Computer Science, Sun Yat-sen University, Guangzhou, China

Jian-Fang Hu received the B.S. and Ph.D. degrees from the School of Mathematics, Sun Yat-sen University, Guangzhou, China, in 2010 and 2016, respectively. He has published several scientific papers in international conferences and journals, including ICCV, CVPR, ECCV, IEEE TPAMI, IEEE TCSVT, and PR. His research interests include human-object interaction modeling, 3D face modeling, and RGB-D action recognition.

Wei-Shi Zheng

School of Data and Computer Science, Sun Yat-sen University, Guangzhou, China

Wei-Shi Zheng received the Ph.D. degree in applied mathematics from Sun Yat-sen University in 2008. He is currently a Professor with Sun Yat-sen University. He has published more than 110 papers, including more than 90 publications in main journals (TPAMI, TNN/TNNLS, TIP, TSMC-B, and PR) and top conferences (ICCV, CVPR, IJCAI, and AAAI). His research interests include person/object association and activity understanding in visual surveillance, and related large-scale machine learning algorithms. He has joined the Microsoft Research Asia Young Faculty Visiting Programme. He has served as a Senior PC/Area Chair/Associate Editor for AVSS 2012, ICPR 2018, IJCAI 2019, AAAI 2020, and BMVC from 2018 to 2019. He is an Associate Editor of Pattern Recognition. He was a recipient of the Excellent Young Scientists Fund of the National Natural Science Foundation of China, and the Royal Society-Newton Advanced Fellowship, U.K.
