Adaptive Mutual Supervision for Weakly-Supervised Temporal Action Localization


Abstract:

Weakly-supervised temporal action localization aims to localize actions from untrimmed long videos with only video-level category labels. Most previous methods ignore the incompleteness issue of Class Activation Sequences (CAS), suffering from trivial detection results. To tackle this issue, we propose a novel Adaptive Mutual Supervision (AMS) framework with two branches, where the base branch detects the most discriminative action regions, while the supplementary branch localizes the less discriminative action regions through an adaptive sampler. The sampler dynamically updates the inputs for the supplementary branch using a sampling weight sequence negatively correlated with the CAS from the base branch, thus encouraging the supplementary branch to localize the action regions underestimated by the base branch. To promote mutual enhancement between two branches, we further construct mutual location supervision. Each branch adopts the location pseudo-labels generated from the other branch as the localization supervision. By alternately optimizing two branches for multiple iterations, we progressively complete action regions. Extensive experiments on THUMOS14 and ActivityNet1.2 demonstrate that the proposed AMS method significantly outperforms state-of-the-art methods.
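
The abstract does not give the exact form of the adaptive sampler or of the pseudo-label generation, so the following is only a minimal sketch under stated assumptions. It assumes the sampling weights are an inverted min-max normalization of the base-branch CAS, that the supplementary inputs are obtained by scaling per-snippet features with those weights, and that location pseudo-labels come from a fixed threshold. The function names, the threshold of 0.5, and the toy values are all hypothetical.

```python
from typing import List, Sequence


def sampling_weights(base_cas: Sequence[float], eps: float = 1e-8) -> List[float]:
    """Map a base-branch CAS to weights that fall as the CAS rises (assumed form)."""
    lo, hi = min(base_cas), max(base_cas)
    # Min-max normalize to [0, 1], then invert so weakly activated snippets get high weight.
    return [1.0 - (s - lo) / (hi - lo + eps) for s in base_cas]


def reweight_features(
    features: Sequence[Sequence[float]], weights: Sequence[float]
) -> List[List[float]]:
    """Scale each snippet's feature vector by its sampling weight (assumed sampler)."""
    return [[w * x for x in feat] for feat, w in zip(features, weights)]


def location_pseudo_labels(cas: Sequence[float], threshold: float = 0.5) -> List[int]:
    """Binarize one branch's CAS into per-snippet labels for the other branch (assumed rule)."""
    return [1 if s >= threshold else 0 for s in cas]


# Toy example: five snippets, the base branch fires strongly on snippets 1 and 2.
base_cas = [0.1, 0.9, 0.8, 0.3, 0.2]
weights = sampling_weights(base_cas)
features = [[1.0, 2.0] for _ in base_cas]
supp_inputs = reweight_features(features, weights)
labels_for_supp = location_pseudo_labels(base_cas)
```

In this sketch the snippets the base branch scores highest receive weights near zero, so the supplementary branch sees mostly the regions the base branch underestimated. Swapping the roles of the two branches in `location_pseudo_labels` would give the mutual supervision in the other direction.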
Published in: IEEE Transactions on Multimedia (Volume: 25)
Page(s): 6688 - 6701
Date of Publication: 17 October 2022


Chen Ju
Cooperative Medianet Innovation Center, Shanghai Jiao Tong University, Shanghai, China
Chen Ju received the B.E. degree from the University of Electronic Science and Technology of China, Chengdu, China, in 2018. He has been working toward the Ph.D. degree with the Cooperative Medianet Innovation Center, Shanghai Jiao Tong University, Shanghai, China, since 2018. His research interests include vision-language-audio pre-training and applications, multi-modal video analysis and understanding, and music composition. He is also a reviewer for some prestigious international conferences.
Peisen Zhao
Huawei Cloud and AI, Shenzhen, Guangdong, China
Peisen Zhao received the B.E. degree from the University of Electronic Science and Technology of China, Chengdu, China, in 2016, and the Ph.D. degree from the Cooperative Medianet Innovation Center, Shanghai Jiao Tong University, Shanghai, China, in 2022. He is currently a Researcher of artificial intelligence with Huawei Cloud and AI, Shenzhen, China. His research interests include video generation, action recognition, and understanding. He is also a reviewer for some prestigious international conferences and journals.
Siheng Chen
Cooperative Medianet Innovation Center, Shanghai Jiao Tong University, Shanghai, China
Siheng Chen received the B.E. degree from the Beijing Institute of Technology, China, and the M.S. and Ph.D. degrees from Carnegie Mellon University, Pittsburgh, PA, USA. He is currently an Associate Professor with Shanghai Jiao Tong University, Shanghai, China. He was an Autonomy Engineer with Uber Advanced Technologies Group, Pittsburgh, PA, USA, and a Research Scientist with Mitsubishi Electric Research Laboratories, Cambridge, MA, USA. His research interests include graph signal processing, graph neural networks, and autonomous driving. He was the recipient of the 2018 IEEE Signal Processing Society Young Author Best Paper Award, and his paper received the Best Student Paper Award at IEEE GlobalSIP 2018.
Ya Zhang
Cooperative Medianet Innovation Center, Shanghai Jiao Tong University, Shanghai, China
Ya Zhang is currently a Professor with the Cooperative Medianet Innovation Center, Shanghai Jiao Tong University, Shanghai, China, and Chief AI Scientist with the State Key Laboratory of UHD Video and Audio Production and Presentation. She has authored or coauthored more than 100 refereed papers in prestigious international conferences and journals. Her research interests mainly include machine learning with applications to multimedia and healthcare. Dr. Zhang was the recipient of several best paper awards from international journals and conferences, and directed one Outstanding Doctorate Dissertation awarded by the Chinese Association for Artificial Intelligence.
Xiaoyun Zhang
Cooperative Medianet Innovation Center, Shanghai Jiao Tong University, Shanghai, China
Xiaoyun Zhang (Member, IEEE) received the B.S. and M.S. degrees in applied mathematics from Xi'an Jiaotong University, Xi'an, China, and the Ph.D. degree in pattern recognition from Shanghai Jiao Tong University, Shanghai, China. She is currently an Associate Professor with the CMIC, Shanghai Jiao Tong University. Her Ph.D. thesis was nominated for the National 100 Best Ph.D. Theses of China. Her research interests include computer vision, pattern recognition, image processing, video compression, and digital TV systems.
Yanfeng Wang
Cooperative Medianet Innovation Center, Shanghai Jiao Tong University, Shanghai, China
Yanfeng Wang received the B.S. degree from PLA Information Engineering University, Beijing, China, and the M.S. and Ph.D. degrees in business management from Shanghai Jiao Tong University, Shanghai, China. He is currently the Vice Director of the Cooperative Medianet Innovation Center and also the Vice Dean of the School of Electrical and Information Engineering, Shanghai Jiao Tong University. His research interests mainly include media Big Data, the emerging commercial applications of information technology, and technology transfer.
Qi Tian
Huawei Cloud and AI, Shenzhen, Guangdong, China
Qi Tian (Fellow, IEEE) is currently the Chief Scientist of artificial intelligence with Huawei Cloud & AI, Shenzhen, China, and is also a Full Professor with the Department of Computer Science, University of Texas at San Antonio, San Antonio, TX, USA. He has authored or coauthored more than 700 refereed journal and conference papers, including many Best Papers and Best Paper Candidates. His research interests include multimedia information retrieval, computer vision, and pattern recognition. He was the recipient of the 2017 UTSA President's Distinguished Award for Research Achievement, the 2016 UTSA Innovation Award, the 2014 Research Achievement Awards from the College of Science, UTSA, the 2010 Google Faculty Award, and the 2010 ACM Service Award. He is an Associate Editor for many journals and is on the Editorial Board of the Journal of Multimedia and the Journal of Machine Vision and Applications.
