A Cuboid CNN Model With an Attention Mechanism for Skeleton-Based Action Recognition


Abstract:

The introduction of depth sensors such as Microsoft Kinect has driven research in human action recognition. Human skeletal data collected from depth sensors convey a significant amount of information for action recognition. While there has been considerable progress in action recognition, most existing skeleton-based approaches neglect the fact that not all human body parts move during many actions, and they fail to consider the ordinal positions of body joints. Motivated by the fact that an action's category is determined by local joint movements, we propose a cuboid model for skeleton-based action recognition. Specifically, a cuboid arranging strategy is developed to organize the pairwise displacements between all body joints into a cuboid action representation. Such a representation is well structured and allows deep CNN models to focus their analysis on actions. Moreover, an attention mechanism is exploited in the deep model so that the most relevant features are extracted. Extensive experiments on our new Yunnan University-Chinese Academy of Sciences-Multimodal Human Action Dataset (CAS-YNU MHAD), the NTU RGB+D dataset, the UTD-MHAD dataset, and the UTKinect-Action3D dataset demonstrate the effectiveness of our method compared to current state-of-the-art methods.
Published in: IEEE Transactions on Multimedia ( Volume: 22, Issue: 11, November 2020)
Page(s): 2977 - 2989
Date of Publication: 25 December 2019
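
The pairwise-displacement idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the joint ordering, the direction of each displacement, and the way the per-frame grids are stacked are assumptions made only for illustration, and the names `pairwise_displacements` and `build_cuboid` are hypothetical.

```python
from typing import List, Tuple

Joint = Tuple[float, float, float]   # (x, y, z) position of one joint
Frame = List[Joint]                  # all joints of one frame, in a fixed order


def pairwise_displacements(frame: Frame) -> List[List[Joint]]:
    """Return a J x J grid whose (i, j) entry is joint_j minus joint_i."""
    return [
        [(bx - ax, by - ay, bz - az) for (bx, by, bz) in frame]
        for (ax, ay, az) in frame
    ]


def build_cuboid(sequence: List[Frame]) -> List[List[List[Joint]]]:
    """Stack the per-frame grids along time, giving a T x J x J x 3 structure.

    Assumption: the stacking order here is illustrative; the paper's actual
    cuboid arranging strategy is not specified in this text.
    """
    return [pairwise_displacements(frame) for frame in sequence]


# Toy sequence: 2 frames, 3 joints (coordinates are made up for illustration).
sequence = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0)],
    [(0.0, 0.0, 0.0), (1.0, 1.0, 0.0), (0.0, 2.0, 1.0)],
]
cuboid = build_cuboid(sequence)
```

In this sketch each frame yields a square grid of 3D offsets, so a sequence becomes a regular block of numbers that a CNN could consume as a multi-channel input. Joints that do not move relative to each other produce entries that stay constant over time.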


I. Introduction

Human action recognition [1]–[5] is an active yet challenging research area that has been explored in many applications, including healthcare, smart surveillance, and security. RGB sensors and depth sensors (e.g., Microsoft Kinect sensors) have been used to improve human action recognition performance by exploiting the rich information they capture, such as depth and 3D location. Unlike RGB data, depth data are robust to changes in lighting conditions because depth sensors rely on infrared radiation. Xiao et al. [6] and Ji et al. [7] proposed effective methods to recognize human actions from depth map sequences. However, depth maps are highly redundant, and the resulting volume of data increases computational complexity, making such methods impractical for real-world use.

Kaijun Zhu
FIST LAB, School of Information Science and Engineering, Yunnan University, Kunming, Yunnan, P.R.China
Kaijun Zhu received the B.Eng. degree from the School of Physics and Electrical Engineering, Anqing Normal University, Anqing, China, in 2014, and the M.Sc. degree from the School of Information Science and Engineering, Yunnan University, Kunming, China, in 2019. His current research interests include computer vision and machine learning.

Ruxin Wang
Union Vision Innovations Technology, Shenzhen, P.R.China
Ruxin Wang received the B.Eng. degree from Xidian University, Xi’an, China, the M.Sc. degree from the Huazhong University of Science and Technology, Wuhan, China, and the Ph.D. degree from the University of Technology Sydney, Ultimo, NSW, Australia. He is currently a Research Scientist with the Union Visual Innovation Technology Company Ltd., Shenzhen, China. His research interests include deep learning, image restoration, and computer vision. He has authored and coauthored about ten research papers including the IEEE Transactions on Neural Networks and Learning Systems, IEEE Transactions on Image Processing, and IEEE Transactions on Cybernetics.

Qingsong Zhao
Shenzhen College of Advanced Technology, University of Chinese Academy of Sciences, Shenzhen, China
Qingsong Zhao received the B.Eng. degree from the School of Electrical Engineering and Automation, Henan Polytechnic University, Jiaozuo, China, in 2016. He is currently working toward the M.Eng. degree with the Shenzhen College of Advanced Technology, University of Chinese Academy of Sciences, Beijing, China. His research interests include human action recognition, transfer learning, and meta learning.

Jun Cheng
CAS Key Laboratory of Human-Machine Intelligence-Synergy Systems, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China
Chinese University of Hong Kong, Hong Kong, China
Jun Cheng received the B.E. and M.E. degrees from the University of Science and Technology of China, Hefei, China, in 1999 and 2002, respectively, and the Ph.D. degree from the Chinese University of Hong Kong, Hong Kong, in 2006. He is currently a Professor with the Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China, and also with the CAS Key Laboratory of Human-Machine Intelligence-Synergy Systems, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences. His current research interests include computer vision, robotics, machine intelligence, and control.

Dapeng Tao
FIST LAB, School of Information Science and Engineering, Yunnan University, Kunming, Yunnan, P.R.China
Dapeng Tao received the B.E. degree from the Northwestern Polytechnical University, Xi’an, China, and the Ph.D. degree from the South China University of Technology, Guangzhou, China. He is currently a Professor with the School of Information Science and Engineering, Yunnan University, Kunming, China. He has authored or coauthored more than 30 scientific articles. His research interests include machine learning, computer vision, and robotics. He has served more than ten international journals, including the IEEE Transactions on Neural Networks and Learning Systems, IEEE Transactions on Multimedia, IEEE Transactions on Circuits and Systems for Video Technology, IEEE Signal Processing Letters, and Information Sciences.
