Real-Time 3D Hand Pose Estimation with 3D Convolutional Neural Networks

Source: IEEE Journals & Magazine, IEEE Xplore


Abstract:

In this paper, we present a novel method for real-time 3D hand pose estimation from single depth images using 3D Convolutional Neural Networks (CNNs). Image-based features extracted by 2D CNNs are not directly suitable for 3D hand pose estimation due to the lack of 3D spatial information. Our proposed 3D CNN-based method, taking a 3D volumetric representation of the hand depth image as input and extracting 3D features from the volumetric input, can capture the 3D spatial structure of the hand and accurately regress full 3D hand pose in a single pass. In order to make the 3D CNN robust to variations in hand sizes and global orientations, we perform 3D data augmentation on the training data. To further improve the estimation accuracy, we propose applying the 3D deep network architectures and leveraging the complete hand surface as intermediate supervision for learning 3D hand pose from depth images. Extensive experiments on three challenging datasets demonstrate that our proposed approach outperforms baselines and state-of-the-art methods. A cross-dataset experiment also shows that our method has good generalization ability. Furthermore, our method is fast as our implementation runs at over 91 frames per second on a standard computer with a single GPU.
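The two ideas in the abstract that lend themselves to a small illustration are the volumetric input and the 3D data augmentation. The sketch below is not the authors' implementation: it assumes the depth image has already been back-projected to a list of 3D points, uses a plain binary occupancy grid as a stand-in for whatever volumetric encoding the paper uses, and picks the rotation axis, angle range, and scale range arbitrarily. The names `voxelize` and `augment` are hypothetical.

```python
import math
import random


def voxelize(points, resolution=32):
    """Return a resolution^3 nested-list occupancy grid for (x, y, z) points.

    Assumption: binary occupancy is used only for illustration.
    """
    mins = [min(p[a] for p in points) for a in range(3)]
    maxs = [max(p[a] for p in points) for a in range(3)]
    center = [(lo + hi) / 2.0 for lo, hi in zip(mins, maxs)]
    # Scale by the longest bounding-box side so the aspect ratio is preserved.
    extent = max(hi - lo for lo, hi in zip(mins, maxs)) or 1.0
    grid = [[[0] * resolution for _ in range(resolution)]
            for _ in range(resolution)]
    for p in points:
        idx = []
        for a in range(3):
            u = (p[a] - center[a]) / extent + 0.5  # normalized to [0, 1]
            idx.append(min(resolution - 1, max(0, int(u * resolution))))
        grid[idx[0]][idx[1]][idx[2]] = 1
    return grid


def augment(points, joints, rng, max_angle_deg=40.0, scale_range=(0.9, 1.1)):
    """Rotate about the z-axis and scale isotropically around the point centroid.

    The same transform is applied to the points and to the joint labels,
    so the labels stay consistent with the augmented input.
    Returns (new_points, new_joints, scale).
    """
    angle = math.radians(rng.uniform(-max_angle_deg, max_angle_deg))
    scale = rng.uniform(*scale_range)
    c, s = math.cos(angle), math.sin(angle)
    n = len(points)
    centroid = [sum(p[a] for p in points) / n for a in range(3)]

    def transform(p):
        x, y, z = (p[a] - centroid[a] for a in range(3))
        return (scale * (c * x - s * y) + centroid[0],
                scale * (s * x + c * y) + centroid[1],
                scale * z + centroid[2])

    return [transform(p) for p in points], [transform(j) for j in joints], scale
```

In a training loop under these assumptions, each sample would be augmented first and voxelized second, so that the grid fed to the network and the joint targets share the same random rotation and scale.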
Page(s): 956 - 970
Date of Publication: 16 April 2018

PubMed ID: 29993927

Liuhao Ge
Institute for Media Innovation, Nanyang Technological University, Singapore
Liuhao Ge received the BEng degree in detection guidance and control technology from the Nanjing University of Aeronautics and Astronautics, in 2011 and the MEng degree in control theory and engineering from Southeast University, in 2014. He is working toward the PhD degree with the Institute for Media Innovation, Interdisciplinary Graduate School, Nanyang Technological University, Singapore. His research interests mainly include computer vision, machine learning, and human-computer interaction.
Hui Liang
Institute for Media Innovation, Nanyang Technological University, Singapore
Hui Liang received the BEng degree in electronics and information engineering and the MEng degree in communication and information system from the Huazhong University of Science and Technology, in 2008 and 2011, respectively, and the PhD degree from Nanyang Technological University (NTU), Singapore, in 2016. He was a research scientist with the Institute of High Performance Computing, A*STAR, Singapore, a research associate with the Institute for Media Innovation, and a research fellow with the Rapid-Rich Object Search Lab, NTU, Singapore. He is currently a research scientist with Amazon, Seattle, Washington. His research interests mainly include computer vision, machine learning, and human-computer interaction. He is a member of the IEEE.
Junsong Yuan
Department of Computer Science and Engineering, State University of New York at Buffalo, Buffalo, NY
Junsong Yuan received the BEng degree from the Special Class for the Gifted Young of the Huazhong University of Science and Technology, P.R. China, in 2002, the MEng degree from the National University of Singapore, and the PhD degree from Northwestern University. He is currently an associate professor with the Computer Science and Engineering Department, University at Buffalo, the State University of New York. Before that, he was an associate professor with the School of Electrical and Electronics Engineering, Nanyang Technological University (NTU), Singapore. He has published more than 200 papers in computer vision, pattern recognition, and multimedia. He received the 2016 Best Paper Award from the IEEE Trans. on Multimedia, the Nanyang assistant professorship from NTU, and the Outstanding EECS PhD Thesis Award from Northwestern University. He is currently senior area editor of the Journal of Visual Communication and Image Representation and associate editor of the IEEE Trans. on Image Processing and the IEEE Trans. on Circuits and Systems for Video Technology, and has served as guest editor of the International Journal of Computer Vision. He is program co-chair of ICME'18 and VCIP'15, and area chair of ACM MM'18, ACCV'18 and '14, ICPR'18 and '16, CVPR'17, ICIP'18 and '17, etc. He is a fellow of the International Association of Pattern Recognition. He is a senior member of the IEEE.
Daniel Thalmann
École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
Daniel Thalmann received the PhD degree in computer science from the University of Geneva, in 1977 and the honorary doctorate degree from the University Paul-Sabatier in Toulouse, France, in 2003. He is an honorary professor at EPFL, Switzerland, and director of Research Development, MIRALab Sarl. He has been the founder of the Virtual Reality Lab, EPFL, and a visiting professor with the Institute for Media Innovation, Nanyang Technological University, Singapore. He is a pioneer in research on Virtual Humans. His current research interests include real-time virtual humans in virtual reality, crowd simulation, and 3D interaction. He is co-editor-in-chief of the Journal of Computer Animation and Virtual Worlds, and a member of the editorial boards of 12 other journals. He was program chair and co-chair of several conferences including IEEE-VR, ACM-VRST, and ACM-VRCAI. He has published more than 600 papers in Graphics, Animation, and Virtual Reality. He received the Eurographics Distinguished Career Award in 2010, the 2012 Canadian Human Computer Communications Society Achievement Award, and the CGI 2015 Career Achievement.
