Real-Time 3D Hand Pose Estimation with 3D Convolutional Neural Networks

Abstract:

In this paper, we present a novel method for real-time 3D hand pose estimation from single depth images using 3D Convolutional Neural Networks (CNNs). Image-based features extracted by 2D CNNs are not directly suitable for 3D hand pose estimation due to the lack of 3D spatial information. Our proposed 3D CNN-based method, taking a 3D volumetric representation of the hand depth image as input and extracting 3D features from the volumetric input, can capture the 3D spatial structure of the hand and accurately regress full 3D hand pose in a single pass. In order to make the 3D CNN robust to variations in hand sizes and global orientations, we perform 3D data augmentation on the training data. To further improve the estimation accuracy, we propose applying the 3D deep network architectures and leveraging the complete hand surface as intermediate supervision for learning 3D hand pose from depth images. Extensive experiments on three challenging datasets demonstrate that our proposed approach outperforms baselines and state-of-the-art methods. A cross-dataset experiment also shows that our method has good generalization ability. Furthermore, our method is fast as our implementation runs at over 91 frames per second on a standard computer with a single GPU.
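The volumetric input and the 3D data augmentation described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the binary occupancy grid stands in for whatever volumetric encoding the paper uses, and the function names `augment` and `voxelize`, the 32-voxel resolution, the 300 mm cube, the 40-degree rotation limit, and the 0.8 to 1.2 scale range are all assumptions chosen for illustration.

```python
import numpy as np


def augment(points, rng, max_angle_deg=40.0, scale_range=(0.8, 1.2)):
    """Randomly rotate and uniformly scale a point cloud about its centroid.

    All ranges are illustrative assumptions, not values from the paper.
    """
    center = points.mean(axis=0)
    ax, ay, az = np.deg2rad(rng.uniform(-max_angle_deg, max_angle_deg, size=3))
    rot_x = np.array([[1, 0, 0],
                      [0, np.cos(ax), -np.sin(ax)],
                      [0, np.sin(ax), np.cos(ax)]])
    rot_y = np.array([[np.cos(ay), 0, np.sin(ay)],
                      [0, 1, 0],
                      [-np.sin(ay), 0, np.cos(ay)]])
    rot_z = np.array([[np.cos(az), -np.sin(az), 0],
                      [np.sin(az), np.cos(az), 0],
                      [0, 0, 1]])
    rotation = rot_z @ rot_y @ rot_x
    scale = rng.uniform(*scale_range)
    # Rotate and scale around the centroid so the hand stays in place.
    return (points - center) @ rotation.T * scale + center


def voxelize(points, resolution=32, cube_size=300.0):
    """Build a binary occupancy grid over a cube centred on the centroid.

    `points` is an (N, 3) array in the same units as `cube_size`
    (assumed millimetres here).
    """
    center = points.mean(axis=0)
    # Map [-cube_size/2, cube_size/2) onto voxel indices [0, resolution).
    idx = np.floor((points - center) / cube_size * resolution
                   + resolution / 2).astype(int)
    inside = np.all((idx >= 0) & (idx < resolution), axis=1)
    grid = np.zeros((resolution, resolution, resolution), dtype=np.float32)
    kept = idx[inside]
    grid[kept[:, 0], kept[:, 1], kept[:, 2]] = 1.0
    return grid
```

In a training loop, one would typically call `augment` on each hand point cloud (and apply the same transform to the ground-truth joint locations) before calling `voxelize`, so the network sees a differently oriented and sized volume on every pass.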

Published in: IEEE Transactions on Pattern Analysis and Machine Intelligence (Volume: 41, Issue: 4, 01 April 2019)

Page(s): 956 - 970

Date of Publication: 16 April 2018

PubMed ID: 29993927

DOI: 10.1109/TPAMI.2018.2827052

Liuhao Ge

Institute for Media Innovation, Nanyang Technological University, Singapore

Liuhao Ge received the BEng degree in detection guidance and control technology from the Nanjing University of Aeronautics and Astronautics, in 2011 and the MEng degree in control theory and engineering from Southeast University, in 2014. He is working toward the PhD degree with the Institute for Media Innovation, Interdisciplinary Graduate School, Nanyang Technological University, Singapore. His research interests mainly...

Hui Liang

Institute for Media Innovation, Nanyang Technological University, Singapore

Hui Liang received the BEng degree in electronics and information engineering and the MEng degree in communication and information system from the Huazhong University of Science and Technology, in 2008 and 2011, respectively, and the PhD degree from Nanyang Technological University (NTU), Singapore, in 2016. He was a research scientist with the Institute of High Performance Computing, A*STAR, Singapore, a research associate with the Institute for Media Innovation, and a research fellow with the Rapid-Rich Object Search Lab, NTU, Singapore. He is currently a research scientist with Amazon, Seattle, Washington. His research interests mainly include computer vision, machine learning, and human-computer interaction. He is a member of the IEEE.

Junsong Yuan

Department of Computer Science and Engineering, State University of New York at Buffalo, Buffalo, NY

Junsong Yuan received the BEng degree from the Special Class for the Gifted Young of the Huazhong University of Science and Technology, P.R. China, in 2002, the MEng degree from the National University of Singapore, and the PhD degree from Northwestern University. He is currently an associate professor with the Computer Science and Engineering Department, University at Buffalo, the State University of New York. Before that, he was an associate professor with the School of Electrical and Electronics Engineering, Nanyang Technological University (NTU), Singapore. He has published more than 200 papers in computer vision, pattern recognition, and multimedia. He received the 2016 Best Paper Award from the IEEE Trans. on Multimedia, the Nanyang assistant professorship from NTU, and the Outstanding EECS PhD Thesis Award from Northwestern University. He is currently senior area editor of the Journal of Visual Communication and Image Representation and associate editor of the IEEE Trans. on Image Processing and the IEEE Trans. on Circuits and Systems for Video Technology, and has served as guest editor of the International Journal of Computer Vision. He is program co-chair of ICME'18 and VCIP'15, and area chair of ACM MM'18, ACCV'18 and '14, ICPR'18 and '16, CVPR'17, ICIP'18 and '17, etc. He is a fellow of the International Association of Pattern Recognition. He is a senior member of the IEEE.

Daniel Thalmann

École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland

Daniel Thalmann received the PhD degree in computer science from the University of Geneva, in 1977 and the honorary doctorate degree from the University Paul-Sabatier in Toulouse, France, in 2003. He is honorary professor at EPFL, Switzerland, and director of Research Development, MIRALab Sarl. He has been the founder of the Virtual Reality Lab, EPFL, and visiting professor with the Institute for Media Innovation, Nanyang Technological University, Singapore. He is a pioneer in research on Virtual Humans. His current research interests include real-time virtual humans in virtual reality, crowd simulation, and 3D interaction. He is co-editor-in-chief of the Journal of Computer Animation and Virtual Worlds, and member of the Editorial Board of 12 other journals. He was program chair and co-chair of several conferences including IEEE-VR, ACM-VRST, and ACM-VRCAI. He has published more than 600 papers in Graphics, Animation, and Virtual Reality. He received the Eurographics Distinguished Career Award in 2010, the 2012 Canadian Human Computer Communications Society Achievement Award, and the CGI 2015 Career Achievement.
