I. Introduction
Hands play a central role in human interaction with the environment, from physical contact and grasping to everyday communication via hand gestures. Learning 3D hand reconstruction is a prerequisite for many computer vision applications such as augmented reality [1], sign language translation [2], [3], and human-computer interaction [4], [5], [6]. However, due to the diversity of hand configurations and interactions with the environment, 3D hand reconstruction remains a challenging problem, especially when the task relies on monocular data as input.