1. Introduction
As we try to duplicate the successes of current deep convolutional architectures in the 3D domain, we face a fundamental representational issue. Extant deep net architectures for both discriminative and generative learning in the signal domain are well suited to data that is regularly sampled, such as images, audio, or video. However, most common 3D geometry representations, such as meshes or point clouds, are not regular structures and do not easily fit into architectures that exploit such regularity for weight sharing, etc. That is why the majority of extant works on using deep nets for 3D data resort to either volumetric grids or collections of images (2D views of the geometry). Such representations, however, lead to difficult trade-offs between sampling resolution and net efficiency. Furthermore, they enshrine quantization artifacts that obscure natural invariances of the data under rigid motions, etc.
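This trade-off can be made concrete with a back-of-the-envelope comparison: a dense voxel grid grows cubically with resolution, whereas a point set grows only linearly with the number of points. The minimal Python sketch below is illustrative only; the specific resolutions and the 1024-point count are assumptions chosen for this example, not values from the paper.

```python
# Illustrative comparison (assumed figures, not from the paper): memory cost of a
# dense voxel occupancy grid versus an unordered point set.

def voxel_cells(resolution):
    """Number of cells in a dense occupancy grid at a given per-axis resolution."""
    return resolution ** 3

def point_values(num_points):
    """Number of stored values for a point set (x, y, z per point)."""
    return num_points * 3

if __name__ == "__main__":
    for res in (32, 64, 128, 256):
        print(f"voxel grid {res}^3: {voxel_cells(res):>12,d} cells")
    # A point cloud on the order of 1K points is assumed here as a typical size.
    print(f"point set (1024 pts): {point_values(1024):>9,d} values")
```

At 256³ the grid already stores over 16 million cells, most of them empty for a typical object, while the point set stores a few thousand coordinates; this is the resolution-versus-efficiency tension referred to above.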
[Figure 1] A 3D point cloud of the complete object reconstructed from a single image. Each point is visualized as a small sphere; the reconstruction is shown from two viewpoints (0° and 90° in azimuth). A segmentation mask indicates the scope of the object in the image.