I. Introduction
With the rapid development of 3D sensing technology, 3D point cloud data now appear in many application areas, such as autonomous driving, virtual and augmented reality, and robotics. Driven by deep neural networks, recent 3D works [1], [2], [3], [4], [5], [6], [7], [8] have focused on processing point clouds with learning-based methods. However, unlike images, whose pixels are arranged on a regular grid, point clouds are sets of points embedded in three-dimensional space. This makes 3D point clouds structurally different from images and representationally different from more complex 3D data (e.g., grid and voxel data). Point clouds have the simplest format among these representations, yet this format cannot be used directly when designing deep networks for standard computer vision tasks.