
Reinforcement Learning Based Point-Cloud Acquisition and Recognition Using Exploration-Classification Reward Combination



Abstract:

3D point acquisition based on robust sensors, such as tactile or laser sensors, is a true alternative to computer vision for 3D object recognition. In real-life scenarios where robots are equipped with such sensors to acquire 3D data, only a few points can be iteratively collected in a reasonable amount of time. However, existing point-cloud classifiers are extremely sensitive to sparse points, missing parts, and noise. To compensate for the sparsity of the data, some Reinforcement Learning (RL) based approaches have been proposed to learn a sparse yet efficient exploration of the target object with respect to the 3D recognition objective. However, existing RL approaches focus only on classification performance to guide the training of the active acquisition-and-classification framework, and thus fail to dissociate a poor exploration strategy (missing parts, noisy points) from actual classifier mistakes on properly acquired data. In this study, we propose a new RL framework rewarded on both classification performance and exploration quality. Our trained framework outperforms existing state-of-the-art models on 3D geometric object classification. We further show that our framework learns to alternate between (1) a clean and broad exploration strategy, suitable for easily distinguishable categories, and (2) a specific local exploration strategy, facilitating the discrimination of similar categories.
Date of Conference: 18-22 July 2022
Date Added to IEEE Xplore: 26 August 2022
Conference Location: Taipei, Taiwan

1. Introduction

Most existing deep learning architectures capable of reasoning about 3D point-cloud data are trained on offline/frozen datasets of dense and clean point-clouds [1]–[10]. A recent study [11] showed that State-Of-The-Art (SOTA) point-cloud classifiers actually perform significantly worse on data including noise, missing parts, and sparse points. On the contrary, many industrial applications require online/active acquisition of point-clouds, which often results in sparse and noisy data. Typical applications include environments where cameras are inoperable (e.g., dusty environments, poor luminosity conditions), requiring more robust data-acquisition pipelines, such as iterative 3D point acquisition using a laser sensor or a tactile sensor mounted on a polyarticulated robot. To maintain a decent pace in an industrial context, only a limited number of points can be sampled for each object to process.

To allow accurate 3D object classification from a limited number of actively sampled points, some recent works [12], [13] proposed to simultaneously learn the active sampling strategy, namely the exploration strategy, and the classifier. The insight behind such an approach is that the training of the exploration strategy should be guided by the classification performance, so that each point acquisition provides the most information for the 3D recognition task. Both models leveraged an online RL algorithm rewarded by the classification performance to learn the exploration strategy, coupled with a classification loss to train the classifier. Such a strategy compensates for the sparsity of the data by maximizing the efficiency of the exploration. However, it does not explicitly penalize (1) missing parts on the explored object, e.g., due to an overly narrow exploration, nor (2) noisy points, due to exploration trials that missed the object. In other words, such an approach does not dissociate misclassifications caused by a poor exploration strategy from those caused by classifier mistakes. This leads to potentially unstable and sample-inefficient training.
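The paper's exact reward is not reproduced on this page; as an illustration of the exploration-classification reward combination idea, the following is a minimal Python sketch. It assumes a classification term (classifier confidence in the true class) combined with a hypothetical exploration-quality term that penalizes both missed probes (noisy points) and narrow coverage (missing parts). All names (`combined_reward`, `hit_mask`, the weights `w_cls` and `w_expl`) are illustrative, not the paper's notation.

```python
import numpy as np

def combined_reward(class_probs, true_label, points, hit_mask,
                    w_cls=1.0, w_expl=0.5):
    """Illustrative exploration-classification reward combination.

    class_probs : (C,) softmax output of the classifier after the
                  latest point acquisition
    true_label  : ground-truth class index (known at training time)
    points      : (N, 3) point-cloud acquired so far
    hit_mask    : (N,) boolean mask, True where a probe actually
                  touched the target object (hypothetical signal)
    """
    # Classification term: confidence assigned to the correct class.
    r_cls = class_probs[true_label]

    # Exploration term (a): fraction of probes that hit the object,
    # penalizing noisy points from trials that missed it.
    hit_ratio = hit_mask.mean() if len(hit_mask) > 0 else 0.0

    # Exploration term (b): spatial spread of the hit points,
    # penalizing an overly narrow exploration (missing parts).
    hits = points[hit_mask]
    spread = (np.linalg.norm(hits - hits.mean(axis=0), axis=1).mean()
              if len(hits) > 1 else 0.0)

    return w_cls * r_cls + w_expl * hit_ratio * spread
```

Weighting the two terms lets the agent be rewarded for good exploration even while the classifier is still wrong, and vice versa, which is precisely the dissociation a combined reward is meant to provide.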

Figure: Our RL-based approach.
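To make the pipeline depicted above concrete, here is a minimal sketch of the iterative acquisition-and-classification loop it implements. The `policy`, `classifier`, and `env.probe` interfaces are hypothetical stand-ins for the trained exploration policy, the point-cloud classifier, and the simulated sensor/object environment, not the paper's API.

```python
import numpy as np

def acquire_and_classify(policy, classifier, env, budget=32):
    """One episode: the exploration policy iteratively picks probe
    locations, the sensor returns at most one 3D point per probe,
    and the classifier predicts from the running point-cloud."""
    cloud = np.zeros((0, 3))              # empty point-cloud
    for _ in range(budget):               # limited point budget
        action = policy(cloud)            # where to probe next
        point, hit = env.probe(action)    # sensor measurement
        if hit:                           # keep only actual contacts
            cloud = np.vstack([cloud, point])
    return classifier(cloud)              # final class prediction
```

The fixed `budget` reflects the industrial constraint discussed above: only a limited number of points can be sampled per object while maintaining a decent processing pace.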
