Abstract:
Several unmanned retail stores have been introduced with the development of sensors, wireless communication, and computer vision technologies. A vision-based kiosk that is equipped only with a vision sensor has significant advantages such as compactness and low implementation cost. Using convolutional neural network (CNN)-based object detectors, the kiosk recognizes an object when a customer picks up a product. In retail object recognition, the key challenges are the limited number of detections and high interclass similarity. In this study, these challenges are addressed by utilizing the “view-specific” features of an object; specifically, an object class is divided into multiple “view-based” subclasses, and the object detectors are trained on these subclasses. Further, a “view-aware feature” is defined by aggregating subclass detection results from multiple cameras. A superclass classifier predicts the superclass by utilizing the informative subclass detection results that distinguish the target object from other similar-looking objects. To verify the effectiveness of the proposed approach, a prototype of the vision-based unmanned kiosk system is implemented. Experimental results indicate that the proposed method outperforms the conventional method, even on a state-of-the-art detection network. The dataset used in this study is available on IEEE DataPort for reproducibility.
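The aggregation idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the subclass names, the max-confidence aggregation rule, and the argmax stand-in for the superclass classifier are all assumptions made for illustration. The abstract does not specify how detections are aggregated or what form the classifier takes.

```python
# Hypothetical subclass labels in the form "<superclass>/<view>".
SUBCLASSES = ["cola/front", "cola/back", "cider/front", "cider/back"]


def view_aware_feature(detections_per_camera, subclasses=SUBCLASSES):
    """Aggregate subclass detections from several cameras into one vector.

    Assumption: each camera yields a list of (subclass_label, confidence)
    pairs, and the feature keeps the maximum confidence seen per subclass.
    """
    index = {name: i for i, name in enumerate(subclasses)}
    feature = [0.0] * len(subclasses)
    for detections in detections_per_camera:
        for label, score in detections:
            i = index[label]
            feature[i] = max(feature[i], score)
    return feature


def predict_superclass(feature, subclasses=SUBCLASSES):
    """Stand-in for a trained superclass classifier.

    Assumption: the superclass owning the strongest subclass response wins.
    """
    best = max(range(len(feature)), key=lambda i: feature[i])
    return subclasses[best].split("/")[0]


# Camera 1 sees an ambiguous back view; camera 2 sees an informative front view.
cameras = [
    [("cola/back", 0.55), ("cider/back", 0.52)],
    [("cider/front", 0.91)],
]
feature = view_aware_feature(cameras)
superclass = predict_superclass(feature)
```

In this toy case the back views of the two products are nearly indistinguishable, while the front view seen by the second camera carries the deciding evidence, which is the situation the view-based subclasses are meant to exploit.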
Published in: IEEE Sensors Journal (Volume: 22, Issue: 22, 15 November 2022)