This paper presents a 1920 × 1080, 160° object viewpoint recognition system-on-chip (SoC). The SoC is dedicated to wearable vision applications and addresses two crucial issues in existing computer vision object recognition systems: low recognition accuracy, caused by low-resolution images and dramatic changes in object viewpoint, and high power consumption, caused by complex computations. A human-centered design (HCD) mechanism is proposed to maintain a high recognition rate in difficult situations. To overcome the loss of accuracy when the object viewpoint changes dramatically, the object viewpoint prediction (OVP) engine in the HCD provides 160° object viewpoint invariance by synthesizing various object poses from predicted object viewpoints. To achieve low power consumption, the visual vocabulary processor (VVP), which is based on the bag-of-words (BoW) matching algorithm, advances the matching stage from the feature level to the object level, reducing the required memory bandwidth by 97% compared with previous recognition systems. Moreover, the matching efficiency of the VVP enables the system to support real-time full HD (1920 × 1080) processing, which raises the recognition rate for a traffic light at a distance of 50 m to 95%, compared with 29% for VGA (640 × 480) processing. The real-time 1920 × 1080 visual recognition chip is realized on a 6.38 mm² die in 65 nm CMOS technology. It achieves an average recognition rate of 94%, a power efficiency of 1.18 TOPS/W, and an area efficiency of 25.9 GOPS/mm² while dissipating only 52 mW at 1.0 V.
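To illustrate what moving the matching stage from the feature level to the object level can mean in a bag-of-words scheme, the following is a minimal software sketch and not the VVP hardware design. It assumes a generic BoW pipeline: each local descriptor is quantized to its nearest visual word, the image is summarized as one word histogram, and that single histogram is compared against one stored histogram per object. All names (`nearest_word`, `bow_histogram`, `recognize`), the toy two-dimensional descriptors, and the choice of cosine similarity are assumptions made for illustration.

```python
from collections import Counter
from math import sqrt


def nearest_word(descriptor, vocabulary):
    """Index of the visual word closest to the descriptor (squared Euclidean distance)."""
    return min(
        range(len(vocabulary)),
        key=lambda i: sum((d - v) ** 2 for d, v in zip(descriptor, vocabulary[i])),
    )


def bow_histogram(descriptors, vocabulary):
    """Quantize every descriptor and count how often each visual word occurs."""
    counts = Counter(nearest_word(d, vocabulary) for d in descriptors)
    return [counts.get(i, 0) for i in range(len(vocabulary))]


def cosine_similarity(a, b):
    """Cosine similarity of two histograms; 0.0 if either one is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def recognize(descriptors, vocabulary, object_histograms):
    """Object-level matching: one histogram comparison per stored object,
    instead of comparing every query feature with every stored feature."""
    query = bow_histogram(descriptors, vocabulary)
    return max(
        object_histograms,
        key=lambda name: cosine_similarity(query, object_histograms[name]),
    )


# Toy data (hypothetical): three visual words and two stored object models.
vocabulary = [(0, 0), (10, 10), (0, 10)]
object_histograms = {"traffic_light": [3, 1, 0], "sign": [0, 1, 3]}
query_descriptors = [(1, 0), (0, 1), (9, 9), (1, 1)]

best = recognize(query_descriptors, vocabulary, object_histograms)
```

In this sketch the per-object cost is a single histogram comparison whose size depends on the vocabulary rather than on the number of stored features. That is the general reason BoW matching needs less data traffic than feature-to-feature matching; the 97% bandwidth figure itself is a property of the chip and is not reproduced by this toy example.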