By Topic

A 52 mW Full HD 160-Degree Object Viewpoint Recognition SoC With Visual Vocabulary Processor for Wearable Vision Applications

Sign In

Cookies must be enabled to login.After enabling cookies , please use refresh or reload or ctrl+f5 on the browser for the login options.

Formats Non-Member Member
$33 $13
Learn how you can qualify for the best price for this item!
Become an IEEE Member or Subscribe to
IEEE Xplore for exclusive pricing!
close button

puzzle piece

IEEE membership options for an individual and IEEE Xplore subscriptions for an organization offer the most affordable access to essential journal articles, conference papers, standards, eBooks, and eLearning courses.

Learn more about:

IEEE membership

IEEE Xplore subscriptions

6 Author(s)

A 1920 × 1080 160° object viewpoint recognition system-on-chip (SoC) is presented in this paper. The SoC design is dedicated to wearable vision applications, and we address several crucial issues including the low recognition accuracy due to the use of low resolution images and dramatic changes in object viewpoints, and the high power consumption caused by the complex computations in existing computer vision object recognition systems. The human-centered design (HCD) mechanism is proposed in order to maintain a high recognition rate in difficult situations. To overcome the degradation of accuracy when dramatic changes to the object viewpoint occur, the object viewpoint prediction (OVP) engine in the HCD provides 160° object viewpoint in- variance by synthesizing various object poses from predicted object viewpoints. To achieve low power consumption, the visual vocabulary processor (VVP), which is based on bag-of-words (BoW) matching algorithm, is used to advance the matching stage from the feature-level to the object-level and results in a 97% reduction in the required memory bandwidth compared to previous recognition systems. Moreover, the matching efficiency of the VVP enables the system to support real-time full HD (1920 × 1080) processing, thereby improving the recognition rate for detecting a traffic light at a distance of 50 m to 95% compared to the 29% recognition rate for VGA (640 × 480) processing. The real-time 1920 × 1080 visual recognition chip is realized on a 6.38 mm2 die with 65 nm CMOS technology. It achieves an average recognition rate of 94%, a power efficiency of 1.18 TOPS/W, and an area efficiency of 25.9 GOPS/mm2 while only dissipating 52 mW at 1.0 V.

Published in:

IEEE Journal of Solid-State Circuits  (Volume:47 ,  Issue: 4 )