A low-power, high-performance 4-way 32-bit vector processor is developed for handheld 3D graphics systems. It contains a unified floating-point unit for matrix, vector, and elementary function operations. By using logarithmic arithmetic, the unit achieves single-cycle throughput for all of these operations except matrix-vector multiplication, which has 2-cycle throughput. The processor features this function unit, a cascaded integer-float datapath, datapath reconfiguration, operand forwarding in the logarithmic domain, and a vertex cache. It occupies 9.7 mm² in 0.18 µm CMOS technology and, at 200 MHz with 86.6 mW power consumption, achieves 141 Mvertices/s for geometry transformation and 12.1 Mvertices/s for OpenGL transformation and lighting.
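As a rough illustration of why logarithmic arithmetic suits a unified unit of this kind, the sketch below shows the general idea in software: once values are held as a sign and a base-2 logarithm, multiplication, division, square root, and powers reduce to additions, subtractions, and scalings of the exponent. This is a minimal sketch of generic logarithmic number system arithmetic, not the processor's actual datapath; the function names (`to_log`, `from_log`, `log_mul`, `log_div`, `log_sqrt`, `log_pow`, `dot4`) are hypothetical, and the use of exact `math.log2` conversions is an assumption made for clarity, since the abstract does not describe how the hardware converts between domains.

```python
import math

# Illustrative only: a value is held as (is_negative, log2(|x|)).
# Zero is not representable in this simple encoding (an assumption of the sketch).

def to_log(x):
    """Convert a nonzero real number to (sign, log2 magnitude)."""
    if x == 0:
        raise ValueError("zero needs a special encoding, omitted in this sketch")
    return (x < 0, math.log2(abs(x)))

def from_log(v):
    """Convert (sign, log2 magnitude) back to an ordinary float."""
    negative, exponent = v
    magnitude = 2.0 ** exponent
    return -magnitude if negative else magnitude

def log_mul(a, b):
    """Multiplication becomes addition of logarithms; signs combine by XOR."""
    return (a[0] != b[0], a[1] + b[1])

def log_div(a, b):
    """Division becomes subtraction of logarithms."""
    return (a[0] != b[0], a[1] - b[1])

def log_sqrt(a):
    """Square root halves the logarithm (non-negative inputs only)."""
    if a[0]:
        raise ValueError("square root of a negative value")
    return (False, a[1] / 2.0)

def log_pow(a, p):
    """Raising a positive base to power p scales the logarithm by p."""
    if a[0]:
        raise ValueError("this sketch handles positive bases only")
    return (False, a[1] * p)

def dot4(xs, ys):
    """4-element dot product: products in the log domain, sum in the linear domain."""
    return sum(from_log(log_mul(to_log(x), to_log(y))) for x, y in zip(xs, ys))
```

For example, `from_log(log_mul(to_log(3.0), to_log(-4.0)))` evaluates to approximately -12.0, and `dot4` can serve as one row of a 4x4 matrix-vector product. Addition has no equally simple form in the logarithmic domain, which is why this sketch converts back to linear values before summing.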
Published in:
Solid State Circuits Conference, 2007. ESSCIRC 2007. 33rd European
Date of Conference: 11-13 Sept. 2007