In this paper, a power efficient vertex processor for mobile graphics applications is presented. A four-threaded and four-issue expanded VLIW datapath with a quad-float vertex texture fetcher is proposed by exploiting graphics specific characteristics after evaluation of several candidate architectures. Instruction-level power control methods such as operand sharing and writeback re-allocation along with operand isolations and gated clocks result in 40.4% and 82% reduction in energy dissipation and energy delay product compared to the most widely used single threaded SIMD. The proposed processor with the optimized datapath and vertex caches implemented in a 0.18- mum 1P4M CMOS process achieves 186-Mvertices/s geometry performance which is the best result among the processors that are IEEE-754 compliant.
Published in:
Very Large Scale Integration (VLSI) Systems, IEEE Transactions on
(Volume:17
,
Issue:
10
)
Date of Publication: Oct. 2009