SIMD architectures are less efficient for applications with diverse control-flow behavior, mainly because SIMD execution requires all lanes to follow an identical control flow. In this paper, we propose a novel instruction shuffle scheme that features an efficient control-flow handling mechanism. Its cornerstones are a shuffle source instruction buffer array and an instruction shuffle unit. The shuffle unit can concurrently deliver instructions of multiple distinct control flows from the instruction buffer array to eligible SIMD lanes. Our instruction shuffle scheme thus combines the best attributes of both the SIMD and MIMD execution paradigms. Experimental results show that an average performance improvement of 86% can be achieved at a cost of only 5.8% area overhead.
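To make the contrast concrete, the following is a minimal toy model, not the hardware design described in the paper. It assumes that a conventional SIMD machine serializes divergent control-flow paths, while a shuffle-style dispatcher can issue from several instruction buffers at once. The function names, the `num_buffers` parameter, and the per-path instruction counts are all hypothetical and chosen only for illustration; the model does not attempt to reproduce the reported 86% figure.

```python
def lockstep_cycles(lane_paths, path_lengths):
    """Toy model of conventional SIMD: each distinct control-flow path
    is executed one after another, with non-participating lanes idle."""
    return sum(path_lengths[p] for p in set(lane_paths))


def shuffle_cycles(lane_paths, path_lengths, num_buffers):
    """Toy model of shuffle-style dispatch (assumption, not the paper's
    exact mechanism): up to `num_buffers` distinct paths issue
    concurrently, each to the lanes that are on that path."""
    # Longest paths first, so each concurrent group is bounded by its head.
    distinct = sorted(set(lane_paths), key=lambda p: path_lengths[p], reverse=True)
    cycles = 0
    for i in range(0, len(distinct), num_buffers):
        group = distinct[i:i + num_buffers]
        # A group finishes when its longest path finishes.
        cycles += max(path_lengths[p] for p in group)
    return cycles


# Four lanes, three distinct paths with hypothetical instruction counts.
lane_paths = ["A", "B", "A", "C"]
path_lengths = {"A": 4, "B": 3, "C": 2}

serial = lockstep_cycles(lane_paths, path_lengths)
two_buffers = shuffle_cycles(lane_paths, path_lengths, num_buffers=2)
three_buffers = shuffle_cycles(lane_paths, path_lengths, num_buffers=3)
```

In this toy setting the serialized execution costs the sum of all path lengths, whereas concurrent delivery is bounded by the longest path in each group of buffers, which is the intuition behind combining SIMD and MIMD behavior.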