The conflicting requirements of performance and flexibility in today's embedded systems market are forcing system designers to rely increasingly on so-called configurable or customizable processor cores. Such processors meet demanding performance constraints by accommodating application-specific instruction set extensions (ISEs), which have naturally become a vital component of current processor customization flows. One major bottleneck in maximizing ISE performance is the limited data bandwidth between the general-purpose register (GPR) file and the ISEs. For improved performance, a large data bandwidth from the GPRs to the ISEs is desirable. However, the tight area constraints of modern embedded processors often restrict the GPR I/O of ISEs in order to limit the area spent on register-file ports. This paper presents a novel approach that increases the GPR I/O of ISEs without significantly increasing the size of the GPR file. This is achieved by applying the concept of register clustering, common in many VLIW architectures, to single-issue processors with high-performance ISEs. Such clustering often introduces extra register moves in compiled code, so this work also presents an algorithm to minimize these moves. The benchmark results presented in this paper show that our solution can significantly reduce the area overhead of many-port GPR files without sacrificing the performance gains achieved through ISEs.