Reducing energy consumption is one of the most important design goals in embedded application domains such as wireless communication, multimedia, and biomedical applications. The instruction memory hierarchy has been shown to be one of the most power-hungry parts of the system. This paper introduces an architectural enhancement for the instruction memory that reduces energy consumption and improves performance. The proposed distributed instruction memory organization requires minimal hardware overhead and supports the parallel execution of multiple incompatible loops in a uni-processor system. We present and compare different methods of implementing the loop controller architecture, and show that distributing the instruction memory also helps to reduce the interconnect cost. This architectural enhancement can reduce the energy consumed in the instruction memory hierarchy by 59% and improve performance by 22% compared to enhanced hardware-based SMT architectures.