The increasing system complexity of SoC applications creates a growing demand for powerful embedded DSP processors. To increase the performance of DSP processors, the number of instructions executed in parallel has been increased, and VLIW (very long instruction word) has been introduced to program the parallel units. Traditional VLIW architectures suffer from poor code density and therefore from high area consumption caused by the program memory. To overcome this limitation, the proposed configurable DSP core supports unaligned program memory. To reduce the size of the program memory port, an execution bundle can be mapped onto several fetch bundles. To overcome the resulting memory bandwidth mismatch between fetch bundles and execution bundles, an instruction buffer is introduced. Using the instruction buffer during the execution of inner loops reduces the power dissipation of the DSP subsystems. Cache logic controls the entries of the instruction buffer during out-of-order execution. This paper describes the architecture and the implementation of the instruction buffer, which is part of a project for a configurable DSP core.
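The relationship between fetch bundles, execution bundles, and the instruction buffer can be illustrated with a small behavioral model. This is only a sketch under stated assumptions, not the implementation described in the paper: the fetch width of four instruction words, the class name `InstructionBuffer`, the method `read_execution_bundle`, and the least-recently-used replacement policy are all hypothetical choices made for illustration. The abstract does not specify the buffer size, the bundle widths, or how the cache logic selects entries.

```python
from collections import OrderedDict

FETCH_WIDTH = 4  # instruction words per fetch bundle (assumed value)


class InstructionBuffer:
    """Toy model: buffers aligned fetch bundles, serves unaligned execution bundles."""

    def __init__(self, program, capacity):
        self.program = program        # program memory as a list of instruction words
        self.capacity = capacity      # number of fetch bundles the buffer can hold
        self.entries = OrderedDict()  # fetch-bundle index -> tuple of words
        self.memory_fetches = 0       # accesses to program memory

    def _fetch_bundle(self, index):
        # Buffer hit: no program-memory access is needed.
        if index in self.entries:
            self.entries.move_to_end(index)
            return self.entries[index]
        # Buffer miss: read one aligned fetch bundle from program memory.
        start = index * FETCH_WIDTH
        words = tuple(self.program[start:start + FETCH_WIDTH])
        self.memory_fetches += 1
        self.entries[index] = words
        # Assumed policy: evict the least recently used entry when full.
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)
        return words

    def read_execution_bundle(self, address, length):
        """Assemble an execution bundle that may span several fetch bundles."""
        words = []
        for addr in range(address, address + length):
            bundle = self._fetch_bundle(addr // FETCH_WIDTH)
            words.append(bundle[addr % FETCH_WIDTH])
        return words


if __name__ == "__main__":
    program = [f"op{i}" for i in range(16)]
    buf = InstructionBuffer(program, capacity=2)
    # Inner loop of two execution bundles at unaligned addresses 3 and 6.
    for _ in range(10):
        buf.read_execution_bundle(3, 3)
        buf.read_execution_bundle(6, 2)
    print("program-memory fetches:", buf.memory_fetches)
```

In this toy setup the loop body spans two fetch bundles, so the ten iterations cause only the two initial program-memory fetches and every later access is served from the buffer. That is one plausible reading of how buffering inner loops could reduce program-memory activity and therefore power dissipation.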