The demand for performance in embedded applications is expected to lead to the use of dynamic execution in embedded processors within the next few years. However, complete out-of-order superscalar cores are still expensive in terms of silicon area and power dissipation. In this paper, we study the suitability of a more limited form of dynamic execution, namely the decoupled architecture, for embedded applications. A decoupled architecture is known to work very efficiently as long as execution does not suffer from inter-processor dependencies that cause a loss of decoupling, called LOD events. In this study, we characterize the regularity of codes in terms of the LOD events that may occur. We address three aspects of regularity: control regularity, control/memory dependency, and patterns of referencing memory data. Most of the kernels in MiBench are amenable to efficient execution on a decoupled architecture.