In this paper, we propose a hardware/software partitioning method for improving application performance in embedded systems. Critical software parts are accelerated on the hardware of a generic single-chip system comprising an embedded processor and coarse-grain reconfigurable hardware. The reconfigurable hardware is realized as a 2D array of processing elements. The partitioning flow uses an analysis procedure at the basic-block level to detect kernels in software. A list-based mapping algorithm has been developed to estimate the execution cycles of kernels on coarse-grain reconfigurable arrays. The proposed partitioning flow has been largely automated for program descriptions in the C language. Extensive hardware/software experiments on five real-life applications are presented. On average, the benchmarks spend 69% of their instruction count in kernel code that makes up only 11% of their total code. The results show that mapping critical code onto coarse-grain reconfigurable hardware achieves speedups ranging from 1.2 to 3.7, with an average of 2.2.
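The idea of detecting kernels at the basic-block level can be illustrated with a minimal sketch. It is not the paper's analysis procedure: the `BasicBlock` record, the `detect_kernels` function, the greedy ranking by dynamic instruction count, the 60% coverage threshold, and the profile numbers are all assumptions chosen only for illustration. The sketch ranks blocks by how many instructions they execute at run time and reports the share of dynamic instructions and of static code that the selected kernels account for.

```python
from dataclasses import dataclass


@dataclass
class BasicBlock:
    name: str
    size: int        # static instruction count of the block (hypothetical)
    exec_count: int  # times the block ran during profiling (hypothetical)


def dynamic_weight(block):
    """Instructions this block contributes to the dynamic instruction count."""
    return block.size * block.exec_count


def detect_kernels(blocks, coverage=0.6):
    """Greedily pick the heaviest blocks until they cover the requested
    fraction of the dynamic instruction count (an assumed heuristic)."""
    total = sum(dynamic_weight(b) for b in blocks)
    if total == 0:
        return []
    kernels, covered = [], 0
    for block in sorted(blocks, key=dynamic_weight, reverse=True):
        if covered >= coverage * total:
            break
        kernels.append(block)
        covered += dynamic_weight(block)
    return kernels


def shares(blocks, kernels):
    """Return (dynamic instruction share, static code share) of the kernels."""
    dyn = sum(dynamic_weight(b) for b in kernels) / sum(dynamic_weight(b) for b in blocks)
    static = sum(b.size for b in kernels) / sum(b.size for b in blocks)
    return dyn, static


# Made-up profile data, not taken from the paper's benchmarks.
blocks = [
    BasicBlock("init", 40, 1),
    BasicBlock("loop_body", 10, 1000),
    BasicBlock("filter", 8, 500),
    BasicBlock("cleanup", 30, 1),
]

kernels = detect_kernels(blocks)
dyn_share, static_share = shares(blocks, kernels)
print([b.name for b in kernels], f"{dyn_share:.1%} dynamic", f"{static_share:.1%} static")
```

With this toy profile, a single small loop body dominates the dynamic instruction count while occupying a small part of the code, which is the kind of imbalance the reported 69% and 11% averages describe.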