Modern High Level Synthesis (HLS) tools are now efficient at generating RTL models from algorithmic descriptions of the target hardware accelerators but they still do not manage memory hierarchies. Memory hierarchies are efficiently optimized by performing code transformations prior to HLS in frameworks which exploit the linearity of the mapping functions between loop indexes and memory references (called linear kernels). Unfortunately, non-linear kernels are algorithms which do not benefit of such classical frameworks, because of the disparity of the non-linear functions to compute their memory references. In this paper we propose a method to design non-linear kernels in a HLS flow, which can be seen as a code pre-processing. The method starts from an algorithmic description and generates an enhanced algorithmic description containing both the non-linear kernel and an optimized memory hierarchy. The transformation and the associated optimization process provides a significant gain when compared to a standard optimization. Experiments on benchmarks show an average reduction of 28% of the external memory traffic and about 32 times of the embedded memory size.
Published in:
Design, Automation & Test in Europe Conference & Exhibition (DATE), 2012
Date of Conference: 12-16 March 2012