Heterogeneous coarse-grained processing elements: A template architecture for embedded processing acceleration
Ansaloni, G.
Bonzini, P.
Pozzi, L.
Faculty of Informatics, University of Lugano (USI), Lugano
Abstract
Reconfigurable architectures are good candidates for application accelerators that cannot be set in stone at production time. FPGAs, however, often suffer from the area and performance penalties intrinsic to gate-level reconfigurability. To reduce this overhead, coarse-grained reconfigurable arrays (CGRAs) are reconfigurable at the ALU level, but a successful design needs more than computational power: the main bottleneck is usually memory transfers. Just as the integration of hardwired multiplier and memory blocks enabled FPGAs to implement digital signal processing applications efficiently, in this paper we study a customizable architecture template based on heterogeneous processing elements (multipliers, ALU clusters, and memories) that provides enough flexibility to realize fast pipelined implementations of various loop kernels on a CGRA.