To complement the flexible, fine-grain logic in field programmable gate arrays (FPGAs), configurable hardware devices now incorporate more complex coarse-grain components such as memories, embedded processing units and fused-arithmetic units. These components provide speed and density advantages due to their specialised logic and fixed interconnect. In this paper, a methodology is presented to automatically propose and explore the benefits of different types of fused-arithmetic units for configurable devices. The methods are based on common subgraph extraction techniques, which make it possible to explore different subcircuits that occur frequently across a set of benchmarks. A quantitative analysis is performed of the various fused-arithmetic circuits identified by our tool, which are then automatically synthesised to an ASIC process, providing a study of the speed and area benefits of the components. For particular silicon cores identified by our approach, we report average improvements of up to 3.3× in speed and 19.7× in area compared with the same subcircuits implemented in a commercial mixed-granularity FPGA in a comparable 90 nm technology. The average improvements across all embedded cores identified by our approach are 1.67× and 5.55×, respectively, when the ASIC cores are designed for fastest speed performance.
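The idea of finding subcircuits that recur across a benchmark set can be illustrated with a deliberately simplified sketch. It is not the tool described in this paper: it assumes each benchmark is a small dataflow graph of arithmetic operators and only counts two-operator patterns (a producer feeding a consumer), whereas full common subgraph extraction handles larger subgraphs. The function names, the graph encoding and the toy benchmarks are all hypothetical.

```python
from collections import Counter


def edge_patterns(nodes, edges):
    """Count (producer_op, consumer_op) pairs in one dataflow graph.

    nodes: dict mapping node id -> operator name (assumed encoding)
    edges: list of (src_id, dst_id) tuples
    """
    return Counter((nodes[src], nodes[dst]) for src, dst in edges)


def frequent_patterns(benchmarks, min_benchmarks=2):
    """Return patterns found in at least `min_benchmarks` graphs.

    The result maps each pattern to its total number of occurrences
    across all benchmarks.
    """
    total = Counter()    # occurrences summed over all graphs
    support = Counter()  # number of graphs containing the pattern
    for nodes, edges in benchmarks:
        counts = edge_patterns(nodes, edges)
        total.update(counts)
        support.update(counts.keys())
    return {p: total[p] for p in total if support[p] >= min_benchmarks}


# Hypothetical toy benchmarks, not taken from the paper.
fir = ({0: "mul", 1: "mul", 2: "add", 3: "add"}, [(0, 2), (1, 2), (2, 3)])
dot = ({0: "mul", 1: "add", 2: "sub"}, [(0, 1), (1, 2)])

RESULT = frequent_patterns([fir, dot])
print(RESULT)
```

In this toy input only the multiply-then-add pattern appears in both graphs, so it is the only candidate reported. A pattern of this kind is the sort of subcircuit that might be considered as a fused-arithmetic unit.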