Abstract:
Event-driven neuromorphic processors for artificial intelligence (AI) inference on edge/IoT devices require large on-chip memory capacity for efficient execution of spiking neural networks (NNs). In this work, we evaluate 3-D stacking benefits on SENECA, a digital neuromorphic accelerator core, sweeping its on-chip memory capacity from 2 to 32 Mb in both legacy planar and advanced nanosheet CMOS logic nodes. In a planar CMOS node (GF-22 nm), two-die memory-on-logic (MoL) partitioning enables 8× more on-chip memory and boosts operating frequency by 7% with 26% less power than the 2-D implementation. Moving to an advanced nanosheet technology (imec A10), multidie (up to 7 dies) MoL stacking enables a performance increase of up to 29% and power savings of up to 31%. Furthermore, core folding (CF) partitioning in A10 shows up to 16% performance improvement with 12% total power savings with respect to the 2-D implementation on the same technology. We also demonstrate no thermal overhead for multidie stacking at advanced nodes for designs exhibiting low power density. These physical design explorations lay the foundation for system technology co-optimization studies for edge devices.
Published in: IEEE Transactions on Very Large Scale Integration (VLSI) Systems (Volume: 32, Issue: 11, November 2024)