I. Introduction
Many deep neural network (DNN) accelerators share common architectural features, such as a large array of specialized processing elements (PEs) interconnected through a specialized Network-on-Chip (NoC). Although individual accelerators target specific sectors, including mobile, automotive, and data center, the current trend is toward scalable architectures that cover a broad computing spectrum, from mobile IoT devices to large-scale data centers, by using multi-chip-module (MCM) based designs [1].