Abstract:
Edge computing aims to enable swift, real-time data processing, analysis, and storage close to the data source. However, edge computing platforms are often constrained by...Show MoreMetadata
Abstract:
Edge computing aims to enable swift, real-time data processing, analysis, and storage close to the data source. However, edge computing platforms are often constrained by limited processing power and efficiency. This paper presents DFU-E, a dataflow-based accelerator specifically designed to meet the demands of edge digital signal processing (DSP) and artificial intelligence (AI) applications. Our design addresses real-world requirements with three main innovations. First, to accommodate the diverse algorithms utilized at the edge, we propose a multi-layer dataflow mechanism capable of exploiting task-level, instruction block-level, instruction-level, and data-level parallelism. Second, we develop an edge dataflow architecture that includes a customized processing element (PE) array, memory, and on-chip network microarchitecture optimized for the multi-layer dataflow mechanism. Third, we design an edge dataflow software stack that enables automatic optimizations through operator fusion, dataflow graph mapping, and task scheduling. We utilize representative real-world DSP and AI applications for evaluation. Comparing with Nvidia's state-of-the-art edge computing processor, DFU-E achieves up to 1.42× geometric mean performance improvement and 1.27× energy efficiency improvement.
Published in: IEEE Transactions on Parallel and Distributed Systems ( Volume: 36, Issue: 6, June 2025)