Future manycore Systems-on-Chip will integrate tens or even hundreds of cores. Tiled architectures have become a focus of research and industry. Such platforms group processing cores into clusters and connect these "tiles" through a global interconnect. Message-passing programming models are favored for programming such complex distributed-memory systems. Message-passing communication, however, incurs a significant performance overhead, especially for collective communication, which involves several tasks in a single operation. To tackle this overhead, we propose a concept for an interface between processing elements and a Network-on-Chip. The primary idea is to offload processing-intensive functionality from software to this interface. This includes collective communication operations such as broadcast, scatter, gather, and reduction. This paper presents the conceptual design of the functionality of such a Smart Network Adapter. An analytical estimation of the performance gain shows promising results that support our work in progress.
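For readers unfamiliar with the four collective operations named above, the following minimal Python sketch illustrates their data-movement semantics only. It is not the Smart Network Adapter design and models neither hardware nor timing. The function names, the list-based representation of tasks, and the assumption that scattered data divides evenly among tasks are all illustrative choices, not taken from the paper.

```python
from functools import reduce
import operator


def broadcast(root_value, n_tasks):
    """Root sends the same value to every task (one copy per task)."""
    return [root_value] * n_tasks


def scatter(root_data, n_tasks):
    """Root splits its data into n_tasks contiguous chunks, one per task.

    Assumption (illustrative): len(root_data) is divisible by n_tasks.
    """
    size = len(root_data) // n_tasks
    return [root_data[i * size:(i + 1) * size] for i in range(n_tasks)]


def gather(local_chunks):
    """Root collects the chunks of all tasks, concatenated in task order."""
    return [item for chunk in local_chunks for item in chunk]


def reduce_op(local_values, op=operator.add):
    """Root combines one value per task with an associative operator."""
    return reduce(op, local_values)


if __name__ == "__main__":
    chunks = scatter([1, 2, 3, 4, 5, 6, 7, 8], 4)
    print(broadcast(7, 4))
    print(chunks)
    print(gather(chunks))
    print(reduce_op([1, 2, 3, 4]))
```

In a software-only implementation, each of these operations expands into several point-to-point messages handled by the cores, which is the overhead the proposed interface aims to offload.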