Skip to Main Content
The task graph of telecommunication applications often exhibits massive coarse grained parallelism, which can be exploited by an on-chip multiprocessor. In many cases it can be organized into several subsequent stages, each containing dozens or even hundreds of identical tasks. We implement communications between tasks via software channels mapped to on-chip memory, allowing for multiple readers and writers to access them in arbitrary order. Our architecture is based on the shared memory paradigm. The interconnection network is hierarchical, so that communication latencies vary with the distance between the cluster where the task is located and the cluster on which the channel is placed. Moreover, packet sizes and arrival rates are subject to strong variations. An analytical approach to dimensioning the channels is thus near impossible. Within a purely simulation based approach, we gain insight into the performance of such software channels.