As computational demands grow to meet the design requirements of computation-intensive applications on today's systems, the pressure to develop high-performance parallel processors on a chip will increase. Network-on-Chip (NoC) techniques, which interconnect multiple processing elements through routers, are a solution for reducing computation time and power consumption through on-chip parallel processing. Such a shared communication platform is also essential to meeting the scalability and complexity challenges of System-on-Chip (SoC) design. However, few parallel applications have been studied on such an architecture, and workload characterization has not been explored as a means of guiding architecture design optimization. In this paper, we study multiple data-parallel applications on a multicore NoC architecture with a distributed memory space. We introduce an efficient runtime workload distribution algorithm that balances the workloads of the parallel processors, and we apply it to selected embedded applications. Using our cycle-accurate multicore simulator, we simulated our NoC-enabled multicore architecture model, executed the data-parallel applications on varying numbers of processing elements with the proposed runtime load-balancing algorithm, and analyzed the resulting performance and communication overheads.