Skip to Main Content
This paper presents a synchronization framework for parallel computing heterogeneous processing elements, which are controlled by a RISC processor. The communication delay between RISC and processing elements is a key issue if the RISC is not closely attached to the processing elements. Recent synchronization approaches neglect communication delays or require low communication delays. This results in a low synchronization rate between RISC and PEs. In order to overcome this delay, a special hardware-based synchronization approach is proposed that reduces the communication overhead and increases the number of executable tasks per time unit. Further, it supports parallel execution of independent hardware tasks. The approach was evaluated for a modular coprocessor architecture containing several processing elements for image processing tasks. The coarse-grained parallel execution of independent tasks significantly improves the speed of an exemplary application for aerial image based vehicle detection on straight highways.