Several compile-time loop transformations have been developed to expose parallelism in loops with simple dependencies. However, once an irregular data dependence is detected, usually no attempt is made to extract any parallel threads from the loop. In this paper, we present parallel region execution, a new compile-time approach for improving the execution of loops with complex dependencies. It consists of dividing the iteration space of the loop into parallel regions and serial regions, such that all iterations within a parallel region can be executed in parallel. Our parallel region execution technique has been tested on the MasPar machine on various examples and generally yielded a large speedup.
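To make the idea of dividing an iteration space into parallel and serial regions concrete, the following is a minimal illustrative sketch, not the algorithm of the paper. It assumes a one-dimensional iteration space and a hypothetical `deps` mapping from each iteration to the earlier iterations it depends on, and it uses a simple greedy scan chosen purely for illustration. The function name `split_regions` and the region labels are likewise assumptions.

```python
def split_regions(n, deps):
    """Greedily split iterations 0..n-1 into contiguous regions.

    deps[i] is the set of earlier iterations that iteration i depends on
    (a hypothetical input format, assumed for this sketch).
    Returns a list of (kind, start, end) tuples with `end` exclusive.
    """
    # Step 1: cut the iteration space into maximal contiguous blocks in
    # which no iteration depends on an earlier iteration of the same block.
    blocks = []
    start = 0
    for i in range(1, n + 1):
        if i == n or any(start <= j < i for j in deps.get(i, ())):
            blocks.append((start, i))
            start = i

    # Step 2: blocks with several iterations become parallel regions;
    # adjacent single-iteration blocks are merged into one serial region.
    regions = []
    for s, e in blocks:
        if e - s > 1:
            regions.append(("parallel", s, e))
        elif regions and regions[-1][0] == "serial":
            regions[-1] = ("serial", regions[-1][1], e)
        else:
            regions.append(("serial", s, e))
    return regions


# Iterations 4, 5 and 6 form a dependence chain; the rest are independent.
example = split_regions(8, {4: {3}, 5: {4}, 6: {5}})
print(example)
```

In this sketch the regions must be executed in order, with the iterations of a parallel region run concurrently and those of a serial region run one after another. For the example input, iterations 0 to 3 and 6 to 7 fall into parallel regions, while iterations 4 and 5 form a serial region.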