Our loop-parallelizing method for SIMD compilers enables SIMD instructions to be generated from loops that contain complicated data dependences. Its distinguishing feature is that it selects the more effective of two shearing transformations according to the data dependences carried by the inner and outer loops. One transformation is novel and shears horizontally along the inner loop index; the other is the well-established shear applied vertically along the outer loop index. Both loop transformations are formalized as matrix operations. They allow the original loop indexes to be expressed in terms of the new loop indexes, so the compiler does not need to change the loop body. With the loop body left intact, simple templates suffice to generate optimal code. We conclude by summarizing the conditions for choosing the suitable shearing method and the requirements for applying the transformation.
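To illustrate how a shear expressed as a matrix lets the original indexes be recovered from the new ones, here is a minimal sketch in Python. It is not the method of this paper: the recurrence, the function names `original` and `sheared`, and the matrix T = [[1, 1], [1, 0]] (a generic wavefront-style shear) are all assumptions chosen for illustration. The point it shows is that the loop body stays textually identical, and only the loop bounds and the index recovery change.

```python
# Illustrative sketch only: the recurrence, names, and shear matrix are
# assumptions, not the transformations defined in the paper.

def original(n, m):
    """Loop nest in which both the outer and the inner loop carry a dependence."""
    a = [[1] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            a[i][j] = a[i - 1][j] + a[i][j - 1]
    return a


def sheared(n, m):
    """Same computation after the shear (t, p) = T (i, j), T = [[1, 1], [1, 0]].

    New indexes: t = i + j (wavefront number), p = i.
    Inverse:     i = p,     j = t - p.
    Every iteration with wavefront number t reads only values produced at
    wavefront t - 1, so the inner loop over p carries no dependence and
    could be mapped to SIMD lanes.
    """
    a = [[1] * (m + 1) for _ in range(n + 1)]
    for t in range(2, n + m + 1):
        # Bounds follow from 1 <= i <= n and 1 <= j = t - i <= m.
        for p in range(max(1, t - m), min(n, t - 1) + 1):
            i, j = p, t - p  # original indexes expressed via the new ones
            a[i][j] = a[i - 1][j] + a[i][j - 1]  # loop body unchanged
    return a


if __name__ == "__main__":
    print(original(4, 5) == sheared(4, 5))
```

Because the body refers only to `i` and `j`, a code generator in this sketch would need to emit just the new bounds and the two assignments that recover `i` and `j`, which is the kind of fixed pattern a simple template can cover.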