In this article, we present an optimization method for the parallelized execution of irregular fine-grain computations. The method was implemented using the pseudo-vector processing (PVP) and sliding window register (SWR) mechanisms provided in the Hitachi SR2201 supercomputer. The general idea of PVP and SWR relies on optimizing access to large contiguous regions of memory and on the parallel execution of three kinds of operations placed in loops: loading data, storing data, and arithmetic operations. The main disadvantage of these mechanisms is that a gain can be obtained only for long loops with regular expressions inside them. In our method, we focus on irregular computations, devoid of any predictable dependencies. We divided a given code into parts and manually optimized the relations between load and store operations, taking into consideration the memory latency and the delays in accessing the needed data. In our implementation, we obtained a speedup by simply reordering the sequences of access operations to registers and memory.
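The reordering idea can be illustrated with a minimal sketch in portable C. This is not the authors' code and does not use the SR2201's PVP or SWR facilities, which plain C cannot express. The function names `gather_dot_naive` and `gather_dot_preload`, and the indirect dot product itself, are hypothetical choices made only for illustration. The second function issues the loads for iteration i+1 before the arithmetic of iteration i, so that on hardware with non-blocking loads the memory latency may overlap with computation.

```c
#include <assert.h>
#include <stddef.h>

/* Baseline: each iteration loads its operands and uses them immediately,
   so the arithmetic has to wait for the indirect (irregular) load. */
double gather_dot_naive(const double *a, const double *b,
                        const size_t *idx, size_t n)
{
    double sum = 0.0;
    for (size_t i = 0; i < n; ++i) {
        sum += a[idx[i]] * b[i];
    }
    return sum;
}

/* Reordered (illustrative assumption, not the paper's implementation):
   the loads for iteration i+1 are placed before the arithmetic of
   iteration i. The additions happen in the same order as in the
   baseline, so both functions return the same value. */
double gather_dot_preload(const double *a, const double *b,
                          const size_t *idx, size_t n)
{
    if (n == 0) {
        return 0.0;
    }
    assert(a != NULL && b != NULL && idx != NULL);

    double sum = 0.0;
    double cur_a = a[idx[0]];   /* prologue: loads for iteration 0 */
    double cur_b = b[0];

    for (size_t i = 0; i + 1 < n; ++i) {
        double next_a = a[idx[i + 1]];  /* start loads for iteration i+1 */
        double next_b = b[i + 1];
        sum += cur_a * cur_b;           /* arithmetic for iteration i */
        cur_a = next_a;
        cur_b = next_b;
    }

    sum += cur_a * cur_b;       /* epilogue: arithmetic for the last iteration */
    return sum;
}
```

Whether this ordering survives to the machine code depends on the compiler, which is free to reschedule the loads itself. Any actual gain would have to be measured on the target machine.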