
Compilation scheme for near fine grain parallel processing on a multiprocessor system without explicit synchronization


4 Author(s)
Ogata, W. ; Dept. of Electr. Eng., Waseda Univ., Tokyo, Japan ; Fujimoto, K. ; Oota, M. ; Kasahara, H.

Fortran parallelizing compilers for multiprocessor systems have relied mainly on loop parallelization. However, there remain loops to which the Do-all and Do-across techniques cannot be applied effectively because of loop-carried dependences and conditional branches out of the loops. Moreover, such compilers do not exploit the parallelism among subroutines, loops, and basic blocks, nor the near-fine-grain parallelism inside basic blocks that lie outside loops or in sequential loops. It is therefore important to exploit coarse-grain and near-fine-grain parallelism in addition to loop parallelization. Taking these facts into consideration, the authors propose a multigrain parallel processing scheme that hierarchically combines coarse-grain parallel processing (macro-dataflow processing), loop parallelization, and near-fine-grain parallel processing. To minimize data transfer overhead and total processing time, the proposed compilation scheme uses a static scheduling algorithm called CP/DT/MISF (critical path/data transfer/most immediate successors first). To minimize synchronization overhead, the compilation scheme eliminates all synchronization code by machine-clock-level precise code scheduling for the target multiprocessor system OSCAR. The scheme has been implemented on OSCAR, and a performance evaluation shows that the proposed near-fine-grain parallel processing without synchronization reduces the processing time of test programs by 30% to 40% compared with conventional near-fine-grain parallel processing with synchronization code.
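The priority rule behind CP/MISF-family schedulers can be illustrated with a small sketch. This is not the authors' CP/DT/MISF implementation (it omits the data-transfer term and OSCAR-specific timing); it is a minimal list scheduler, assuming unit-independent task costs, where ready tasks are ranked first by critical-path length to the exit node and ties are broken by the number of immediate successors:

```python
# Hedged sketch of CP/MISF-style list scheduling (NOT the paper's exact
# CP/DT/MISF algorithm; the data-transfer term is omitted for brevity).
from collections import defaultdict


def critical_path_lengths(tasks, succ):
    """Longest path from each task to an exit node, including its own cost."""
    memo = {}

    def cp(t):
        if t not in memo:
            memo[t] = tasks[t] + max((cp(s) for s in succ.get(t, [])), default=0)
        return memo[t]

    for t in tasks:
        cp(t)
    return memo


def cp_misf_schedule(tasks, succ, n_procs):
    """tasks: {name: cost}; succ: {name: [successor names]} (a DAG).
    Returns {processor index: [(task, start, end), ...]}."""
    pred_count = defaultdict(int)
    for t, ss in succ.items():
        for s in ss:
            pred_count[s] += 1
    cp = critical_path_lengths(tasks, succ)
    ready = [t for t in tasks if pred_count[t] == 0]
    proc_free = [0.0] * n_procs   # time each processor becomes idle
    finish = {}                   # finish time of each scheduled task
    schedule = defaultdict(list)
    while ready:
        # Highest priority: longest critical path; tie-break: most
        # immediate successors (the "MISF" rule).
        ready.sort(key=lambda t: (cp[t], len(succ.get(t, []))), reverse=True)
        t = ready.pop(0)
        # Earliest start: all predecessors must have finished.
        est = max((finish[p] for p, ss in succ.items() if t in ss), default=0.0)
        p = min(range(n_procs), key=lambda i: max(proc_free[i], est))
        start = max(proc_free[p], est)
        end = start + tasks[t]
        proc_free[p] = end
        finish[t] = end
        schedule[p].append((t, start, end))
        for s in succ.get(t, []):
            pred_count[s] -= 1
            if pred_count[s] == 0:
                ready.append(s)
    return dict(schedule)
```

In the full CP/DT/MISF algorithm described in the paper, the priority additionally accounts for data-transfer cost between tasks, and the resulting static schedule is precise enough at the machine-clock level that explicit synchronization code can be removed.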

Published in:

Proceedings of the IEEE Pacific Rim Conference on Communications, Computers, and Signal Processing, 1995

Date of Conference:

17-19 May 1995