The paper considerably extends the multiprocessor scheduling techniques of G.N.S. Prasanna and B.R. Musicus (1995; 1991) and applies them to matrix arithmetic compilation. Using optimal control theory, Prasanna and Musicus derived closed-form solutions for task graphs formed from parallel and series connections in the special case where the speedup function of each task is p^α (where p is the amount of processing power applied to the task). The paper extends these results to arbitrary DAGs. The optimality conditions impose nonlinear constraints on the flow of processing power from predecessors to successors, and on the finishing times of sibling tasks. The paper presents a fast algorithm for determining and solving these nonlinear equations. The algorithm exploits the structure of the finishing-time equations to run a conjugate gradient minimization efficiently, leading to the optimal solution. The algorithm has been tested on a variety of DAGs commonly encountered in matrix arithmetic. The results show that when the p^α speedup assumption holds, the schedules produced are superior to those of heuristic approaches. The algorithm has also been applied to compiling matrix arithmetic (K.P. Belkhale and P. Banerjee, 1993) for the MIT Alewife machine, a distributed shared-memory multiprocessor. While matrix arithmetic tasks do not exactly satisfy the p^α speedup assumption, the algorithm can still be applied as a good heuristic. The results show that the schedules produced by our algorithm are faster than those of alternative heuristic techniques.
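As a minimal illustration of the p^α speedup model and the equal-finishing-time condition on siblings that the abstract mentions, the sketch below (an assumption-laden toy, not the paper's algorithm) allocates a total power budget P across independent parallel tasks. Under the model, a task with work w run at power p finishes in time w / p^α; requiring all siblings to finish simultaneously while the powers sum to P forces p_i proportional to w_i^(1/α). The function name `parallel_split` and the example numbers are hypothetical.

```python
# Toy sketch of the p^alpha speedup model for independent (parallel) tasks.
# A task with work w given power p finishes in time w / p**alpha.
# Imposing the sibling condition (all tasks finish at the same time T)
# together with sum(p_i) = P yields p_i proportional to w_i**(1/alpha).

def parallel_split(works, P, alpha):
    """Return (powers, makespan) for parallel tasks under p^alpha speedup."""
    shares = [w ** (1.0 / alpha) for w in works]   # unnormalized p_i
    total = sum(shares)
    powers = [P * s / total for s in shares]       # scale so powers sum to P
    makespan = (total / P) ** alpha                # common finishing time T
    return powers, makespan

# Hypothetical example: two tasks with work 4 and 1, total power 3, alpha = 0.5.
powers, T = parallel_split([4.0, 1.0], P=3.0, alpha=0.5)
# Every task finishes at the same time T, and the powers use the full budget.
```

Note that with α < 1 (sublinear speedup), the heavier task gets a more than proportional share of power, which is the qualitative behavior the closed-form series/parallel solutions capture.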