This paper presents an instruction scheduling and cluster assignment approach for clustered very long instruction word (VLIW) processors. The technique produces high-performance code by simultaneously balancing instructions among clusters and minimizing inter-cluster data communication. The scheme is evaluated on benchmarks extracted from UTDSP. Results show speed-ups of up to 44% over previously used techniques, with average speed-ups ranging from 14% (2-cluster) to 18% (4-cluster).
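
To illustrate the trade-off the abstract names, the following is a minimal sketch of a greedy cluster assignment that weighs cluster load against inter-cluster copies. It is not the paper's algorithm, which the abstract does not specify. The function names (`assign_clusters`, `count_copies`), the `comm_penalty` weight, the unit-cost load model, and the example dependence graph are all assumptions made for illustration.

```python
def assign_clusters(instrs, deps, num_clusters, comm_penalty=1.0):
    """Greedily assign each instruction to a cluster (hypothetical sketch).

    instrs: instruction names in topological (dependence-respecting) order.
    deps:   maps an instruction to the list of instructions it reads from.
    Cost of placing an instruction on cluster c is assumed to be
    load[c] + comm_penalty * (number of operands produced on another cluster).
    """
    assignment = {}
    load = [0] * num_clusters
    for instr in instrs:
        best_cluster, best_cost = None, None
        for c in range(num_clusters):
            # Each operand produced on a different cluster needs a copy.
            copies = sum(1 for p in deps.get(instr, ()) if assignment[p] != c)
            cost = load[c] + comm_penalty * copies
            if best_cost is None or cost < best_cost:
                best_cluster, best_cost = c, cost
        assignment[instr] = best_cluster
        load[best_cluster] += 1  # assumed: every instruction costs one slot
    return assignment, load


def count_copies(assignment, deps):
    """Count dependence edges whose producer and consumer are on different clusters."""
    return sum(
        1
        for instr, preds in deps.items()
        for p in preds
        if assignment[p] != assignment[instr]
    )


# Two independent dependence chains, scheduled onto a 2-cluster machine.
instrs = ["a1", "b1", "a2", "b2", "a3", "b3"]
deps = {"a2": ["a1"], "a3": ["a2"], "b2": ["b1"], "b3": ["b2"]}
assignment, load = assign_clusters(instrs, deps, num_clusters=2, comm_penalty=2.0)
print(assignment, load, count_copies(assignment, deps))
```

In this example each chain stays on its own cluster, so the load is balanced and no inter-cluster copies are needed. A real scheduler would also model functional-unit constraints, operation latencies, and copy latency, all of which this sketch omits.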
Published in: Tsinghua Science and Technology (Volume: 15, Issue: 3)
Date of Publication: June 2010