This paper presents performance and step-bp-step complexity analysis of two different design alternatives of multithreaded architecture: dynamic inter-thread resource scheduling and static resource allocation. We show that with two concurrent threads the dynamic scheduling processor achieves from 5 to 45% higher performance at the cost of much more complicated design. The paper shows that for a relatively high number of execution resources the complexity of the dynamic scheduling logic will inevitably require design compromises. Moreover, high chip-wide communication time and an “incomplete bypassing network” will force the dynamic scheduling to use static-like execution unit assignment, thus reducing its performance advantage. At the same transistor budget the static architecture may implement additional functional units, resulting in better overall performance
Published in:
Parallel Architectures and Compilation Techniques, 1996., Proceedings of the 1996 Conference on
Date of Conference: Oct 1996