Skip to Main Content
The newly emerging many-core-on-a-chip designs have renewed an intense interest in parallel processing. By applying Amdahl's formulation to the programs in the PARSEC and SPLASH-2 benchmark suites, we find that most applications may not have sufficient parallelism to efficiently utilize modern parallel machines. The long sequential portions in these application programs are caused by computation as well as communication latency. However, value prediction techniques may allow the ldquoparallelizationrdquo of the sequential portion by predicting values before they are produced. In conventional superscalar architectures, the computation latency dominates the sequential sections. Thus, value prediction techniques may be used to predict the computation result before it is produced. In many-core architectures, since the communication latency increases with the number of cores, value prediction techniques may be used to reduce both the communication and computation latency. In this paper, we extend Amdahl's formulation to model the data redundancy inherent to each benchmark, thereby identifying the potential of value prediction techniques. Our analysis shows that the performance of PARSEC benchmarks may improve by a factor of 180 and 230 percent for the SPLASH-2 suite, compared to when only the intrinsic parallelism is considered. This demonstrates the immense potential of fine-grained value prediction in reducing the communication latency in many-core architectures.