Computation- and communication-intensive applications such as scientific data analysis and data visualization are common in grid computing environments. These applications can be divided into a sequence of pipeline stages that execute concurrently on different grid resources to achieve high performance. Finding the optimal placement of pipeline stages on a grid is difficult because computation and communication costs must be considered together. This paper proposes a solution that maximizes application throughput by integrating pipeline placement with data routing. The solution minimizes the computation bottleneck of a pipeline while preventing the communication cost between successive stages from dominating the overall processing time. It consists of two novel methods. The first is single-path pipeline execution, which fully exploits temporal parallelism. The second is multipath pipeline execution, which leverages both the temporal and the spatial parallelism inherent in pipeline applications. We evaluate the proposed methods with a set of experiments run in a real grid environment. Compared with several traditional placement methods, the proposed methods achieve the highest throughput.
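To make the trade-off between computation and communication bottlenecks concrete, the following is a minimal sketch and not the paper's algorithm. It assumes a linear pipeline in which computation and communication overlap, so steady-state throughput is the reciprocal of the slowest compute or transfer step, and it substitutes a brute-force search over single-path placements for the proposed methods. All names and parameters (`work`, `speed`, `data`, `bandwidth`) are hypothetical.

```python
from itertools import permutations


def pipeline_throughput(compute_times, transfer_times):
    """Items per second in steady state for a linear pipeline.

    compute_times[i]:  seconds stage i needs per item on its node.
    transfer_times[i]: seconds to send one item from stage i to stage i+1.
    Assumption: stages and links work concurrently, so the slowest
    step (the bottleneck) sets the pace.
    """
    bottleneck = max(list(compute_times) + list(transfer_times))
    return 1.0 / bottleneck


def best_single_path_placement(work, speed, data, bandwidth):
    """Exhaustively try every one-stage-per-node placement.

    work[i]:         operations per item for stage i
    speed[n]:        operations per second of node n
    data[i]:         bytes sent from stage i to stage i+1
    bandwidth[a][b]: bytes per second from node a to node b
    Returns (throughput, nodes) where nodes[i] hosts stage i.
    """
    best = (0.0, None)
    for nodes in permutations(range(len(speed)), len(work)):
        compute = [work[i] / speed[n] for i, n in enumerate(nodes)]
        transfer = [
            data[i] / bandwidth[nodes[i]][nodes[i + 1]]
            for i in range(len(work) - 1)
        ]
        throughput = pipeline_throughput(compute, transfer)
        if throughput > best[0]:
            best = (throughput, nodes)
    return best


# Illustrative numbers only: two stages, two nodes.
work = [4.0, 8.0]
speed = [2.0, 4.0]
data = [10.0]
bandwidth = [[0.0, 10.0], [10.0, 0.0]]
result = best_single_path_placement(work, speed, data, bandwidth)
```

With these made-up numbers, placing the heavier stage on the faster node balances the two compute times, and the link is not the bottleneck. Exhaustive search is only feasible for very small pipelines; it is used here purely to illustrate the objective being optimized.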