High-throughput distributed data analysis on computing clusters is becoming increasingly important in computational biology. This paper describes a parallel programming approach, and its software implementation using the Message Passing Interface (MPI), for parallelizing a computationally intensive algorithm that identifies cellular contexts. We report a successful implementation on a 1,024-processor Beowulf cluster, used to analyze microarray data consisting of hundreds of thousands of measurements from different datasets. A detailed performance evaluation shows that data analysis that could have taken months on a stand-alone computer was completed in less than a day.