Characterizing biological pathways at the genome scale is one of the most important and challenging tasks of the post-genomic era. To address this challenge, we have developed a computational method that systematically and automatically derives partial biological pathways in the yeast Saccharomyces cerevisiae from high-throughput biological data, including yeast two-hybrid data, protein complexes identified by mass spectrometry, genetic interactions, and microarray gene expression data. The inputs of the method are the upstream starting protein (e.g., a sensor of a signal) and the downstream terminal protein (e.g., a transcription factor that induces genes to respond to the signal); the output is the protein interaction chain connecting the two proteins. The high-throughput data are encoded as an interaction network graph in which each node represents a protein. The weight of an edge between two nodes models the "closeness" of the two proteins in the interaction network; it is defined by a rule-based formula applied to the high-throughput data and then adjusted using protein function classification and subcellular localization information. The in vivo protein interaction cascade is predicted as the shortest path between the two input proteins in this graph, identified with Dijkstra's algorithm. We have also developed a web server for this method (http://compbio.ornl.gov/structure/pathway) for public use. To our knowledge, this is the first automated method for constructing partial biological pathways in general from a suite of high-throughput biological data. This work provides a proof of principle for using computational approaches to discover biological pathways from high-throughput data and biological annotation data.
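The shortest-path step described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the protein names (`SENSOR`, `P1`, `P2`, `P3`, `TF`) are hypothetical placeholders, and the edge weights are made-up numbers standing in for the rule-based "closeness" formula, which the abstract does not specify. The sketch assumes an undirected graph in which a smaller weight means two proteins are closer.

```python
import heapq


def build_graph(edges):
    """Build an undirected weighted graph as a dict of dicts."""
    graph = {}
    for a, b, weight in edges:
        graph.setdefault(a, {})[b] = weight
        graph.setdefault(b, {})[a] = weight
    return graph


def shortest_path(graph, source, target):
    """Dijkstra's algorithm; returns (total_weight, path), or (inf, []) if unreachable."""
    dist = {source: 0.0}   # best known distance to each node
    prev = {}              # predecessor of each node on its best path
    heap = [(0.0, source)]
    visited = set()
    while heap:
        d, node = heapq.heappop(heap)
        if node in visited:
            continue       # stale heap entry
        visited.add(node)
        if node == target:
            break
        for neighbor, weight in graph.get(node, {}).items():
            candidate = d + weight
            if candidate < dist.get(neighbor, float("inf")):
                dist[neighbor] = candidate
                prev[neighbor] = node
                heapq.heappush(heap, (candidate, neighbor))
    if target not in dist:
        return float("inf"), []
    # Walk predecessors back from the target to recover the chain.
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    path.reverse()
    return dist[target], path


# Hypothetical interactions: (protein, protein, illustrative weight).
edges = [
    ("SENSOR", "P1", 1.0),
    ("P1", "P2", 1.0),
    ("P2", "TF", 1.0),
    ("P1", "TF", 3.0),
    ("SENSOR", "P3", 0.5),
    ("P3", "TF", 4.0),
]
graph = build_graph(edges)
weight, path = shortest_path(graph, "SENSOR", "TF")
print(weight, path)
```

In this toy network the predicted chain runs from the upstream protein to the downstream protein through the intermediates with the lowest cumulative weight, even though shorter routes with fewer steps exist. In the method itself, the weights would come from the combined high-throughput evidence and annotation data rather than from hand-picked values.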