Genome-scale screening techniques have identified large amounts of protein-protein interaction data. Although these data are a valuable resource, high-throughput datasets are prone to high false positive rates. We developed a new reliability assessment system for yeast protein-protein interaction datasets that can identify truly interacting protein pairs in noisy data. The system is based on a neural network algorithm and uses three characteristics of interacting proteins: 1) interacting proteins share similar functional categories, 2) interacting proteins must be located in close proximity, at least transiently, and 3) an interacting protein pair is tightly linked with other proteins in the protein interaction network. Statistical evaluation of the neural network classifier by 10-fold cross-validation shows that it performs well, with an average accuracy of 96%. We applied the classifier to a yeast two-hybrid dataset of 5,564 interactions, which it separated into 2,831 true positives and 2,733 false positives.
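The 10-fold cross-validation procedure mentioned above can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes each protein pair is reduced to three numeric scores (one per characteristic), uses a single sigmoid unit as a simplified stand-in for the neural network, and runs on synthetic data. All function names, the feature encoding, and the labeling rule are hypothetical.

```python
import math
import random


def k_fold_indices(n, k, rng):
    """Shuffle 0..n-1 and split into k nearly equal, disjoint folds."""
    idx = list(range(n))
    rng.shuffle(idx)
    return [idx[i::k] for i in range(k)]


def train_logistic(X, y, epochs=200, lr=0.5):
    """Fit one sigmoid unit by batch gradient descent.

    Assumption: a single unit stands in for the paper's neural network,
    whose architecture is not given in the abstract.
    """
    w = [0.0] * len(X[0])
    b = 0.0
    n = len(X)
    for _ in range(epochs):
        grad_w = [0.0] * len(w)
        grad_b = 0.0
        for xi, yi in zip(X, y):
            z = sum(wj * xj for wj, xj in zip(w, xi)) + b
            p = 1.0 / (1.0 + math.exp(-z))
            err = p - yi  # gradient of the log loss with respect to z
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict(model, x):
    """Return 1 (predicted true interaction) or 0 (predicted false positive)."""
    w, b = model
    z = sum(wj * xj for wj, xj in zip(w, x)) + b
    return 1 if z >= 0 else 0


def cross_validate(X, y, k=10, seed=0):
    """Return the mean held-out accuracy over k folds, plus the folds used."""
    rng = random.Random(seed)
    folds = k_fold_indices(len(X), k, rng)
    accuracies = []
    for held_out in folds:
        held = set(held_out)
        train_idx = [i for i in range(len(X)) if i not in held]
        model = train_logistic([X[i] for i in train_idx],
                               [y[i] for i in train_idx])
        correct = sum(predict(model, X[i]) == y[i] for i in held_out)
        accuracies.append(correct / len(held_out))
    return sum(accuracies) / k, folds


def make_synthetic(n=200, seed=1):
    """Hypothetical data: three scores in [0, 1] per pair (functional
    similarity, co-localization, network linkage); label 1 if their mean
    exceeds 0.5. The labeling rule is invented for illustration only."""
    rng = random.Random(seed)
    X = [[rng.random() for _ in range(3)] for _ in range(n)]
    y = [1 if sum(x) / 3 > 0.5 else 0 for x in X]
    return X, y


if __name__ == "__main__":
    X, y = make_synthetic()
    mean_acc, _ = cross_validate(X, y)
    print(f"mean 10-fold accuracy: {mean_acc:.3f}")
```

Each pair is held out exactly once, so the averaged accuracy reflects performance on data the model did not see during training. That is the sense in which a figure like the reported 96% is an average over folds.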