By Topic

Detecting experimental noises in protein-protein interactions with iterative sampling and model-based clustering

Sign In

Cookies must be enabled to login.After enabling cookies , please use refresh or reload or ctrl+f5 on the browser for the login options.

Formats Non-Member Member
$33 $13
Learn how you can qualify for the best price for this item!
Become an IEEE Member or Subscribe to
IEEE Xplore for exclusive pricing!
close button

puzzle piece

IEEE membership options for an individual and IEEE Xplore subscriptions for an organization offer the most affordable access to essential journal articles, conference papers, standards, eBooks, and eLearning courses.

Learn more about:

IEEE membership

IEEE Xplore subscriptions

1 Author(s)
H. Mamitsuka ; Inst. for Chem. Res., Kyoto Univ., Uji, Japan

One of the most important issues in current molecular biology is to build exact networks of protein-protein interactions. Recently developed high-throughput experimental techniques accumulate a vast amount of protein-protein interaction data, but it is well known that data reliability has not reached at a satisfactory level. In this paper we attempt to computationally detect experimental errors or noises presumably contained in the protein-protein interaction data by an iterative sampling method using the learning of a stochastic model as its subroutine. The method repeats two steps of selecting examples that can be regarded as non-noises, and training the component algorithm with the selected examples alternately. Noise candidates are selected as the examples having the smallest average likelihoods computed by previously obtained stochastic models. We empirically evaluated the method with other two methods by using both synthetic and real data sets. We examined the effect of noises and data sizes by using medium- and large-sized synthetic data sets that contain noises added intentionally. The results obtained by the medium-sized synthetic data sets show that the significance level of the performance difference between the method and the two other methods has more pronounced for higher noise ratios. Further experiments show that this experimental finding was also true of a large-scale data set. The performance advantage of the method was further confirmed by the experiments using a real protein-protein interaction data set.

Published in:

Bioinformatics and Bioengineering, 2003. Proceedings. Third IEEE Symposium on

Date of Conference:

10-12 March 2003