An important problem in systems biology is the reconstruction of gene regulatory networks from experimental data and other a priori information. In this paper, an identification algorithm based on linear regression techniques and significance tests is developed for multifactorial perturbation experimental data. The basic ideas behind the algorithm are as follows. The larger the sum of squared residuals, the weaker the direct regulatory interaction. Moreover, the higher the significance level of the linear regression, the greater the probability that the direct regulatory interaction exists. To take both into consideration, the weight corresponding to a possible direct regulation is chosen as their product. Normalization of these weights is also discussed, noting that in a gene regulatory network some genes may be easily regulated by other genes, while regulating others may require more effort. A distinguishing feature of the algorithm is that the power law, an important structural property that most large-scale gene regulatory networks approximately possess, is quantitatively incorporated into the estimation. By constructing loss functions, incorporating the power law, and solving a 0-1 integer programming problem, the genes that directly regulate an arbitrary gene can be estimated. The weight matrix is then further adjusted using these estimated direct regulatory relationships. Computational results on the DREAM4 In Silico Size 100 Multifactorial subchallenge show that the estimation performance of the suggested algorithm can exceed even that of the best-performing team. Furthermore, the high precision of the most reliable predictions implies that the suggested algorithm may be very helpful in guiding the design of biological validation experiments.
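The weighting idea described above can be illustrated with a minimal Python sketch. It is not the paper's formulation, which the abstract does not spell out. The sketch assumes one simple linear regression per gene pair, a goodness-of-fit term 1 - RSS/TSS standing in for "small sum of squared residuals", and a two-sided p-value from the Fisher z normal approximation standing in for the significance test. The function names (`pairwise_weight`, `weight_matrix`, `normalize_columns`) and the column-wise max normalization are hypothetical choices made for illustration. The power-law and 0-1 integer programming steps are omitted.

```python
import math


def pairwise_weight(x, y):
    """Score a candidate regulation x -> y (illustrative assumption only).

    Returns (1 - RSS/TSS) * (1 - p), so a small residual sum of squares and
    a highly significant fit both push the weight towards 1.
    """
    n = len(x)
    if n < 4:  # the Fisher z approximation below needs n > 3
        return 0.0
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)  # total sum of squares of y
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    if sxx == 0.0 or syy == 0.0:  # a constant profile carries no information
        return 0.0
    # Residual sum of squares of the least-squares line y ~ a + b*x.
    rss = max(0.0, syy - sxy * sxy / sxx)
    fit = 1.0 - rss / syy
    # Approximate two-sided p-value of the correlation via Fisher's z.
    r = sxy / math.sqrt(sxx * syy)
    r = max(-0.999999, min(0.999999, r))  # keep atanh finite
    z = math.atanh(r) * math.sqrt(n - 3)
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return fit * (1.0 - p_value)


def weight_matrix(data):
    """data[k][i] is the expression of gene i in experiment k.

    Returns w with w[i][j] scoring the candidate edge gene i -> gene j.
    """
    genes = list(zip(*data))  # one expression profile per gene
    g = len(genes)
    w = [[0.0] * g for _ in range(g)]
    for i in range(g):
        for j in range(g):
            if i != j:  # self-regulation is not scored here
                w[i][j] = pairwise_weight(genes[i], genes[j])
    return w


def normalize_columns(w):
    """Rescale each target gene's column by its largest weight (assumed scheme)."""
    g = len(w)
    out = [row[:] for row in w]
    for j in range(g):
        col_max = max(w[i][j] for i in range(g))
        if col_max > 0.0:
            for i in range(g):
                out[i][j] = w[i][j] / col_max
    return out
```

Because this pairwise score is symmetric in the two genes, it cannot orient an edge by itself. Only the per-target normalization makes `w[i][j]` and `w[j][i]` differ in this sketch.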