This paper develops theoretical bounds on the number of experiments required to infer which genes are active in a particular biological process. The standard approach is to perform many experiments, each with a single gene suppressed or knocked down. However, certain effects are not revealed by single-gene knockouts and are observed only when two or more genes are suppressed simultaneously. Here, we propose a framework for identifying such interactions without resorting to an exhaustive pairwise search. We exploit the inherent sparsity of the problem, which stems from the fact that very few gene pairs are likely to be active. We model the biological process by a multilinear function with unknown coefficients and develop a compressed sensing framework for inferring the coefficients. Our main result is that if at most S genes or gene pairs are active out of N total, then approximately S^2 log N measurements suffice to identify the significant active components.
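
To make the model and the scale of the savings concrete, the following is a minimal illustrative sketch, not the paper's implementation. It assumes a second-order multilinear response in which x[i] = 1 means gene i is suppressed and 0 means it is not; this encoding, the function names, the coefficient values, the natural logarithm, the unit constant in the measurement budget, and the reading of N as the number of genes are all assumptions made for illustration. The abstract states only that approximately S^2 log N measurements suffice.

```python
import math


def multilinear_response(x, singles, pairs, baseline=0.0):
    """Evaluate baseline + sum_i c_i*x[i] + sum_{i<j} c_ij*x[i]*x[j].

    x       : list of 0/1 flags, 1 = gene suppressed (assumed encoding)
    singles : dict {i: c_i} of single-gene coefficients (hypothetical values)
    pairs   : dict {(i, j): c_ij} of pairwise coefficients (hypothetical values)
    """
    y = baseline
    for i, c in singles.items():
        y += c * x[i]
    for (i, j), c in pairs.items():
        y += c * x[i] * x[j]
    return y


def exhaustive_experiments(n_genes):
    """Experiments needed to test every single gene and every gene pair."""
    return n_genes + n_genes * (n_genes - 1) // 2


def sparse_budget(s_active, n_genes):
    """Order-of-magnitude budget S^2 * ln(N); the constant 1 is an assumption."""
    return math.ceil(s_active ** 2 * math.log(n_genes))


# Hypothetical 3-gene process: gene 0 has a single-gene effect, and genes 0
# and 1 interact only when both are suppressed.
singles = {0: 2.0}
pairs = {(0, 1): -1.5}

only_gene0 = multilinear_response([1, 0, 0], singles, pairs)
only_gene1 = multilinear_response([0, 1, 0], singles, pairs)
both = multilinear_response([1, 1, 0], singles, pairs)

# Compare the exhaustive search with the sparse budget for N = 1000, S = 5.
n_exhaustive = exhaustive_experiments(1000)
n_sparse = sparse_budget(5, 1000)
```

In this toy setting the double knockout response differs from the sum of the two single knockout responses, which is the kind of effect that single-gene experiments cannot reveal. Under the stated assumptions, the exhaustive search over 1000 genes needs several orders of magnitude more experiments than the sparse budget for 5 active components.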