Skip to Main Content
Computational protein design can be formulated as an optimization problem, where the objective is to identify the sequence of amino acids that minimizes the energy of a given protein structure. In this paper, we propose a novel search-based approach that utilizes a Boolean function to encode the solution space where the function's onset represents the sequences considered during the search. We first present a dead-end-elimination (DEE) based method for the initial setup of the Boolean function and then describe a branch-and-bound algorithm that employs the search and deduction engine of a modern Boolean Satisfiability (SAT) solver. Its fast implication processing and conflict-based learning provide an efficient framework for the overall algorithm. Our results indicate that the presented approach can efficiently find the guaranteed optimum solution for protein core design problems. Furthermore, since our method is complete and symbolic, it can find all solutions that are within an Â¿-distance from the global minimum. This capability allows further analysis, such as identifying common sequence patterns of close-to-optimum solutions. Lastly, the SAT-based encoding of the search space provides a flexible mechanism to take complex design constraints into account, such as enforcing dependencies for amino acid choices at different positions or optimizing a single amino acid sequence to be simultaneously consistent with multiple input structures.