Research Article: Computational Design of a DNA- and Fc-Binding Fusion Protein

Date Published: September 14, 2011

Publisher: Hindawi Publishing Corporation

Author(s): Jonas Winkler, Giuliano Armano, J. Nikolaj Dybowski, Oliver Kuhn, Filippo Ledda, Dominik Heider.


Computational design of novel proteins with well-defined functions is an ongoing topic in computational biology. In this work, we generated and optimized a new synthetic fusion protein using an evolutionary approach. The optimization was guided by directed evolution based on hydrophobicity scores, molecular weight, and secondary structure predictions. Several methods were used to refine the models built from the resulting sequences. We have successfully combined two unrelated naturally occurring binding sites, the immunoglobulin Fc-binding site of the Z domain and the DNA-binding motif of MyoD bHLH, into a novel stable protein.

Partial Text

Protein design methods use trial and error or more sophisticated approaches such as directed evolution or inverse folding to generate novel scaffolds or to find novel protein sequences that fold into a defined scaffold, respectively. Given the intimate relationship between a protein’s structure and function, one way to design proteins with targeted properties is to start from a desired structure and find sequences able to fold into it, imposing additional constraints in the process [1]. On the one hand, it is known that, in general, similar sequences fold into similar structures [2]; on the other hand, there are many known cases of nearly identical structures that share no sequence similarity at all [3]. However, the aim of computational design methods is not to find all possible solutions, but to find at least one solution that fits the required properties. One proposed method is multi-objective optimization, in which protein stability and catalytic activity are optimized simultaneously [4, 5].

We have successfully combined two independent binding sites into a given protein scaffold (PDB: 1LP1) using a genetic algorithm for sequence optimization. The fused sequence of the Z domain and MyoD was used as the start sequence (see Figure 2). Helix 3 of the Z domain (residues 42 to 57) was replaced by a DNA-binding helix of MyoD (residues 110 to 125). Amino acids essential for binding of Fc (5, 9–11, 13, 14, 28, 31) and DNA (110, 111, 114, 115, 117–119, 121) were conserved, while the remaining positions were mutated during the optimization. The initial population consisted of randomly mutated seed sequences.
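The construction of the initial population can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the mutation rate, and the population size are hypothetical, and the mapping of MyoD residue 110 onto fused position 42 is an assumption derived from the statement that residues 42 to 57 were replaced by residues 110 to 125. The seed sequence is passed in by the caller; no real sequence is reproduced here.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Fc-binding positions (Z-domain numbering, 1-based), as listed in the text.
FC_POSITIONS = {5, 9, 10, 11, 13, 14, 28, 31}
# DNA-binding positions (MyoD numbering), as listed in the text.
DNA_POSITIONS_MYOD = {110, 111, 114, 115, 117, 118, 119, 121}

# Assumption: MyoD residues 110-125 occupy fused positions 42-57,
# so MyoD residue r sits at fused position r - 68.
MYOD_OFFSET = 110 - 42


def conserved_positions():
    """Return all 1-based positions of the fused sequence that must not mutate."""
    return FC_POSITIONS | {r - MYOD_OFFSET for r in DNA_POSITIONS_MYOD}


def mutate(seed, conserved, rate, rng):
    """Randomly substitute residues at non-conserved positions."""
    out = []
    for pos, residue in enumerate(seed, start=1):
        if pos not in conserved and rng.random() < rate:
            out.append(rng.choice(AMINO_ACIDS))
        else:
            out.append(residue)
    return "".join(out)


def initial_population(seed, size, rate=0.1, rng=None):
    """Build a population of randomly mutated copies of the seed sequence."""
    rng = rng or random.Random()
    conserved = conserved_positions()
    return [mutate(seed, conserved, rate, rng) for _ in range(size)]
```

Calling `initial_population(seed, 100)` with the fused start sequence would yield 100 variants that all share the seed's residues at the 16 conserved positions.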

We have applied multi-objective optimization guided by directed evolution to integrate the MyoD DNA-binding motif into the Z domain while conserving the scaffold's structure. Simulations showed that optimizing the sequences based on hydrophobicity, molecular weight, and secondary structure predictions improved structural stability while maintaining protein functionality. The use of simple fitness functions reduces the optimization complexity and thus allows more individuals to be optimized over more generations, resulting in a better sampling of the sequence space.
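To make the idea of simple, sequence-based fitness functions concrete, the following Python sketch scores a candidate against a reference sequence on two of the three criteria named above. Several choices are assumptions rather than details from the article: the Kyte-Doolittle hydropathy scale, the use of average residue masses, the deviation-from-reference form of both objectives, and the Pareto dominance comparison. The secondary structure term is omitted because it would require an external predictor.

```python
# Kyte-Doolittle hydropathy values (assumed scale; the text names none).
HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

# Average residue masses in daltons (amino acid minus one water).
RESIDUE_MASS = {
    "G": 57.05, "A": 71.08, "S": 87.08, "P": 97.12, "V": 99.13,
    "T": 101.10, "C": 103.14, "L": 113.16, "I": 113.16, "N": 114.10,
    "D": 115.09, "Q": 128.13, "K": 128.17, "E": 129.12, "M": 131.19,
    "H": 137.14, "F": 147.18, "R": 156.19, "Y": 163.18, "W": 186.21,
}
WATER_MASS = 18.02


def molecular_weight(seq):
    """Approximate molecular weight of a peptide in daltons."""
    return sum(RESIDUE_MASS[aa] for aa in seq) + WATER_MASS


def objectives(candidate, reference):
    """Return (hydropathy deviation, weight deviation); lower is better for both."""
    if len(candidate) != len(reference):
        raise ValueError("sequences must have equal length")
    hydro_dev = sum(
        abs(HYDROPATHY[c] - HYDROPATHY[r]) for c, r in zip(candidate, reference)
    ) / len(reference)
    weight_dev = abs(molecular_weight(candidate) - molecular_weight(reference))
    return hydro_dev, weight_dev


def dominates(a, b):
    """True if objective tuple a is no worse than b everywhere and better somewhere."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))
```

Because both objectives are computed directly from the sequence, each evaluation is cheap, which is what makes large populations and many generations affordable.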



