Date Published: February 01, 2017
Publisher: International Union of Crystallography
Author(s): Ilka Müller.
This article aims to guide efforts in protein–ligand complex crystal structure generation, with special consideration of protein construct design, and summarizes different approaches to co-crystallization and crystal soaking. Common problems and pitfalls are highlighted.
In recent years, experimental information on the vast number of protein-crystallization experiments carried out by different structural genomics initiatives (Savitsky et al., 2010; Ng et al., 2016) and also within industrial settings (Öster et al., 2015) has become publicly available. Systematic analysis of this large knowledge base has helped to generate a catalogue of strategies for the crystallization of proteins, especially as the data include negative results, i.e. approaches that did not lead to the desired outcome. In addition, improvements to the molecular-cloning toolbox, together with automation and miniaturization of the crystallization setup, make it possible to parallelize experiments, thus shortening the timeline from choosing the target to solving the first crystal structure. This is particularly important if the structure is not the end point, but rather the starting point for the generation of multiple or even hundreds of ligand-bound structures to guide drug discovery.
Generation of protein–ligand complex crystals can be divided into five main steps: construct design, protein expression, protein purification, protein–ligand complex crystal formation and, finally, a sanity check of the complex structure. It is important to use as much prior information on the protein target as possible alongside general principles, and to keep in mind that not all factors leading to protein crystals are understood. Thus, some problems may only be solved by trial and error, with the design cycle shown in Fig. 1 repeated several times. With prior structural information available, it might be possible to skip the first steps and enter the cycle at the complex crystal formation step. Experimental considerations for the different entry points are given in Table 1.
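The idea of entering the cycle at different points can be made concrete with a small sketch. It is purely illustrative: the function name `plan`, its parameters and the choice to restart every later pass at construct design are assumptions made for this example, not part of the workflow shown in Fig. 1.

```python
# Steps of the design cycle as named in the text, in order.
STEPS = [
    "construct design",
    "protein expression",
    "protein purification",
    "complex crystal formation",
    "sanity check of the complex structure",
]


def plan(entry="construct design", rounds=1):
    """Return the ordered list of steps for `rounds` passes of the cycle.

    The first pass starts at `entry`. Any further pass restarts at
    construct design (an assumption made for this sketch only).
    """
    if entry not in STEPS:
        raise ValueError(f"unknown step: {entry!r}")
    if rounds < 1:
        raise ValueError("rounds must be at least 1")
    first_pass = STEPS[STEPS.index(entry):]
    return first_pass + STEPS * (rounds - 1)


if __name__ == "__main__":
    # With prior structural information, start at complex crystal formation.
    for step in plan(entry="complex crystal formation"):
        print(step)
```

Calling `plan()` with no arguments lists all five steps, whereas a project with an established crystal system would call it with a later entry point and so skip the first steps.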
To study protein–ligand interactions, it is often not necessary to investigate the ligand in the context of the full-length protein, especially if the protein comprises multiple domains (Derewenda, 2010). In the case of the somatic mutation underlying chronic myelogenous leukaemia (CML), parts of the c-Abl gene are fused to the breakpoint cluster region (Bcr), resulting in the approximately 200 kDa BCR-Abl oncoprotein. This fusion protein is a constitutively active form of the tightly regulated tyrosine kinase c-Abl (1130 amino-acid residues), which results in uncontrolled cell growth and survival in leukaemia. The structure of the kinase domain of Abl (fewer than 300 residues) was used by Schindler et al. (2000) to elucidate the structural mechanism of BCR-Abl inhibition by STI-571 (Gleevec™/imatinib), and it was found that the drug binds to and stabilizes the inactive form of the kinase (PDB entry 1fpu). The use of Abl kinase-domain constructs further helped in understanding imatinib-resistant mutants in CML patients and assisted the development of inhibitors that overcome this drug resistance (Levinson et al., 2006; Cowan-Jacob et al., 2007).
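Working with an isolated domain rather than the full-length protein usually means deciding where a construct should start and end. The following is a minimal sketch, assuming that a domain boundary estimate is already available and that a small panel of constructs with shifted termini is wanted. The function name `truncation_constructs`, the toy sequence and the residue numbers are hypothetical and do not correspond to Abl or to any published construct.

```python
def truncation_constructs(sequence, start, end, n_shifts=(0,), c_shifts=(0,)):
    """Enumerate constructs around an estimated domain boundary.

    `start` and `end` are 1-based, inclusive residue numbers.
    Each value in `n_shifts` is added to `start` and each value in
    `c_shifts` is added to `end`, so negative N-terminal shifts and
    positive C-terminal shifts extend the construct.
    Boundaries are clipped to the sequence; duplicates collapse.
    Returns a dict mapping (first, last) to the construct sequence.
    """
    constructs = {}
    for dn in n_shifts:
        for dc in c_shifts:
            first = max(1, start + dn)
            last = min(len(sequence), end + dc)
            if first > last:
                continue  # shifts crossed over; nothing left to express
            constructs[(first, last)] = sequence[first - 1:last]
    return constructs


if __name__ == "__main__":
    toy = "ACDEFGHIKLMNPQRSTVWY"  # hypothetical 20-residue sequence
    panel = truncation_constructs(toy, 5, 15, n_shifts=(-2, 0, 2), c_shifts=(-2, 0, 2))
    for (first, last), seq in sorted(panel.items()):
        print(f"{first}-{last}: {seq}")
```

Three shifts at each terminus give up to nine constructs, which is the kind of small parallel panel that automated cloning makes practical. Which shifts are sensible for a real target depends on sequence alignments, secondary-structure predictions and any existing structures.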
Many different expression systems are available for structural studies. The vast majority (75%) of all proteins with structures reported in the PDB were expressed using the prokaryotic expression host E. coli, and the proportion is the same if only human proteins are considered. E. coli is a relatively easy-to-use expression system that is time- and resource-efficient, and suitable equipment is readily available in standard laboratories. A number of E. coli expression strains with enhanced functionality are available to help overcome issues such as codon bias, proteolysis, toxicity of the expressed gene or disulfide-bridge formation. Overviews of protein expression in E. coli for structural studies are given by Gräslund et al. (2011) and Papaneophytou & Kontopidis (2014).
Ligand co-crystallization typically requires more resources than soaking, in terms of both time and material. Miniaturization and automation of the crystallization setup help to minimize protein consumption and to achieve reasonable throughput. In addition, crystal seeding techniques are a powerful tool to speed up the optimization process and to obtain good-quality crystals more consistently.
Crystallization of protein–ligand complexes requires careful consideration of a variety of parameters, from protein construct design, choice of expression system and optimization of purification conditions to fine-tuning of (co-)crystallization and soaking. The review of recent analyses of large-scale protein structure-determination efforts has identified a number of factors that are important for a systematic approach to establishing a robust crystal system. Prioritization of the different aspects of protein construct design is thought to be key to developing a crystal system early, so that structural information can have the greatest impact on a project rather than merely describing structural aspects in hindsight. The examples discussed should make the variables and issues in protein–ligand complex crystal formation more tangible, and should also serve as an inspiration for anyone stuck in the process. Nevertheless, trial and error remains a key aspect of protein crystallography at every stage. Therefore, as many (well chosen) conditions as possible should be tested, but no more than the laboratory is set up to deal with and the experimenter is comfortable handling at any one time.