Date Published: October 01, 2019
Publisher: International Union of Crystallography
Author(s): Dorothee Liebschner, Pavel V. Afonine, Matthew L. Baker, Gábor Bunkóczi, Vincent B. Chen, Tristan I. Croll, Bradley Hintze, Li-Wei Hung, Swati Jain, Airlie J. McCoy, Nigel W. Moriarty, Robert D. Oeffner, Billy K. Poon, Michael G. Prisant, Randy J. Read, Jane S. Richardson, David C. Richardson, Massimo D. Sammito, Oleg V. Sobolev, Duncan H. Stockwell, Thomas C. Terwilliger, Alexandre G. Urzhumtsev, Lizbeth L. Videau, Christopher J. Williams, Paul D. Adams.
Recent developments in the Phenix software package are described in the context of macromolecular structure determination using X-rays, neutrons and electrons.
Macromolecules are essential for biological processes within organisms, engendering the need to understand their behavior to explain the fundamentals of life. The function of macromolecules correlates with their three-dimensional structure, i.e. how the atoms of the molecule are arranged in space and how they move over time. Two major methods to obtain macromolecular structures are diffraction (usually using X-rays, but also neutrons or electrons) and electron cryo-microscopy (cryo-EM) (Fig. 1), both of which are handled by Phenix. The following subsections describe some concepts underpinning each method for the benefit of readers who are not experts in each of these areas.
Fig. 4 shows the steps of the structure-solution process for X-ray/neutron crystallography and cryo-EM. Owing to the different nature of the interactions, there are nuances for each structure-determination method (Figs. 5 and 6), but the overall procedure to obtain a molecular model is similar. The first step consists of analyzing the derived experimental data to detect any anomalies that can complicate or even prevent structure determination (Section 3). The second and third steps are to obtain the best possible map (Section 4) so that a model can be built (Section 5). The fourth step focuses on iteratively improving the model by cycles of local rebuilding, refinement and validation (Sections 6 and 7). The subsequent sections elaborate on the steps and explain similarities and differences for X-ray/neutron crystallographic and cryo-EM data. The Phenix tools that perform the corresponding steps are described.
Data quality should be analyzed carefully because unusual features can thwart structure solution. If the data have anomalies, it is not guaranteed that they can be addressed at later stages, in which case it might be necessary to perform new experiments or re-analyze the raw data.
To determine the structure of a macromolecule, a model must be built that fits the experimental data. Cryo-EM maps contain phase information, but they often have low resolution (Fig. 7), which makes interpretation difficult. For both crystallography and cryo-EM, but especially the latter, the molecules can be very large, so automated procedures are preferred over manual interpretation wherever feasible.
Models from the building or phasing steps are approximate and need to be improved; for example, side chains may not fit the density, and water molecules and ligands are likely to be missing. Refinement is the process of improving the parameters of a model until the best fit is achieved between experimental and model-calculated data. The parameterization of an atomic model mainly depends on the data quality and the current stage of refinement. Generally, the parameterization is chosen such that a simpler model is used at the beginning (such as rigid body) and a more complex model is used towards the end. The target function guides the refinement by linking the model parameters to the experimental data and by scoring model-versus-data fit. For reciprocal-space refinement, the target function (T) is expressed through structure factors (or diffraction intensities). For real-space refinement, the target is formulated in terms of a map. In both cases, the process alternates automated refinement with validation and either manual or automated model corrections.
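To make the idea of a target function concrete, the following is a minimal sketch in Python. It is not the Phenix implementation and does not use the Phenix API. The function names (`optimal_scale`, `least_squares_target`, `r_factor`, `map_correlation`) are hypothetical, and a plain least-squares amplitude target is assumed here purely for illustration; production refinement programs typically use more elaborate targets and add restraint terms. The first three functions score model-versus-data fit in reciprocal space from observed and model-calculated structure-factor amplitudes; the last scores fit in real space as a correlation between two maps sampled on the same grid.

```python
import math
from typing import Sequence


def optimal_scale(f_obs: Sequence[float], f_calc: Sequence[float]) -> float:
    """Scale k that minimizes sum((f_obs - k * f_calc)**2) over all reflections."""
    numerator = sum(o * c for o, c in zip(f_obs, f_calc))
    denominator = sum(c * c for c in f_calc)
    return numerator / denominator


def least_squares_target(f_obs: Sequence[float], f_calc: Sequence[float]) -> float:
    """Illustrative reciprocal-space target: sum of squared amplitude residuals."""
    k = optimal_scale(f_obs, f_calc)
    return sum((o - k * c) ** 2 for o, c in zip(f_obs, f_calc))


def r_factor(f_obs: Sequence[float], f_calc: Sequence[float]) -> float:
    """Crystallographic R factor: sum|Fobs - k*Fcalc| / sum(Fobs)."""
    k = optimal_scale(f_obs, f_calc)
    residual = sum(abs(o - k * c) for o, c in zip(f_obs, f_calc))
    return residual / sum(f_obs)


def map_correlation(rho_obs: Sequence[float], rho_calc: Sequence[float]) -> float:
    """Illustrative real-space score: Pearson correlation of two map samples."""
    n = len(rho_obs)
    mean_obs = sum(rho_obs) / n
    mean_calc = sum(rho_calc) / n
    covariance = sum(
        (o - mean_obs) * (c - mean_calc) for o, c in zip(rho_obs, rho_calc)
    )
    var_obs = sum((o - mean_obs) ** 2 for o in rho_obs)
    var_calc = sum((c - mean_calc) ** 2 for c in rho_calc)
    return covariance / math.sqrt(var_obs * var_calc)


if __name__ == "__main__":
    # Toy amplitudes (made-up numbers, not experimental data).
    observed = [10.0, 20.0, 30.0]
    calculated = [5.0, 10.0, 16.0]
    print("target:", least_squares_target(observed, calculated))
    print("R factor:", r_factor(observed, calculated))
    print("map CC:", map_correlation([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]))
```

In this sketch, a refinement step would adjust the model parameters so that the reciprocal-space target decreases or the map correlation increases. The overall scale is fitted first so that the score reflects how well the model agrees with the data rather than an arbitrary difference in units.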
Validation identifies the reliable parts of a macromolecular model and highlights its problems, and should guide corrections throughout the structure-solution process. In particular, the refinement stage benefits from validation: the process consists of cycles of validation, rebuilding (either manual or automated) and automated refinement, which are repeated until a satisfactory model is obtained. As problems are corrected, the model quality, refinement behavior and even the density map quality (for X-ray and neutron) all improve.
The Phenix software for macromolecular structure determination handles data from three experimental methods: cryo-EM, X-ray diffraction and neutron diffraction. All steps in the structure-solution process are addressed by programs that are tailored for the type of experimental data, but share algorithms where appropriate. Procedures are automated to minimize repetitive and time-consuming manual tasks as far as feasible. For the future, the improvement of automated model building, refinement and validation at low resolution (worse than 3 Å) remains a priority; another area of development to help the structural biology community is the automated identification, fitting and refinement of ligands, ions and water.