Research Article: Overview of refinement procedures within REFMAC5: utilizing data from different sources

Date Published: March 01, 2018

Publisher: International Union of Crystallography

Author(s): Oleg Kovalevskiy, Robert A. Nicholls, Fei Long, Azzurra Carlon, Garib N. Murshudov.


Here, a macromolecule-centred approach to three-dimensional structure determination as implemented in REFMAC5 is considered. The use of restraints to transfer chemical and structural information during macromolecular refinement, and how different sources of information can be combined in order to achieve models that are more consistent with data derived from a variety of experimental techniques, including macromolecular crystallography, cryo-EM and NMR spectroscopy, are discussed.

Partial Text

Attempting to understand the three-dimensional structures of macromolecules is akin to opening a ‘black box’: structural models can provide some ideas, or at least hypotheses to test, regarding the function of a molecule of interest. The Protein Data Bank (PDB; Berman et al., 2002) is the main archive of experimentally determined structural models of macromolecules. The PDB contains models elucidated using three different methods: nuclear magnetic resonance (NMR), X-ray crystallography and electron cryo-microscopy (cryo-EM). Although ultimately all three methods result in an ‘atomic model’, experimental data obtained using these different methods are subjected to different procedures using different tools, reflecting differences in the physical processes underlying each method. Each of these experimental methods produces imperfect data that are insufficient to elucidate a three-dimensional structure with complete certainty. For instance, in an NMR experiment one measures only short-range interactions (nuclear Overhauser effects; NOEs; Ferella et al., 2012), distances between the observed nuclei and a paramagnetic metal (pseudocontact shifts; PCSs) or the orientations of interatomic vectors connecting two coupled nuclei with respect to an external reference frame (residual dipolar couplings; RDCs) (Koehler & Meiler, 2011; Bertini et al., 2017). In the absence of prior structural knowledge, the measured distances and angles may be hard to interpret. In X-ray crystallography, only the intensities of structure factors are measured, so one needs to solve the ‘phase problem’ either by using prior knowledge about a related structure, i.e. molecular replacement (Rossmann, 1972; McCoy et al., 2007; Vagin & Teplyakov, 2010), or by carrying out additional experiments and determining phases for a substructure (SAD/MAD, SIR etc.; Hendrickson, 1991; Perutz, 1956; Sheldrick, 2015; Skubák & Pannu, 2013).
Often, macromolecules, especially large complexes of exceptional biological interest, fail to form high-quality crystals, resulting in poor diffraction. In such cases, the resulting electron-density maps might be hard to interpret; additional sources of structural information could potentially help to interpret such poor-quality maps. Cryo-EM electrostatic potential maps are the result of averaging thousands of independent but extremely noisy observations, so the local quality (i.e. local resolution) of the map varies greatly within the map (Kucukelbir et al., 2014) and between several reconstructions. Again, the use of additional prior knowledge can help to interpret the low-resolution parts of such maps that exhibit varying signal-to-noise ratios.

It is necessary to analyse the information contained in the data, and to use enough prior knowledge to complement this information. Both crystallographic and cryo-EM techniques produce data pertaining to long-range interactions between atoms in the molecule; shorter range interactions become available as the resolution of the data increases (Fig. 1). There are several techniques to define the optical resolution of data (Vaguine et al., 1999; Urzhumtseva et al., 2013). Since the quoted resolution of experimental data is determined in reciprocal space, it is interesting to estimate the minimal distance between two points in real space that can be resolved using diffraction data of a certain resolution. It is easier to estimate the minimal theoretically resolvable distance between two points if we assume the absence of noise and zero B factors, i.e. under the assumption that the data are ideal. Here, we consider the Fourier transformation of a sphere with radius smax, which can be expressed in terms of j1, the spherical Bessel function of order 1, to demonstrate how a single peak is broadened when only data extending to limited resolutions are available (see, for example, Pinsky, 2001).
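The peak broadening described above can be explored numerically. The following is a minimal sketch, not taken from the article: it assumes the convention |s| = 1/d (so smax = 1/dmin), ideal data (no noise, zero B factors) and a profile normalised to 1 at the origin. Under these assumptions the Fourier transform of a sphere of radius smax, divided by the sphere volume, is 3 j1(x)/x with x = 2π smax r. The function names are hypothetical.

```python
import math


def spherical_j1(x):
    """Spherical Bessel function of order 1: j1(x) = (sin x - x cos x) / x**2."""
    if abs(x) < 1e-4:
        # Leading terms of the power series; avoids cancellation near zero.
        return x / 3.0 - x ** 3 / 30.0
    return (math.sin(x) - x * math.cos(x)) / (x * x)


def truncated_peak(r, s_max):
    """Normalised profile of a point peak when only data with |s| <= s_max exist.

    Assumption: s = 1/d, ideal data, profile scaled so that the value at r = 0 is 1.
    """
    x = 2.0 * math.pi * s_max * r
    if abs(x) < 1e-4:
        return 1.0 - x * x / 10.0
    return 3.0 * spherical_j1(x) / x


def first_zero(s_max):
    """Distance r at which the truncated peak first falls to zero (bisection)."""
    # sin x - x cos x changes sign between x = pi and x = 3*pi/2.
    lo = math.pi / (2.0 * math.pi * s_max)
    hi = 1.5 * math.pi / (2.0 * math.pi * s_max)
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if truncated_peak(mid, s_max) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    for d_min in (1.0, 2.0, 3.0, 4.0):
        r0 = first_zero(1.0 / d_min)
        print(f"d_min = {d_min:.1f} A -> first zero of the peak at r = {r0:.3f} A")
```

With these assumptions the first zero lies at about 0.715 dmin (about 1.43 Å for 2 Å data), because the first positive zero of j1 is at x ≈ 4.493. Real data contain noise and non-zero B factors, so peaks are broader in practice than this idealised profile suggests.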

As opposed to X-ray crystallography, NMR spectroscopy is limited in terms of the amount of data that can be obtained. Typical NMR experiments deliver torsion-angle restraints (from chemical shifts) and short-range distances (<10 Å) that are obtained by dipolar cross-relaxation (nuclear Overhauser effect; NOE; Ferella et al., 2012). It is a computationally challenging task to deconvolute a set of distances and angles into a three-dimensional structure of a macromolecule or even of a complex. Thus, it is a rather long and inefficient task to obtain a high-resolution structure by NMR spectroscopy. However, NMR-based restraints measured in solution can make a valuable contribution to the structural refinement of macromolecules against diffraction data. Two kinds of restraints are particularly useful in this sense: PCSs and RDCs (Fig. 3). The former can be measured to cover distances spanning the whole macromolecule, whereas the latter yield highly accurate angle measurements. These restraints have thus played a fundamental role in the development of structure-determination/refinement strategies by solution NMR spectroscopy, and their use has become increasingly popular in recent years (Koehler & Meiler, 2011). PCSs and RDCs owe their popularity to the fact that, in contrast to other classical NMR parameters, these restraints allow the extraction of distance and angular information relative to an external reference frame, making it possible to obtain information about the overall molecular system. The use of prior knowledge in refinement aids the extraction of biologically relevant information from noisy and limited experimental data. Various types of restraints allow the injection of additional information obtained from various sources, ranging from high-resolution structures of related proteins to different experimental observations such as NMR, into the refinement process.
However, as experimental data quality degrades, and thus the effective number of observations decreases, it is much easier to achieve good agreement with the prior knowledge, simply because limited data contain negligible information about short-range interactions. Note that good agreement with prior knowledge does not necessarily mean that the derived model corresponds well to the true structure; rather, it only means that the model agrees with the prior knowledge (i.e. that the model is chemically/structurally sensible). When considering diffraction data at low resolution, all indicators, as well as the predictive power of the model, must be checked. In other words, when maps are displayed/used as evidence, the part of the atomic model in question (corresponding to the relevant map region) should not be used during the map-calculation procedure; OMIT maps or similar must be used.
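The point that weak data make agreement with prior knowledge easy to achieve can be illustrated with a deliberately simple toy calculation. This sketch is an assumption made for illustration only and is not the REFMAC5 target function: a single parameter (for example a bond length) is fitted by minimising a data term plus a restraint term, both taken as Gaussian, so the optimum is the precision-weighted mean of the 'observed' and 'ideal' values. All names and numbers are hypothetical.

```python
def restrained_estimate(x_obs, sigma_data, x_ideal, sigma_prior):
    """Minimiser of (x - x_obs)^2 / sigma_data^2 + (x - x_ideal)^2 / sigma_prior^2.

    Toy model only: one parameter, one observation, one restraint.
    """
    w_data = 1.0 / sigma_data ** 2    # weight (precision) of the experimental term
    w_prior = 1.0 / sigma_prior ** 2  # weight (precision) of the restraint term
    return (w_data * x_obs + w_prior * x_ideal) / (w_data + w_prior)


if __name__ == "__main__":
    x_ideal, sigma_prior = 1.53, 0.02  # hypothetical restraint target and sigma (A)
    x_obs = 1.60                       # hypothetical value favoured by the data (A)
    for sigma_data in (0.01, 0.05, 0.5):
        x = restrained_estimate(x_obs, sigma_data, x_ideal, sigma_prior)
        print(f"sigma_data = {sigma_data:4.2f} A -> refined value = {x:.3f} A")
```

As the uncertainty of the data term grows, the refined value moves to the restraint target regardless of what the data favour. The resulting good agreement with the prior says nothing about agreement with the true structure, which is why independent checks such as OMIT maps are needed.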

