Date Published: October 01, 2016
Publisher: International Union of Crystallography
Author(s): Oleg Kovalevskiy, Robert A. Nicholls, Garib N. Murshudov
An automated pipeline for low-resolution structure refinement (LORESTR) has been developed to assist in the hassle-free refinement of difficult cases. The pipeline automates the selection of high-resolution homologues for external restraint generation and optimizes the parameters for ProSMART and REFMAC5, improving R factors and geometry statistics in 94% of the test cases.
Poor diffraction quality from macromolecular crystals is a persistent problem: various types of short-range and long-range disorder induced by impurities, imperfect crystal-growth conditions and the natural conformational mobility of macromolecules result in the weakening of high-resolution observations, anisotropic diffraction and other problems (Chernov, 2003; Shaikevitch & Kam, 1981; Caylor et al., 1999). Poor crystal diffraction results in the corresponding data sets having a low information content. Whilst substantial effort has been directed towards improving crystal diffraction quality (Heras & Martin, 2005), quite often it is technically impossible to achieve high-quality diffraction, especially for large multi-subunit complexes. However, the primary aim of X-ray structure analysis is not to obtain the perfect crystal, but rather to determine atomic models of proteins, nucleic acids or complexes of interest that are of sufficient quality to be of use in addressing questions of biological relevance. Low-resolution crystal data can provide exactly such information. Hence, there is a high demand for the development of techniques to allow crystallographers to build reliable models using incomplete, limited and noisy diffraction data.
REFMAC5 v.5.8.0107 (Murshudov et al., 1997, 2011) and ProSMART v.0.843 (Nicholls et al., 2014) were used in all tests. Of the target low-resolution structures in the test set, 16% were detected as twinned by REFMAC5; these were treated as twinned (by specifying the ‘TWIN’ REFMAC5 keyword) in all tests. Details of the exact REFMAC5 and ProSMART parameters used can be found in Appendices A and B. Refinement quality was assessed using Rfree (Brünger, 1992) and the MolProbity score percentile (Chen et al., 2010). MolProbity assessment was performed using the MolProbity implementation from the PHENIX suite v.1.10 (Adams et al., 2010).
Using a test set of 104 cases, we attempted to identify the best parameters and combinations of protocols for improving refinement at low resolution using REFMAC5 with the assistance of external restraints generated by ProSMART. Every crystallographic case is different. For instance, structural models of high-resolution close homologues may be available for some target proteins, but not for others. This makes it almost impossible to design one universal refinement protocol that would be optimal for all possible cases. Also, we could not find a strong correlation between refinement protocol performance and any of the obvious main parameters, such as the resolution of the data, the geometric quality of the model, the values of the R factors etc. Consequently, we concluded that the most appropriate strategy, which strikes a good balance between reliability and efficiency, is to first identify the minimal set of top-performing protocols and then try using all of these protocols in order to find the one which performs best. Ideally, the minimal set of protocols should be such that one of the protocols produces optimal refinement results in all cases. We have attempted to identify such a minimal set of refinement protocols, and have implemented them in a refinement pipeline.
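The "try every protocol in the minimal set and keep the best" strategy described above can be illustrated with a minimal sketch. Everything named here is hypothetical: `RefinementResult`, `run_protocols` and `pick_best` are illustration-only names, the protocol labels are invented, and each protocol is represented by a plain callable standing in for a real refinement run. The ranking rule (lowest Rfree, ties broken by the higher MolProbity percentile) is an assumption based on the two quality measures named in the text, not the criterion used by the pipeline itself.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class RefinementResult:
    """Outcome of one refinement protocol (hypothetical container)."""
    protocol: str
    r_free: float                 # lower is better
    molprobity_percentile: float  # higher is better


def run_protocols(
    protocols: Dict[str, Callable[[], RefinementResult]]
) -> List[RefinementResult]:
    """Run every candidate protocol and collect the results.

    Each callable is a stand-in for a real refinement run; no actual
    REFMAC5 or ProSMART invocation is shown here.
    """
    return [run() for run in protocols.values()]


def pick_best(results: List[RefinementResult]) -> RefinementResult:
    """Return the result with the lowest Rfree.

    Assumption: ties on Rfree are broken by the higher MolProbity
    percentile. The real pipeline may rank results differently.
    """
    if not results:
        raise ValueError("no refinement results to compare")
    return min(results, key=lambda r: (r.r_free, -r.molprobity_percentile))
```

In use, the caller would register one callable per protocol in the minimal set, pass the dictionary to `run_protocols`, and hand the collected results to `pick_best`. Keeping the ranking rule in a single function makes it easy to swap in a different criterion.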
We have tested various refinement strategies as well as different REFMAC5 and ProSMART parameters on a test set of more than 100 structures. We found that in cases where high-resolution homologues are available the best strategy is to first execute a REFMAC5 refinement run using external restraints generated by ProSMART, followed by a second round of refinement using only jelly-body restraints. The availability and the selection of appropriate high-resolution homologues are important for successful refinement using external restraints. Such homologues should have a local conformation sufficiently close to that of the true structure in the target low-resolution crystal, which is typically the case for proteins sharing at least 75% sequence identity. In cases where no homologues are available for a particular protein chain, external restraints representing backbone hydrogen bonds can improve refinement.
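The choice between homologue-based restraints and hydrogen-bond restraints described above can be sketched as a simple filter. This is a minimal illustration under stated assumptions: `Homologue`, `select_homologues` and `choose_restraint_source` are hypothetical names, and the 75% figure, which the text gives only as a typical value, is treated here as a hard cutoff. Sequence identities are assumed to have been computed elsewhere.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class Homologue:
    """A candidate high-resolution homologue (hypothetical container)."""
    name: str
    sequence_identity: float  # percent identity to the target chain


def select_homologues(
    candidates: Sequence[Homologue], min_identity: float = 75.0
) -> List[Homologue]:
    """Keep candidates at or above the identity cutoff, best first.

    Assumption: 75% is used as a hard threshold for illustration only.
    """
    usable = [h for h in candidates if h.sequence_identity >= min_identity]
    return sorted(usable, key=lambda h: h.sequence_identity, reverse=True)


def choose_restraint_source(
    candidates: Sequence[Homologue], min_identity: float = 75.0
) -> Tuple[str, List[Homologue]]:
    """Decide which kind of external restraints to generate for a chain.

    Returns ("homologue", usable_homologues) when at least one candidate
    passes the cutoff, otherwise ("hydrogen_bond", []) to indicate the
    fallback to backbone hydrogen-bond restraints.
    """
    usable = select_homologues(candidates, min_identity)
    if usable:
        return ("homologue", usable)
    return ("hydrogen_bond", [])
```

The decision is made per protein chain, matching the text's remark that homologues may be missing for a particular chain. A real selection step would probably also weigh the resolution of each homologue, which this sketch leaves out.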