Research Article: Using Phaser and ensembles to improve the performance of SIMBAD

Date Published: January 01, 2020

Publisher: International Union of Crystallography

Author(s): Adam J. Simpkin, Felix Simkovic, Jens M. H. Thomas, Martin Savko, Andrey Lebedev, Ville Uski, Charles C. Ballard, Marcin Wojdyr, William Shepard, Daniel J. Rigden, Ronan M. Keegan.


Improvements to the sensitivity of the search for suitable molecular-replacement search models in SIMBAD through the use of Phaser and an ensemble-based model database are reported.

Partial Text

Molecular replacement (MR) remains the most popular method to solve the phase problem in macromolecular crystallography since it is quick, inexpensive and often highly automated (Evans & McCoy, 2008; Long et al., 2008; Scapin, 2013). Conventional MR exploits the fact that evolutionarily related macromolecules tend to be structurally similar. Therefore, when correctly placed within the asymmetric unit of a target structure, a homologous protein can provide a sufficiently accurate approximation of the phases to solve the unknown structure (Rossmann, 1972, 1990). As evolutionarily related molecules are likely to have similar protein sequences, sequence similarity often provides a quick and easy route to identify suitable homologues for MR. However, search models selected by sequence similarity can give poor results for a number of reasons. These include cases where only distant, low-sequence-identity homologues can be identified, which can often be too structurally divergent from the target. Even where high-sequence-identity homologues are available, they may have crystallized in different conformational states and hence prove too structurally distinct to succeed. Another possibility is that a contaminant protein has crystallized in place of the target protein.

The methodology for SIMBAD has been described in detail previously (Simpkin et al., 2018). In outline, a lattice-parameter search is followed by the screening of a small database of common contaminants and then the MoRDa database using the AMoRe rotation function. These three elements can be run singly or sequentially.
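The three-stage outline can be illustrated with a minimal Python sketch. This is not SIMBAD's code or API: the stage functions, their return values and the stop-at-first-hit behaviour are all hypothetical placeholders, chosen only to show how stages might be run singly or in sequence.

```python
from typing import Callable, List, Optional, Tuple

# Hypothetical stage signature (an assumption, not SIMBAD's API): a stage takes
# the path to the reflection data and returns the name of a search model that
# worked, or None if nothing suitable was found.
Stage = Callable[[str], Optional[str]]


def lattice_search(mtz_path: str) -> Optional[str]:
    """Placeholder for the lattice-parameter search."""
    return None


def contaminant_search(mtz_path: str) -> Optional[str]:
    """Placeholder for screening the small contaminant database."""
    return None


def morda_search(mtz_path: str) -> Optional[str]:
    """Placeholder for screening the MoRDa database."""
    return "hypothetical_domain_model"


ALL_STAGES: List[Tuple[str, Stage]] = [
    ("lattice", lattice_search),
    ("contaminant", contaminant_search),
    ("morda", morda_search),
]


def run_stages(
    mtz_path: str, stages: List[Tuple[str, Stage]]
) -> Optional[Tuple[str, str]]:
    """Run the given stages in order and stop at the first one reporting a model.

    Stopping at the first hit is an assumption made for this sketch.
    """
    for stage_name, stage in stages:
        model = stage(mtz_path)
        if model is not None:
            return stage_name, model
    return None
```

Passing `ALL_STAGES` runs the stages sequentially, while passing a one-element slice such as `ALL_STAGES[1:2]` runs a single stage on its own.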

The implementation of the Phaser fast rotation search in SIMBAD has proved to be significantly better at detecting suitable search models than the version using AMoRe. The use of ensembles in SIMBAD has further increased the sensitivity of the rotation function when screening the MoRDa database. Together, these changes have allowed us to obtain greater rotation-function Z-score (RFZ) values for suitable search models, increasing the likelihood of a solution and of early termination.
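To make the link between Z-scores and early termination concrete, the following Python sketch scores each candidate model by a generic Z-score (the top rotation-function value relative to the mean and standard deviation of all sampled values for that model) and stops screening once a cutoff is reached. It is illustrative only: Phaser's own RFZ calculation may differ in detail, the text above does not state the termination criterion that SIMBAD uses, and the function names, input format and cutoff are assumptions.

```python
from statistics import mean, pstdev
from typing import Iterable, List, Optional, Tuple


def peak_z_score(values: List[float]) -> float:
    """Z-score of the highest value relative to the whole sample.

    A generic Z-score; not necessarily how Phaser computes its RFZ.
    """
    sigma = pstdev(values)
    if sigma == 0.0:
        # A flat rotation function carries no signal.
        return 0.0
    return (max(values) - mean(values)) / sigma


def screen_models(
    models: Iterable[Tuple[str, List[float]]], cutoff: float
) -> Tuple[Optional[str], List[Tuple[str, float]]]:
    """Score models in order and stop early once one reaches the cutoff.

    `models` yields (name, rotation-function values) pairs. The cutoff is a
    hypothetical parameter of this sketch, not a value taken from SIMBAD.
    Returns the model that triggered early termination (or None) and the
    models scored so far, sorted by descending Z-score.
    """
    scored: List[Tuple[str, float]] = []
    hit: Optional[str] = None
    for name, values in models:
        z = peak_z_score(values)
        scored.append((name, z))
        if z >= cutoff:
            hit = name
            break  # early termination: remaining models are never scored
    scored.sort(key=lambda item: item[1], reverse=True)
    return hit, scored
```

Because screening stops at the first model to reach the cutoff, any increase in the Z-scores obtained for suitable models directly reduces the number of database entries that need to be tested.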

The Phaser fast rotation search in SIMBAD, and the ensemble search models that it makes it possible to use, each significantly improve the effectiveness of the pipeline. Together with an early-termination function, they allow SIMBAD to identify suitable search models in the MoRDa database both more readily and more quickly, and thereby to solve a wider range of cases more efficiently in a sequence-independent fashion.



