Research Article: Supercell refinement: a cautionary tale

Date Published: September 01, 2019

Publisher: International Union of Crystallography

Author(s): Jeffrey Lovelace, Václav Petrícek, Garib Murshudov, Gloria E. O. Borgstahl.


A higher dimensional superspace description accounts for an unexpected supercell refinement result.

Partial Text

On occasion, a diffraction pattern is observed that consists of many intense reflections interspersed with many weaker ‘satellite’ reflections. Indexing software may find a good fit for the main (most intense) reflections but fail to index the weaker reflections. In some of these cases, the subcell found for the main reflections can be extended in integer multiples along one or more of its dimensions, forming a supercell, so that all of the reflections can be correctly indexed (Fig. 1). When the satellites can be indexed together with the main reflections in this manner, the diffraction data are called ‘commensurately modulated’; when they cannot, the data are ‘incommensurate’. One way to view this is that the subcell, determined from the main reflections alone, describes an average of what is occurring in the structure, while the weaker reflections describe the more complex displacement that occurs in each subcell over the supercell. One approach to solving this type of problem is to use the main reflections to arrive at an average solution, and then extend this average solution into a supercell and refine the resulting supercell against all of the reflections. While testing this approach with simulated commensurate data, the expected outcome was that the refined positions, starting from the average position, would match the correct positions that were used to create the reflections for the simulation. After verifying that the approach would work with commensurate data, the plan was then to investigate how commensurate approximations of incommensurate data refine and to see how closely the resulting modulation functions match the incommensurate functions (example described below). The refinement had excellent statistics, and initially it was thought to have worked; however, this was not the case. On closer inspection, the expected positions (circles in Fig. 2) did not match the refined positions (crosses in Fig. 2). The result appeared to be shifted in some way, and this is where the mystery began. Until this issue was resolved, it held up the application of this approach to the ‘real-world’ incommensurate case that we aimed to solve.
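The relationship between subcell and supercell indexing can be illustrated with a short sketch. It assumes a one-dimensional modulation along b* with a rational vector q = (p/n)b*, so that the supercell is n subcells long and a reflection with subcell index k and satellite order m lands on supercell index K = nk + pm. The function names and the example value q = 2/7 are illustrative choices, not part of any indexing package.

```python
from fractions import Fraction

def supercell_index(k: int, m: int, q: Fraction) -> int:
    """Supercell index K along b* for subcell index k and satellite order m.

    Assumes q = (p/n) b*, so the supercell reciprocal axis is b*/n.
    """
    return k * q.denominator + m * q.numerator

def classify(K: int, q: Fraction, max_order: int = 1):
    """Return (k, m) with the lowest |m| <= max_order that reproduces K.

    Returns None when no satellite order up to max_order fits.
    """
    n, p = q.denominator, q.numerator
    for m in sorted(range(-max_order, max_order + 1), key=abs):
        if (K - m * p) % n == 0:
            return (K - m * p) // n, m
    return None

q = Fraction(2, 7)                 # example value only
main = supercell_index(1, 0, q)    # main reflection k = 1
sat = supercell_index(1, 1, q)     # first-order satellite of k = 1
print(main, sat, classify(sat, q), classify(5, q))
```

In this picture the main reflections are the supercell indices divisible by n, and everything else is a satellite of some order. When q cannot be written as a ratio of small integers, no finite n works, which corresponds to the incommensurate case.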

Test structures and simulated diffraction data were made to allow researchers to study refinement strategies for modulated data sets in a controlled setting (Lovelace et al., 2013). The test data were created from a modified form of the ToxD structure [PDB entry 1dtx; Fig. 4(a)]. The ToxD monomer was broken into three chains. Chain B (residues 31–38 of the original molecule), which was located against the solvent channel, was renumbered and translated out into the solvent channel and, to avoid collisions, residues 1, 4, 7 and 8 were mutated to alanines using Coot (Emsley et al., 2010). The coordinates were extended to a 7× supercell, and chain B was modulated rotationally around an axis defined by the Cα atom of the second residue and the Cβ atom of the eighth residue of chain B [Fig. 4(b)]. The amount of modulation was determined by the location along the y direction of the center of mass of chain B, with a maximum rotation of ±15° [Figs. 4(c) and 4(d)]. The modulating rotation was carried out using Matlab. The modulation vector for the test diffraction data set was set to q = (2/7)b*, that is, two modulation waves every seven unit cells. In other words, each subcell of the supercell had chain B in a different rotated orientation based on its position within the supercell. The average structure was found using Phaser (McCoy et al., 2007) to place chains A, B and C into the subcell using only the main reflections. The starting supercell structure for refinement was a 7× expansion of this average structure. The supercell was refined with REFMAC (Murshudov et al., 2011) using the following settings: restrained refinement for 40 cycles with jelly body enabled and set to 0.020. A zip archive file containing all of the starting models, reflections (mtz) and refined models is available as supporting information and can also be obtained by contacting the corresponding author.
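A rough sketch of how such a rotational modulation could be generated is given below. It is not the authors' Matlab code: the text states only that the rotation depends on the y position of the center of mass of chain B, with a maximum of ±15° and q = (2/7)b*, so the sinusoidal form, the phase offset t0, the function names and the example value of y0 are all assumptions made for illustration.

```python
import math

Q_B = 2.0 / 7.0   # component of q along b*, from the text
MAX_DEG = 15.0    # maximum rotation in degrees, from the text

def modulation_angle(y_com, t0=0.0):
    """Rotation angle (degrees) for a chain whose center of mass is at y_com.

    y_com is in subcell units. The sine form and t0 are assumptions.
    """
    return MAX_DEG * math.sin(2.0 * math.pi * (Q_B * y_com + t0))

def rotate_about_axis(point, axis_a, axis_b, angle_deg):
    """Rotate a 3D point about the line through axis_a and axis_b (Rodrigues)."""
    axis = [b - a for a, b in zip(axis_a, axis_b)]
    norm = math.sqrt(sum(c * c for c in axis))
    k = [c / norm for c in axis]
    v = [p - a for p, a in zip(point, axis_a)]
    theta = math.radians(angle_deg)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    k_cross_v = [k[1] * v[2] - k[2] * v[1],
                 k[2] * v[0] - k[0] * v[2],
                 k[0] * v[1] - k[1] * v[0]]
    k_dot_v = sum(a * b for a, b in zip(k, v))
    rotated = [v[i] * cos_t + k_cross_v[i] * sin_t + k[i] * k_dot_v * (1.0 - cos_t)
               for i in range(3)]
    return [r + a for r, a in zip(rotated, axis_a)]

y0 = 0.25  # hypothetical center-of-mass y of chain B in the first subcell
angles = [modulation_angle(y0 + cell) for cell in range(7)]
print([round(a, 2) for a in angles])
```

Each atom of chain B in subcell n would then be passed through rotate_about_axis with the angle for that subcell, using the Cα and Cβ atoms named in the text as the two axis points. The angle pattern repeats after seven subcells, which is what makes the 7× supercell an exact description.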

The structure solution was performed in two stages. First, the average structure was solved by molecular replacement with Phaser (McCoy et al., 2007) using only the main reflections, and was refined with REFMAC in the corresponding subcell. Second, the average solution was expanded into a supercell (7× in y in this case) and refined against all reflections indexed on the supercell. This approach was taken, as opposed to directly performing molecular replacement against the entire supercell, because it more closely mirrors the formulation of the superspace theory, in which atoms are described mathematically as having an average position that is perturbed by an atomic modulation function (AMF). When the atoms of a modulated structure in a supercell are plotted as a displacement from their average position as a function of their t value in superspace, the resulting seven points (for the supercell used in this paper) on the graph provide an approximation of the AMF [black line in Fig. 5(a)]. For all graphs (Figs. 2, 5 and 6), the lines represent the AMFs and the circles represent the correct positions that the atoms in the supercell occupy on the AMF. The starting state for the refinement [crosses in Fig. 5(a)] has all atoms on a flat line along x1 at zero displacement because the average structure is in the same position in each subcell of the initial supercell structure.
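To see where the seven points of such a t plot fall, the following sketch computes the modulation phase sampled by each subcell. It assumes that the phase of subcell n is t0 + q·n taken modulo 1, with the contribution of the average position inside the subcell folded into t0; the function name and the default t0 = 0 are illustrative.

```python
from fractions import Fraction

def t_values(q: Fraction, t0: Fraction = Fraction(0)):
    """Modulation phase (mod 1) sampled by each subcell of the supercell.

    Assumes a supercell of q.denominator subcells along the modulated axis.
    """
    n_cells = q.denominator
    return [(t0 + q * cell) % 1 for cell in range(n_cells)]

q = Fraction(2, 7)
phases = t_values(q)
for cell, t in enumerate(phases):
    print(f"subcell {cell}: t = {t}")
print("sorted:", [str(t) for t in sorted(phases)])
```

For q = 2/7 the subcells visit the phases 0, 2/7, 4/7, 6/7, 1/7, 3/7 and 5/7 in that order, so the seven atoms sample the AMF at seven evenly spaced t values. Changing t0 moves all seven sample points along the curve together.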

In conclusion, we have revealed that the refined supercell model may not end up in the true atomic positions of the modulated structure owing to the availability of multiple 3D daughter space groups. Using the refined positions of the supercell to fit AMFs should result in approximate AMFs of good enough quality to test whether phase-shifting the atomic positions of the supercell provides a better structural solution. Software tools such as Jana2006 or the Superspace Group Finder website can be used to find the appropriate (3+1)D to 3D daughter space group options for testing phase shifts in refinement. For supercell structures, it may be useful to study the atomic positions as plotted in superspace t plots to gain more insight into the underlying mechanisms of the displacement. Additionally, for supercells, the jelly-body refinement option (or any option like jelly-body refinement in your refinement software of choice) should always be enabled to prevent the model from attempting to refine two solutions simultaneously. In future work, we will employ these methods and observations in the refinement of incommensurately modulated profilin–actin (Lovelace et al., 2008).
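The idea of fitting an approximate AMF to refined supercell positions and then phase-shifting it can be sketched as follows. This is a simplified illustration, not the procedure used in Jana2006: it assumes a single-harmonic AMF for one coordinate, seven evenly spaced t samples, and an arbitrary trial shift of 1/14, and the displacement values are synthetic.

```python
import math

def fit_first_harmonic(t, u):
    """Fit u(t) = mean + a*cos(2*pi*t) + b*sin(2*pi*t).

    Uses discrete Fourier sums, which give the least-squares fit when the
    t samples are evenly spaced over one period (three or more points).
    """
    n = len(t)
    mean = sum(u) / n
    a = 2.0 / n * sum(ui * math.cos(2.0 * math.pi * ti) for ti, ui in zip(t, u))
    b = 2.0 / n * sum(ui * math.sin(2.0 * math.pi * ti) for ti, ui in zip(t, u))
    return mean, a, b

def amf(t, mean, a, b):
    """Evaluate the fitted single-harmonic modulation function at phase t."""
    return mean + a * math.cos(2.0 * math.pi * t) + b * math.sin(2.0 * math.pi * t)

# Synthetic displacements standing in for refined-minus-average positions.
t_samples = [j / 7.0 for j in range(7)]
u_refined = [amf(t, 0.1, 0.3, 0.2) for t in t_samples]

mean, a, b = fit_first_harmonic(t_samples, u_refined)
trial_shift = 1.0 / 14.0  # hypothetical phase shift to test
u_shifted = [amf(t + trial_shift, mean, a, b) for t in t_samples]
print(round(mean, 3), round(a, 3), round(b, 3))
```

The shifted displacements could be added back to the average positions to build an alternative starting supercell for refinement. Which shifts are worth testing depends on the daughter space groups available for the superspace group in question.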



