Date Published: May 01, 2018
Publisher: International Union of Crystallography
Author(s): Keitaro Yamashita, Kunio Hirata, Masaki Yamamoto.
An automated data-processing pipeline for protein microcrystals is presented. The pipeline makes the processing of multiple small-wedge data sets dramatically easier.
The major difficulty in solving protein structures by X-ray crystallography is obtaining crystals of suitable size and sufficient diffracting power. Recently, X-ray microbeams have become available at synchrotron light sources and have enabled data collection from microcrystals at high signal-to-noise ratios (Smith et al., 2012; Owen et al., 2016; Yamamoto et al., 2017). The preparation of microcrystals is often easier than that of larger crystals, and smaller crystals may sometimes be better ordered and thus lead to higher quality data sets (Cusack et al., 1998; Evans et al., 2011). The lipidic mesophase method has facilitated the production of small but high-quality crystals of membrane proteins, which are typically difficult targets in protein crystallography (Caffrey, 2015). However, the radiation-dose limit, which is of the order of 10 MGy, is rapidly exceeded in microcrystals, and consequently multiple crystals are needed to overcome this problem. Serial femtosecond crystallography (SFX) using an X-ray free-electron laser (XFEL; Schlichting, 2015) is a very powerful approach in microcrystallography, based on the principle of ‘diffraction before destruction’. Since an ultra-intense femtosecond XFEL pulse only permits the collection of a single still diffraction image, thousands to tens of thousands of diffraction patterns are usually required for structural analysis. Although data collection from such a large number of crystals is fast at a high-repetition-rate XFEL, limited beamtime prevents its routine use. In contrast, radiation damage from synchrotron radiation cannot be completely avoided, but high-resolution diffraction images can be collected via the rotation method, where integrated Bragg intensities are experimentally obtained.
At cryogenic temperatures, small-wedge (5–10°) data collection can be a good compromise to obtain strong diffraction signals under tolerable doses, and tens to hundreds of data sets are usually sufficient to obtain a high-resolution structure. Efficient data collection can be achieved and easily automated when multiple microcrystals are held in a sample holder. MeshAndCollect, an automated multi-crystal data-collection workflow, has been developed at the ESRF (Zander et al., 2015). The recently developed ZOO system (Hirata et al., manuscript in preparation) at the microfocus beamline BL32XU at SPring-8 also supports the automatic collection of small-wedge data sets from loop-harvested microcrystals. These systems enabled rapid data collection from a large number of crystals. Multi-crystal data-collection strategies are advantageous over single-crystal data collection in obtaining accurate data which allow SAD phasing when correctly processed (Akey et al., 2014; Liu & Hendrickson, 2015; Huang et al., 2016; Olieric et al., 2016). Small-wedge data collection is also compatible with room-temperature experiments that utilize crystallization plates, without the need for tedious crystal harvesting and cooling, as well as possibly allowing the exploration of functionally relevant protein conformations (Fraser et al., 2011).
KAMO has been used for several structure analyses (Table 1). Processing notes describing the analysis of raw data that are publicly available can be found at https://github.com/keitaroyam/yamtbx/wiki. In this section, some processing examples are presented using the methods described here. The program versions used here were yamtbx commit 1107fc4 (as of 26 December 2017) with cctbx commit 16b2e7c (as of 17 April 2017); XDS 1 May 2016 (BUILT=20160617); CCP4 7.0 (update 042); PHENIX 1.11.1; SHELXC, SHELXD and SHELXE 2016/1, 2013/2 and 2016/3, respectively; and R 3.2.2. For data processing using XDS, diffraction images in the HDF5 format were read through the modified version of Takanori Nakane’s eiger2cbf (https://github.com/keitaroyam/eiger2cbf). The processing results may differ slightly from the original studies because of the use of different versions.
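Because results may depend on program versions, it can be useful to record the versions listed above in machine-readable form and compare them with a local installation before reprocessing. The following is a minimal sketch only and is not part of KAMO or yamtbx: the version strings are copied from the text, while the names `REPORTED_VERSIONS`, `version_differences` and `local_versions`, and the differing PHENIX version in the example, are hypothetical and chosen purely for illustration.

```python
# Program versions reported in the text, recorded as a simple manifest.
REPORTED_VERSIONS = {
    "yamtbx": "commit 1107fc4",
    "cctbx": "commit 16b2e7c",
    "XDS": "1 May 2016 (BUILT=20160617)",
    "CCP4": "7.0 (update 042)",
    "PHENIX": "1.11.1",
    "SHELXC": "2016/1",
    "SHELXD": "2013/2",
    "SHELXE": "2016/3",
    "R": "3.2.2",
}


def version_differences(reported, local):
    """Return {program: (reported, local)} for each program whose version differs.

    A program that is absent from `local` is listed with None as its local version.
    """
    return {
        name: (version, local.get(name))
        for name, version in reported.items()
        if local.get(name) != version
    }


# Hypothetical local environment: identical except for a different PHENIX version.
local_versions = dict(REPORTED_VERSIONS, PHENIX="1.13")
differences = version_differences(REPORTED_VERSIONS, local_versions)
print(differences)
```

In this sketch the version strings are compared as plain text, so any difference in formatting is reported as a mismatch; how the local versions are actually obtained from each program is left open.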
KAMO, an automated data-processing system for multiple microcrystals, has been developed and its use is spreading in facilities across the world. KAMO dramatically simplifies the processing of multiple small-wedge data sets, working with XDS and CCP4 programs. Data collection from a number of crystals was highly beneficial for successful SAD phasing (§3.2) and finding the high-quality subset of the whole data (§3.3), which were easily achieved through KAMO.