Date Published: July 01, 2020
Publisher: International Union of Crystallography
Author(s): Thorsten Wagner, Luca Lusnig, Sabrina Pospich, Markus Stabrin, Fabian Schönfeld, Stefan Raunser.
Two approaches for the selection of filaments from cryo-EM micrographs are described.
Many proteins of biological and medical relevance form filaments. Prominent examples are cytoskeletal proteins such as microtubules and actin, which are essential for many cellular functions, including muscle contraction and cargo transport (Pospich & Raunser, 2018). Further examples are amyloid and tau fibrils, which are involved in neurodegenerative diseases and have recently been the focus of many structural studies (Fitzpatrick et al., 2017; Pospich & Raunser, 2017). As filaments are, in general, reluctant to crystallize, cryo-EM is the method of choice to study their structure, as illustrated by the increasing number of deposited helical structures (https://www.ebi.ac.uk/pdbe/emdb/statistics_emmethod.html).
Particle picking is a crucial step in the cryo-EM processing pipeline. Picking helical specimens is particularly challenging because crossings, overlaps and filaments that lie too close to each other need to be omitted, a challenge that is well illustrated by the data sets used as examples in this work. Here, we present two picking procedures that allow accurate picking of filaments even for complex data sets, as shown for TMV and F-actin. One is based on the deep-learning particle-picking procedure crYOLO; the other, called STRIPER, is based on a line-detection algorithm. To our knowledge, crYOLO is the first deep-learning-based particle-picking procedure that supports filaments. With this deep-learning approach, crYOLO learns the spatial context of filaments, enabling it to omit carbon edges and filaments that are too dense or broken, and thereby to reproduce the accuracy of manual picking. As no general model for filaments is yet available, crYOLO requires training on manually selected micrographs. In contrast, STRIPER requires only four parameters, which can be determined quickly by the integrated optimization procedure. While this makes STRIPER a very fast and easily accessible picker, it comes at the cost of reduced precision compared with crYOLO. However, we also showed that using a binary mask significantly improves the precision of STRIPER, resulting in high-quality 2D classes. Both picking procedures produce box files that are compatible with the majority of cryo-EM processing software, and both run on standard hardware. Considering their performance and accessibility, we believe that both procedures are seminal contributions to the cryo-EM field.
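The requirement to omit crossings and filaments that lie too close to each other can be illustrated with a minimal Python sketch. This is not the crYOLO or STRIPER implementation; the function name `drop_crowded_boxes`, the `min_distance` parameter and the example coordinates are hypothetical and chosen only for illustration. The sketch assumes that each filament has already been traced and represented as a list of box centres in pixel coordinates.

```python
import math
from typing import Dict, List, Tuple

Point = Tuple[float, float]


def drop_crowded_boxes(
    filaments: Dict[int, List[Point]], min_distance: float
) -> Dict[int, List[Point]]:
    """Remove box centres that lie closer than min_distance to any box
    centre belonging to a different filament (hypothetical helper)."""
    kept: Dict[int, List[Point]] = {}
    for fid, boxes in filaments.items():
        # Collect the box centres of all other filaments.
        others = [
            p
            for other_id, pts in filaments.items()
            if other_id != fid
            for p in pts
        ]
        # Keep a box only if every foreign box is far enough away.
        kept[fid] = [
            b for b in boxes
            if all(math.dist(b, o) >= min_distance for o in others)
        ]
    return kept


# Two made-up filaments that cross at (50, 50); coordinates in pixels.
example = {
    0: [(0.0, 50.0), (25.0, 50.0), (50.0, 50.0), (75.0, 50.0), (100.0, 50.0)],
    1: [(50.0, 0.0), (50.0, 25.0), (50.0, 50.0), (50.0, 75.0), (50.0, 100.0)],
}
cleaned = drop_crowded_boxes(example, min_distance=20.0)
```

With a threshold of 20 pixels, only the box at the crossing point is removed from each filament; a larger threshold would also remove the neighbouring boxes. A real picker has to make this decision from image content alone, which is where learned spatial context or a binary mask helps.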
The filament mode of crYOLO has been available since crYOLO version 1.3. CrYOLO itself is free for academic use and can be downloaded together with the source code at http://sphire.mpg.de/. CrYOLO is implemented in Python and based on Keras 2.2.5 (https://keras.io/) using TensorFlow 1.10.1 (Abadi et al., 2016) as the backend. The version of STRIPER used in this paper was first implemented as an ImageJ (Rueden et al., 2017) plugin and is open source. The source code can be found at https://github.com/MPI-Dortmund/ij_striper. A Python implementation of STRIPER is currently being developed and the alpha version can be found at https://github.com/MPI-Dortmund/striper. All data supporting the findings of this study are available from the corresponding author on reasonable request.