Research Article: Middle-way flexible docking: Pose prediction using mixed-resolution Monte Carlo in estrogen receptor α

Date Published: April 23, 2019

Publisher: Public Library of Science

Author(s): Justin Spiriti, Sundar Raman Subramanian, Rohith Palli, Maria Wu, Daniel M. Zuckerman, Emilio Gallicchio.


There is a vast gulf between the two primary strategies for simulating protein-ligand interactions. Docking methods significantly limit or eliminate protein flexibility to gain great speed at the price of uncontrolled inaccuracy, whereas fully flexible atomistic molecular dynamics simulations are expensive and often suffer from limited sampling. We have developed a flexible docking approach geared especially for highly flexible or poorly resolved targets based on mixed-resolution Monte Carlo (MRMC), which is intended to offer a balance among speed, protein flexibility, and sampling power. The binding region of the protein is treated with a standard atomistic force field, while the remainder of the protein is modeled at the residue level with a Gō model that permits protein flexibility while saving computational cost. Implicit solvation is used. Here we assess three facets of the MRMC approach with implications for other docking studies: (i) the role of receptor flexibility in cross-docking pose prediction; (ii) the use of non-equilibrium candidate Monte Carlo (NCMC) and (iii) the use of pose-clustering in scoring. We examine 61 co-crystallized ligands of estrogen receptor α, an important cancer target known for its flexibility. We also compare the performance of the MRMC approach with Autodock smina. Adding protein flexibility, not surprisingly, leads to significantly lower total energies and stronger interactions between protein and ligand, but notably we document the important role of backbone flexibility in the improvement. The improved backbone flexibility also leads to improved performance relative to smina. Somewhat unexpectedly, our implementation of NCMC leads to only modestly improved sampling of ligand poses. Overall, the addition of protein flexibility improves the performance of docking, as measured by energy-ranked poses, but we do not find significant improvements based on cluster information or the use of NCMC. 
We discuss possible improvements for the model including alternative coarse-grained force fields, improvements to the treatment of solvation, and adding additional types of NCMC moves.

Partial Text

Computational structure-based drug design can play an important role in drug development, as exemplified in the development of inhibitors of HIV protease, which have a major impact on treatment for people living with HIV. [1, 2] Because virtual screening methods have the potential to reduce the cost and time associated with drug development, a multitude of them have been developed to screen potential drug candidates and prioritize possible structures for synthesis. [3–6] Some of the most popular docking methods represent the potential energy due to the receptor using grids and optimize the ligand conformation with respect to this potential. These include DOCK, [7, 8] Autodock Vina, [9] the related smina, [10] Schrödinger Glide, [11–13] CDOCKER, [14] and others. [15] Once a grid is constructed, however, the protein conformation represented by that grid is fixed. This is a serious approximation: a number of structural studies have shown that “hidden” protein conformations and protein flexibility play important roles in protein-ligand binding, and methods that consider protein flexibility and multiple structures improve docking performance compared to those that make use of only one structure. [16–23]

In this paper, three separate tactics for improving protein-ligand docking were tested. These included a mixed-resolution potential in which most of the protein is treated using a coarse-grained model while the region around the ligand is treated atomistically; a non-equilibrium candidate Monte Carlo (NCMC) method, which is intended to improve sampling by systematically varying the coupling between protein and ligand; and the use of clustering to identify free energy basins corresponding to multiple binding poses, with the poses scored based on this information. The docking results were evaluated by comparing the final structures to known crystal structures; the sampling was also evaluated by studying the relationship between acceptance rate and move size for NCMC moves. We found that allowing for full protein flexibility using the mixed-resolution potential significantly improved the docking results, and that the use of NCMC produced a further modest improvement. However, clustering did not appear to offer any significant advantages over simply ranking the poses by protein-ligand interaction energy. Overall, we obtained a correct pose within 2 Å for about half the ligands, so there is significant room for improvement. Below, we discuss ways that several aspects of the approach could be improved. We note that systematic investigation of these different aspects of the docking problem is facilitated by having a highly flexible/adjustable Monte Carlo platform.
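The evaluation described above (rank poses by protein-ligand interaction energy, then compare the chosen pose to the crystal structure with a 2 Å cutoff) can be illustrated with a minimal sketch. It assumes that poses are already in the receptor frame, that atoms are listed in the same order in every pose, and that no symmetry correction is applied; this excerpt does not specify those details of the authors' protocol. The function names `rmsd` and `top_pose_is_correct` are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally ordered atom lists.

    No superposition is performed (assumption: both poses share the
    receptor's coordinate frame, as is common in docking evaluation).
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    squared = sum(
        (a - b) ** 2
        for atom_a, atom_b in zip(coords_a, coords_b)
        for a, b in zip(atom_a, atom_b)
    )
    return math.sqrt(squared / len(coords_a))


def top_pose_is_correct(poses, crystal, cutoff=2.0):
    """Pick the lowest-energy pose and test it against the crystal pose.

    poses   -- list of (interaction_energy, coords) tuples
    crystal -- coords of the crystallographic ligand pose
    cutoff  -- success threshold in angstroms (2 A, as in the text)
    """
    _, best_coords = min(poses, key=lambda pose: pose[0])
    return rmsd(best_coords, crystal) <= cutoff
```

Applying `top_pose_is_correct` to each ligand and averaging the results gives a success fraction of the kind quoted in the text ("about half the ligands"), under the assumptions listed above.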

We used a highly adjustable mixed-resolution Monte Carlo (MRMC) platform to examine several aspects of docking protocols in a systematic way. Most importantly, we examined the effects of rigidifying both side chains and the protein backbone. The detrimental results are not completely surprising, but the systematic comparison underscores the importance of backbone flexibility, which is absent from almost all grid-based docking studies. We further examined the sampling improvement afforded by non-equilibrium candidate Monte Carlo (NCMC), finding only modest improvement in our implementation. Our test case was the flexible ligand binding domain of estrogen receptor α, which is an important cancer target and also a model for other nuclear hormone receptors.
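To make the idea of an NCMC move concrete, the following is a toy one-dimensional sketch, not the authors' implementation. It assumes a hypothetical double-well "ligand" energy scaled by a coupling parameter `lam`, a linear switching schedule, and plain Metropolis steps for relaxation. The coupling is switched off in stages, the decoupled coordinate is displaced, the coupling is switched back on in mirror order, and the whole trajectory is accepted with probability min(1, exp(-beta * W)), where W is the accumulated protocol work. All names and parameter values are illustrative assumptions.

```python
import math
import random


def coupled_energy(x, lam):
    """Toy energy: a double well in x, scaled by the coupling lam (hypothetical)."""
    return lam * 5.0 * (x * x - 1.0) ** 2


def relax(x, lam, beta, rng, n_steps, step=0.2):
    """Ordinary Metropolis steps at fixed coupling (the propagation part)."""
    for _ in range(n_steps):
        trial = x + rng.uniform(-step, step)
        delta = coupled_energy(trial, lam) - coupled_energy(x, lam)
        if delta <= 0.0 or rng.random() < math.exp(-beta * delta):
            x = trial
    return x


def ncmc_move(x0, beta, rng, n_switch=10, n_relax=5, jump=2.0):
    """Return (x_new, work, accepted) for one toy NCMC proposal."""
    schedule = [i / n_switch for i in range(n_switch + 1)]  # 0.0 ... 1.0
    x, lam, work = x0, 1.0, 0.0
    # Switch the coupling off: perturb lam, then relax at the new lam.
    for new_lam in reversed(schedule[:-1]):
        work += coupled_energy(x, new_lam) - coupled_energy(x, lam)
        lam = new_lam
        x = relax(x, lam, beta, rng, n_relax)
    # Decoupled coordinate: a symmetric displacement costs no energy at lam = 0.
    x += rng.uniform(-jump, jump)
    # Switch the coupling back on in mirror order: relax, then perturb lam.
    for new_lam in schedule[1:]:
        x = relax(x, lam, beta, rng, n_relax)
        work += coupled_energy(x, new_lam) - coupled_energy(x, lam)
        lam = new_lam
    # Accept or reject the whole trajectory using the protocol work.
    accepted = work <= 0.0 or rng.random() < math.exp(-beta * work)
    return (x if accepted else x0), work, accepted
```

A call such as `ncmc_move(1.0, beta=1.0, rng=random.Random(0))` attempts to carry the coordinate from one well toward the other. Counting accepted moves over many calls while varying `jump` gives an acceptance-rate versus move-size curve of the kind the text describes using to evaluate sampling.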