Date Published: April , 2009
Publisher: A.I. Gordeyev
Author(s): T. V. Pyrkov, D. V. Pyrkova, E. D. Balitskaya, R. G. Efremov.
The biological function of proteins is closely connected to interactions with their ligands and substrates. Proteins acting as receptors and enzymes bind these small molecules. Knowledge of the molecular mechanisms of protein-ligand interactions, and particularly of the spatial structure of the protein-ligand complex, is a prerequisite for understanding the structure-functional properties of proteins and their role in biochemical pathways in the living cell. The availability of such a structure serves as a basis for rational drug design projects and greatly assists the search for new inhibitors (the ligands of certain protein targets in an organism).
Of all the various types of interactions in biomolecular complexes (such as hydrogen bonds, salt bridges, etc.), the stacking of aromatic groups deserves special attention. Most drugs include aromatic fragments in their chemical structure, and stacking often plays a notable role in their recognition by protein targets. We have recently shown that an explicit account of stacking in scoring functions increases the efficiency of ATP docking. The aromatic interactions were identified by the mutual orientation of two rings, described by three geometrical parameters: the height h and displacement d of one ring relative to the other, and the angle α between their planes (Fig. 1).
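As an illustration of these three parameters, the following minimal sketch computes h, d and α for two aromatic rings given as ordered lists of atom coordinates. It is not the authors' implementation. The function names, the choice of the first ring as the reference plane, and the folding of α into the range 0 to 90 degrees are all assumptions made for this example.

```python
import math

def centroid(points):
    """Geometric centre of a list of (x, y, z) points."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(3))

def ring_normal(points):
    """Unit normal of a ring (Newell's method).

    Assumes the atoms are listed in order around the ring.
    """
    pts = list(points)
    nx = ny = nz = 0.0
    for (x1, y1, z1), (x2, y2, z2) in zip(pts, pts[1:] + pts[:1]):
        nx += (y1 - y2) * (z1 + z2)
        ny += (z1 - z2) * (x1 + x2)
        nz += (x1 - x2) * (y1 + y2)
    length = math.sqrt(nx * nx + ny * ny + nz * nz)
    return (nx / length, ny / length, nz / length)

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def stacking_parameters(ring_a, ring_b):
    """Return (h, d, alpha_degrees) of ring_b relative to ring_a.

    Assumption: h is the distance of the centre of ring_b from the plane
    of ring_a, d is the in-plane offset between the two centres, and
    alpha is the angle between the ring planes, folded into [0, 90].
    """
    ca, cb = centroid(ring_a), centroid(ring_b)
    na, nb = ring_normal(ring_a), ring_normal(ring_b)
    v = tuple(b - a for a, b in zip(ca, cb))
    h = abs(dot(v, na))
    d = math.sqrt(max(dot(v, v) - h * h, 0.0))
    cos_alpha = min(1.0, abs(dot(na, nb)))
    return h, d, math.degrees(math.acos(cos_alpha))
```

A pose could then be flagged as stacked when h, d and α fall inside chosen ranges; those thresholds are not given in this section and are deliberately left out of the sketch.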
We demonstrated the efficiency of an explicit account of stacking interactions in a scoring function by reranking the results of GTP docking to 14 different proteins that bind this ligand. All structures of GTP-protein complexes generated by the docking procedure were labeled as either correct or misleading according to the root-mean-square deviation (rmsd) of the guanine atoms from the reference X-ray structure (see Methods). To estimate the validity of a docking pose, we constructed a number of scoring criteria in the form of a linear combination of interaction terms. To do that, all GTP-protein complexes were divided into two equal groups: the training and the test sets. The weighting coefficients for these terms were fitted by linear regression to a binary function that took the value 1 for correct docking poses and 0 for misleading ones, using the seven complexes of the training set. To test the robustness of the new scoring functions, a leave-one-out cross-validation procedure was performed. The relative error of all weighting coefficients of the interaction terms was less than 30%, indicating the reliability of the results.
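The fitting and cross-validation steps can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the interaction terms are passed in as plain feature tuples, an intercept is added (the source does not say whether one was used), the leave-one-out loop drops all poses of one complex at a time, and the "relative error" is taken to be the standard deviation of each coefficient over the refits divided by the absolute value of its mean. All function names are hypothetical.

```python
import statistics

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def fit_weights(features, labels):
    """Least-squares weights via the normal equations; intercept is last."""
    rows = [list(f) + [1.0] for f in features]
    k = len(rows[0])
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    aty = [sum(r[i] * y for r, y in zip(rows, labels)) for i in range(k)]
    return solve(ata, aty)

def score(weights, feature):
    """Linear combination of interaction terms plus the intercept."""
    return sum(w * x for w, x in zip(weights, feature)) + weights[-1]

def leave_one_out(groups):
    """Refit with each complex left out in turn.

    groups maps a complex id to a list of (feature_tuple, label) poses,
    where label is 1 for a correct pose and 0 for a misleading one.
    """
    refits = {}
    for left_out in groups:
        feats, labels = [], []
        for cid, poses in groups.items():
            if cid != left_out:
                for f, y in poses:
                    feats.append(f)
                    labels.append(y)
        refits[left_out] = fit_weights(feats, labels)
    return refits

def relative_spread(weight_sets):
    """Per-coefficient standard deviation divided by |mean| over the refits."""
    out = []
    for values in zip(*weight_sets):
        mean = statistics.mean(values)
        sd = statistics.pstdev(values)
        out.append(sd / abs(mean) if mean != 0 else float("inf"))
    return out
```

Poses can then be reranked by sorting on `score(weights, feature)`, and the output of `relative_spread` compared against a tolerance such as the 30% figure quoted above.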
The analysis of structural data for complexes of proteins with ligands containing an adenine or guanine moiety yielded a more accurate definition of the geometrical parameters of stacking interactions with the aromatic side chains of receptor amino acids. Reranking the results of GTP docking demonstrated that an explicit account of stacking in scoring criteria provides a more efficient estimation of the reliability of the structure of a protein-ligand complex predicted with molecular modeling approaches. The obtained results can be further applied to a broader class of nucleobase-containing ligands.
The structures of complexes of proteins with adenine- and guanine-containing ligands were taken from the Brookhaven Protein Data Bank (PDB). The PDBlig web server was used to identify those PDB entries that contain a ligand with a purine nucleobase (adenine or guanine as a substructure). Structures with modified nucleobases and entries that contain nucleic acids other than simple nucleotides or nucleosides were omitted. Finally, to reduce the redundancy of the set of protein-ligand complexes, a multiple sequence alignment was carried out using the ClustalW program. After that, all complexes with the same ligand were clustered according to protein sequence similarity, and the structure with the best resolution from each cluster was retained.
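The final redundancy-reduction step, keeping one structure per ligand and sequence cluster, can be sketched as below. The sketch assumes that cluster assignments have already been derived from the alignment, since the clustering criterion itself is not specified in this section. The `Entry` record and its field names are hypothetical and do not correspond to any PDB or ClustalW data format.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Entry:
    pdb_id: str        # identifier of the structure
    ligand: str        # ligand name, e.g. "ATP" or "GTP"
    cluster: int       # sequence cluster label (assumed precomputed)
    resolution: float  # in angstroms; lower means better resolution

def pick_representatives(entries):
    """Keep the best-resolution entry for each (ligand, cluster) pair."""
    best = {}
    for e in entries:
        key = (e.ligand, e.cluster)
        if key not in best or e.resolution < best[key].resolution:
            best[key] = e
    return sorted(best.values(), key=lambda e: (e.ligand, e.cluster))
```

Ties in resolution are resolved in favour of the entry seen first, which is an arbitrary choice made for this example.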