Research Article: Producing High-Accuracy Lattice Models from Protein Atomic Coordinates Including Side Chains

Date Published: August 15, 2012

Publisher: Hindawi Publishing Corporation

Author(s): Martin Mann, Rhodri Saunders, Cameron Smith, Rolf Backofen, Charlotte M. Deane.


Lattice models are a common abstraction used in the study of protein structure, folding, and refinement. They are advantageous because the discretisation of space can make extensive protein evaluations computationally feasible. Various approaches to the protein chain lattice fitting problem have been suggested, but only a single backbone-only tool is currently available. We introduce LatFit, a new tool to produce high-accuracy lattice protein models. It generates both backbone-only and backbone-side-chain models in any user-defined lattice. LatFit implements a new distance RMSD-optimisation fitting procedure in addition to the known coordinate RMSD method. We tested LatFit’s accuracy and speed using a large nonredundant set of high-resolution protein structures (SCOP database) on three commonly used lattices: 3D cubic, face-centred cubic, and knight’s walk. Fitting speed compared favourably to other methods, and both backbone-only and backbone-side-chain models show low deviation from the original data (~1.5 Å RMSD in the FCC lattice). To our knowledge, this is the first comprehensive study of lattice quality for on-lattice protein models including side chains, and LatFit is the only available tool for such models.

Partial Text

It is not always computationally feasible to undertake protein structure studies using full atom representations. The challenge is to reduce complexity while maintaining detail [1–3]. Lattice protein models are often used to achieve this but in general only the protein backbone or the amino acid centre of mass is represented [4–12]. A huge variety of lattices and energy functions have previously been developed and applied [4, 13, 14].

To formulate the method precisely, we first introduce some preliminary definitions. A lattice L is a set of 3D coordinates x defined by a set of neighboring vectors υ ∈ N. The neighboring vectors are of equal length (∀υ, υ′ ∈ N : |υ| = |υ′|), each with a reverse within the neighborhood (∀υ ∈ N : −υ ∈ N), such that each coordinate in L can be expressed by a linear combination of the neighboring vectors with non-negative integer coefficients, that is, L = { x | x = ∑_{υ∈N} d_υ · υ ∧ d_υ ∈ ℤ₀⁺ }. |N| gives the coordination number of the lattice, for example, 6 for 3D-cubic or 12 for the FCC lattice.
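As an illustration of these definitions, the following minimal sketch builds the neighboring vector sets of the 3D-cubic and FCC lattices in their usual integer representation and checks the two conditions stated above (equal length, closure under reversal). The function names are hypothetical and chosen for this example only; they are not part of LatFit.

```python
from itertools import product


def cubic_neighbors():
    """Six unit vectors along the coordinate axes (3D-cubic lattice)."""
    vecs = []
    for axis in range(3):
        for sign in (1, -1):
            v = [0, 0, 0]
            v[axis] = sign
            vecs.append(tuple(v))
    return vecs


def fcc_neighbors():
    """Twelve vectors with exactly two entries of +-1 and one zero (FCC lattice)."""
    return [v for v in product((-1, 0, 1), repeat=3)
            if sum(abs(c) for c in v) == 2]


def is_valid_neighborhood(vecs):
    """Check that all vectors have equal length and that each has its reverse in the set."""
    vec_set = set(vecs)
    squared_lengths = {sum(c * c for c in v) for v in vecs}
    closed = all(tuple(-c for c in v) in vec_set for v in vecs)
    return len(squared_lengths) == 1 and closed


def lattice_point(coeffs, vecs):
    """Return the sum of d_v * v for non-negative integer coefficients d_v."""
    if len(coeffs) != len(vecs) or any(d < 0 for d in coeffs):
        raise ValueError("need one non-negative integer coefficient per vector")
    return tuple(sum(d * v[k] for d, v in zip(coeffs, vecs)) for k in range(3))
```

Here `len(cubic_neighbors())` and `len(fcc_neighbors())` give the coordination numbers 6 and 12. Because every vector has its reverse in N, non-negative coefficients suffice to reach points in any direction.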

In the following, we compare the average fitting quality of our new LatFit tool with results known from the literature [6, 8, 13]. Furthermore, we investigate the performance of the new dRMSD-based fitting procedure implemented in LatFit. To this end, we compare its results to those of the cRMSD-optimizing approach that follows [6, 18]; both are implemented within LatFit.
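The difference between the two measures can be made concrete with a short sketch. It assumes the commonly used definitions: cRMSD is the root mean square of the distances between corresponding positions of two superimposed structures, and dRMSD is the root mean square of the differences between corresponding intramolecular pairwise distances. The function names are hypothetical, and the cRMSD function below deliberately omits the superposition step that a real fitting procedure would need.

```python
import math
from itertools import combinations


def crmsd(a, b):
    """Coordinate RMSD of two equally long point lists.

    Assumption: a and b are already superimposed; no rotation or
    translation is optimised here.
    """
    n = len(a)
    return math.sqrt(sum(math.dist(p, q) ** 2 for p, q in zip(a, b)) / n)


def drmsd(a, b):
    """Distance RMSD: compares all pairwise distances within a to those within b."""
    pairs = list(combinations(range(len(a)), 2))
    total = sum((math.dist(a[i], a[j]) - math.dist(b[i], b[j])) ** 2
                for i, j in pairs)
    return math.sqrt(total / len(pairs))


# A three-point chain and a translated copy of it.
original = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0)]
shifted = [(x + 5.0, y + 5.0, z + 5.0) for x, y, z in original]

print(drmsd(original, shifted))  # internal distances are unchanged
print(crmsd(original, shifted))  # large, because no superposition was applied
```

Since dRMSD depends only on internal distances, it needs no superposition and is unaffected by translation and rotation; it also cannot distinguish a structure from its mirror image, whereas cRMSD is meaningful only after the two structures have been superimposed.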

LatFit enables the automated high-resolution fitting of both backbone and side chain lattice protein models from full atomic data in PDB format. We demonstrate its high accuracy on three widely used lattices using a large, nonredundant data set of high-resolution protein structures. Side chain fits show on average a higher deviation than backbone models, but both produce high-quality fits, with deviations generally below 1.5 Å on the face-centred cubic lattice. To our knowledge, this is the first study and publicly available implementation for side chain models in this field. Available via web interface and as a stand-alone tool, LatFit addresses the lack of available programs and is well placed to enable further, more detailed investigation of protein structure in a reduced complexity environment. Even now the LatFit webserver is in daily use worldwide (monitored via Google Analytics), which shows the need for efficient implementations such as LatFit.