Research Article: Data mining of iron(II) and iron(III) bond-valence parameters, and their relevance for macromolecular crystallography

Date Published: April 01, 2017

Publisher: International Union of Crystallography

Author(s): Heping Zheng, Karol M. Langner, Gregory P. Shields, Jing Hou, Marcin Kowiel, Frank H. Allen, Garib Murshudov, Wladek Minor.


Using all available metal-containing organic compound structures in the Cambridge Structural Database, a novel data-driven method to derive bond-valence R0 parameters was developed. While confirming almost all reference literature values, two distinct populations of Fe(II)–N and Fe(III)–N bonds are observed, which are interpreted as low-spin and high-spin states of the coordinating iron. Based on the R0 parameters derived here, guidelines for the modeling of iron–ligand distances in macromolecular structures are suggested.

Partial Text

The bond-valence model relates the oxidation number of an atom to its immediate surroundings, and as such has been indispensable in a multitude of structural applications (Brown, 2009), including the analysis of metal-binding sites in proteins (Müller et al., 2003). During the investigation of metal ion-binding architectures in proteins (Zheng et al., 2008, 2014), we successfully employed the bond-valence model to check the quality of metal-binding site modeling in low-resolution structures. Initially, we used reference literature values for bond-valence (R0) parameters, which were derived two decades ago from manually curated structures and extrapolated linear relationships between bond-valence contributions (Brese & O’Keeffe, 1991; Brown & Altermatt, 1985). However, in cases involving iron and nitrogen, such as structures containing heme, we consistently obtained bond-valence sums that were significantly different from known oxidation states, prompting us to attempt a re-evaluation of bond-valence R0 parameters for iron-binding sites.
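The bond-valence check mentioned above can be illustrated with a short sketch. It is not the authors' code: it assumes the commonly used exponential form s = exp((R0 - R)/b) with the conventional constant b = 0.37 Å, neither of which is stated in this text, and the function name, the R0 value and the distances in the usage line are hypothetical.

```python
import math

B = 0.37  # Å; conventional bond-valence constant (an assumption, not given in the text)

def bond_valence_sum(distances, r0, b=B):
    """Sum exp((r0 - R) / b) over the metal-ligand distances R (in Å) of one site."""
    return sum(math.exp((r0 - r) / b) for r in distances)

# Hypothetical site: a bond whose length equals r0 contributes a valence of exactly 1,
# so six such bonds give a bond-valence sum of 6.
print(bond_valence_sum([1.80] * 6, r0=1.80))
```

A modeled site would be flagged when this sum differs markedly from the expected oxidation state of the metal.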

By minimizing the squared deviations of bond-valence sums around the expected oxidation state, we derived optimal R0 bond-valence parameters from a large set of iron–organic binding sites from the CSD. Several thousand homoleptic and heteroleptic iron-binding sites were treated together during each optimization. We were able to discern two populations of iron(II)-binding sites, corresponding to iron–nitrogen R0 parameters of 1.57 Å for low-spin iron and 1.76 Å for high-spin iron. Iron(III) sites revealed a similar bimodal distribution, corresponding to iron–nitrogen R0 parameters of 1.70 and 1.83 Å for low-spin and high-spin iron, respectively. To validate our novel approach, we examined its applicability to five other biologically relevant metal ions (Na, Mg, K, Ca and Zn). All of the resulting metal–ligand bond-valence parameters and distances agree with the R0 values reported previously within the estimated uncertainty |ΔR0|2 of 0.1 Å. We recommend the use of spin-state-dependent R0 values for evaluating future structures with metal–organic sites containing iron–nitrogen bonds, particularly in heme-containing proteins. Our data-driven optimization procedure is fully reproducible and therefore provides a starting point for improving the methodology and further refining bond-valence model parameters. The data and code for the numerical optimization of bond-valence parameters have been made available on GitHub and on figshare.
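The least-squares idea described above can be sketched for the simplest case, in which every site contains a single bond type (for example, only Fe–N bonds). This is a simplified illustration and not the published procedure: it assumes the exponential bond-valence form with b = 0.37 Å, the function name `fit_r0` is hypothetical, and the distances are made up. With one bond type the minimum has a closed form, so no numerical optimizer is needed.

```python
import math

B = 0.37  # Å; conventional bond-valence constant (assumed, not given in the text)

def fit_r0(sites, oxidation_state, b=B):
    """Least-squares R0 for sites that all contain one bond type.

    sites: one list of metal-ligand distances (Å) per binding site.
    Minimizes sum_j (k * A_j - V)^2, where k = exp(R0 / b) and
    A_j = sum_i exp(-R_ij / b); the minimum is k = V * sum(A_j) / sum(A_j^2).
    """
    a = [sum(math.exp(-r / b) for r in site) for site in sites]
    k = oxidation_state * sum(a) / sum(x * x for x in a)
    return b * math.log(k)

# Made-up iron(II) sites with six Fe-N bonds each (illustrative numbers only).
sites = [[2.00] * 6, [1.98] * 6, [2.02] * 6]
print(round(fit_r0(sites, oxidation_state=2), 2))
```

Real data mix several ligand types per site, so optimizing several R0 values jointly, as described above, would require a numerical minimizer rather than this closed form.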



