Research Article: Information Thermodynamics of Cytosine DNA Methylation

Date Published: March 10, 2016

Publisher: Public Library of Science

Author(s): Robersy Sanchez, Sally A. Mackenzie, Barbara Bardoni.


Cytosine DNA methylation (CDM) is a stable epigenetic modification to the genome and a widespread regulatory process in living organisms that involves multicomponent molecular machines. Genome-wide cytosine methylation patterning participates in the epigenetic reprogramming of a cell, suggesting that the biological information contained within methylation positions may be amenable to decoding. Adaptation to a new cellular or organismal environment also implies the potential for genome-wide redistribution of CDM changes that will ensure the stability of DNA molecules. This raises the question of whether the regulatory methylation signals can be separated from the CDM background (“noise”) induced by thermal fluctuations. Here, we propose a novel statistical and information thermodynamic description of CDM changes to address this question. The physical basis of our statistical mechanical model was evaluated in two respects: 1) adherence to Landauer’s principle, according to which molecular machines must dissipate a minimum energy ε = k_B T ln 2 at each logic operation, where k_B is the Boltzmann constant and T is the absolute temperature; and 2) whether the binary stretch of methylation marks on the DNA molecule comprises a language of sorts, properly constrained by thermodynamic principles. The study was performed on genome-wide methylation data from 152 ecotypes and 40 trans-generational variations of Arabidopsis thaliana and 93 human tissues. The DNA persistence length, a basic mechanical property altered by CDM, was estimated with values from 39 to 66.9 nm. Classical methylome analysis can be retrieved by applying information thermodynamic modelling, which is able to discriminate signal from noise.
Our finding suggests that the CDM signal comprises a language scheme properly constrained by molecular thermodynamic principles, which is part of an epigenomic communication system that obeys the same thermodynamic rules as do current human communication systems.

Partial Text

Plant and animal phenotypes respond to environmental changes, an adaptive capacity that is, at least in part, trans-generational. Both genetic and epigenetic factors shape the phenotypic range of this response. The genome-wide cytosine DNA methylation patterning that participates in the epigenetic response of cells to environmental variation is controlled by a complex network of genes. Cytosine DNA methylation (CDM) results from the addition of methyl groups to cytosine C5 residues, and the configuration of methylation within a genome provides trans-generational epigenetic information. The biochemical reaction is catalyzed by methyltransferases recruited into complex multicomponent molecular machines [1]. The reverse process, methyl group removal, is catalyzed by demethylases [2]. These epigenetic modifications can influence the transcriptional activity of the corresponding genes, or maintain genome integrity by repressing transposable elements and affecting long-term gene silencing mechanisms [1,3].

The absolute amount of information I_R processed by the methylation machinery in the genomic region R was estimated from Arabidopsis and human methylomes (Eq 3). Under Landauer’s principle, the minimum energy dissipated to process the information I_R can be expressed by the equation E_R = I_R k_B T ln 2 (Eq 4). Based on simple physical assumptions, the probability density function (PDF) for the energies E_R was approximated by a Generalized Gamma distribution (GG, Eq 7). This probability distribution accounts for an informational statistical thermodynamics description of methylation changes induced by thermal fluctuations, which are presumed to stabilize the DNA molecule. These methylation changes represent “methylation background noise” with respect to the signal created by the methylation regulatory machinery. However, since methylation changes alter the mechanical properties of the DNA molecule [6], any methylation signal created by the methylation regulatory machinery also implies a redistribution of CDM changes for DNA stability. Knowledge of the probability distribution followed by the methylation background noise therefore provides an analytical way to discriminate it from the biological signal [29–31].
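The relation in Eq 4 and the idea of using the noise distribution as a reference can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the default temperature, and the Stacy parameterization of the Generalized Gamma density (scale, d, p) are assumptions made here for illustration, since Eq 7 is not reproduced in this excerpt. The tail probability is obtained by simple numerical integration rather than any method described in the article.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant in J/K (exact SI value)


def landauer_energy(info_bits, temperature_k=298.15):
    """Minimum dissipated energy in joules, E_R = I_R * k_B * T * ln 2.

    temperature_k defaults to 25 degrees C; this default is an assumption.
    """
    return info_bits * K_B * temperature_k * math.log(2)


def gen_gamma_pdf(x, scale, d, p):
    """Generalized Gamma density in the Stacy form (assumed parameterization):
    f(x) = (p / scale**d) * x**(d - 1) * exp(-(x / scale)**p) / Gamma(d / p)
    """
    if x <= 0:
        return 0.0
    return (p / scale ** d) * x ** (d - 1) * math.exp(-((x / scale) ** p)) / math.gamma(d / p)


def tail_probability(x, scale, d, p, upper_factor=50.0, steps=20000):
    """Approximate P(E > x) under the noise model by trapezoidal integration
    of the density from x up to upper_factor * scale."""
    upper = max(x, scale * upper_factor)
    if upper == x:
        return 0.0
    h = (upper - x) / steps
    total = 0.5 * (gen_gamma_pdf(x, scale, d, p) + gen_gamma_pdf(upper, scale, d, p))
    for i in range(1, steps):
        total += gen_gamma_pdf(x + i * h, scale, d, p)
    return total * h


if __name__ == "__main__":
    # Energy for one bit at 300 K, in joules.
    print(landauer_energy(1, 300.0))
    # Hypothetical noise parameters, with energies expressed in units of k_B*T*ln2.
    print(tail_probability(3.0, scale=1.0, d=1.0, p=1.0))
```

A small tail probability for an observed E_R would indicate that the value is unlikely under the background-noise model, which is the sense in which the fitted distribution offers a reference for separating signal from noise.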

Results to date encompass the classical methylome analysis based on detection of differentially methylated positions (DMPs), and suggest that CDM underlies an epigenomic communication system in living tissues, shown here in Arabidopsis thaliana and human samples. These observations present an approach for epigenomic studies within a framework of communication systems. The information thermodynamic modeling proposed here unveils links between genome-wide methylation analysis, molecular thermodynamics and information theory. The application of an information thermodynamics approach permits not only the discrimination of biological signal from methylation background noise, but also the application of Bayesian signal detection theory (SDT) [29,36]. That is, application of Bayesian SDT together with the information thermodynamics process provides the formulae for robust detection of epigenetic biomarkers [29,38,39].
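The Bayesian side of this argument can be sketched generically. The Python example below applies Bayes' rule to decide how likely an observed energy is to belong to the signal class rather than the noise class. It is only an illustration under stated assumptions: the exponential densities, their means, the prior, and the function names are all hypothetical stand-ins and are not taken from the article or from references [29,36].

```python
import math


def exp_pdf(x, mean):
    """Exponential density with the given mean (hypothetical stand-in model)."""
    return math.exp(-x / mean) / mean if x >= 0 else 0.0


def posterior_signal(x, prior_signal, noise_pdf, signal_pdf):
    """Posterior probability that observation x is signal, by Bayes' rule:
    P(signal | x) = pi * f_s(x) / (pi * f_s(x) + (1 - pi) * f_n(x))
    """
    num = prior_signal * signal_pdf(x)
    den = num + (1.0 - prior_signal) * noise_pdf(x)
    return num / den if den > 0 else 0.0


if __name__ == "__main__":
    # Hypothetical noise and signal models, energies in arbitrary units.
    def noise(x):
        return exp_pdf(x, 1.0)

    def signal(x):
        return exp_pdf(x, 5.0)

    for energy in (0.5, 2.0, 5.0):
        print(energy, posterior_signal(energy, 0.1, noise, signal))
```

In this toy setting, low energies are attributed mostly to noise and high energies mostly to signal; in practice the noise density would be the fitted background model and the prior would have to be justified from data.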