Research Article: Mathematical Philology: Entropy Information in Refining Classical Texts’ Reconstruction, and Early Philologists’ Anticipation of Information Theory

Date Published: January 13, 2010

Publisher: Public Library of Science

Author(s): John L. Cisne, Robert M. Ziomkowski, Steven J. Schwager, Enrico Scalas.

Abstract: Philologists reconstructing ancient texts from variously miscopied manuscripts anticipated information theorists by centuries in conceptualizing information in terms of probability. An example is the editorial principle difficilior lectio potior (DLP): in choosing between otherwise acceptable alternative wordings in different manuscripts, “the more difficult reading [is] preferable.” As philologists at least as early as Erasmus observed (and as information theory’s version of the second law of thermodynamics would predict), scribal errors tend to replace less frequent and hence entropically more information-rich wordings with more frequent ones. Without measurements, it has been unclear how effectively DLP has been used in the reconstruction of texts, and how effectively it could be used. We analyze a case history of acknowledged editorial excellence that mimics an experiment: the reconstruction of Lucretius’s De Rerum Natura, beginning with Lachmann’s landmark 1850 edition based on the two oldest manuscripts then known. Treating words as characters in a code, and taking the occurrence frequencies of words from a current, more broadly based edition, we calculate the difference in entropy information between Lachmann’s 756 pairs of grammatically acceptable alternatives. His choices average 0.26±0.20 bits higher in entropy information (95% confidence interval, P = 0.005), as against the single bit that determines the outcome of a coin toss, and the average 2.16±0.10 bits (95%) of (predominantly meaningless) entropy information if the rarer word had always been chosen. As a channel width, 0.26±0.20 bits/word corresponds to a 0.79 (+0.09/−0.15) likelihood of the rarer word being the one accepted in the reference edition, which is consistent with the observed 547/756 = 0.72±0.03 (95%).
Statistically informed application of DLP can recover substantial amounts of semantically meaningful entropy information from noise; hence the extension copiosior informatione lectio potior, “the reading richer in information [is] preferable.” New applications of information theory promise continued refinement in the reconstruction of culturally fundamental texts.
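
The arithmetic linking the channel width to the likelihood can be checked with a short sketch. It assumes, as one reading that reproduces the quoted figures rather than a confirmed description of the authors' procedure, that each editorial choice is treated as a binary symmetric channel whose capacity is 1 − H(p) bits, where H is the binary entropy and p is the likelihood of the rarer word being the accepted one. The function names are illustrative, and the interval for 547/756 uses a plain normal approximation, which is also an assumption.

```python
import math


def binary_entropy(p: float) -> float:
    """Shannon entropy, in bits, of a two-outcome event with probability p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)


def channel_width(p: float) -> float:
    """Capacity (bits/word) of a binary symmetric channel correct with probability p.

    Assumption: this is the sense of "channel width" in the abstract.
    """
    return 1.0 - binary_entropy(p)


def likelihood_from_width(width: float, tol: float = 1e-9) -> float:
    """Invert channel_width on [0.5, 1.0] by bisection (it is increasing there)."""
    lo, hi = 0.5, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if channel_width(mid) < width:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def proportion_half_width(successes: int, trials: int, z: float = 1.96) -> float:
    """Half-width of a normal-approximation confidence interval for a proportion."""
    p = successes / trials
    return z * math.sqrt(p * (1.0 - p) / trials)


if __name__ == "__main__":
    # Lower bound, point estimate, and upper bound of 0.26 +/- 0.20 bits/word.
    for width in (0.06, 0.26, 0.46):
        p = likelihood_from_width(width)
        print(f"{width:.2f} bits/word -> likelihood {p:.2f}")
    observed = 547 / 756
    half = proportion_half_width(547, 756)
    print(f"observed 547/756 = {observed:.2f} +/- {half:.2f}")
```

Under these assumptions the three widths map to likelihoods close to the abstract's 0.64, 0.79, and 0.88, and the observed proportion falls inside that range.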

Partial Text: How accurately have culturally fundamental texts been transmitted to the present by way of variously miscopied manuscripts? If the accuracy can be measured, can it be improved, and if so, how? Philology traditionally has been concerned almost entirely with information of the semantic kind, that is, with meaning. Here we are concerned instead with what has been called entropy information, information entropy, and Shannon entropy (and sometimes negentropy in recognition that a higher information content corresponds to a higher degree of disorder). In the first study of its kind, we measure the accuracy of transmission in bits/word of meaningful entropy information. The case in point is one of acknowledged editorial excellence and cultural importance: the reconstruction of Lucretius’s De Rerum Natura, beginning with Lachmann’s 1850 edition [1], a defining example of modern textual criticism [2]–[4].
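
A measurement in bits/word of this kind can be sketched as follows. The sketch assumes that a word's entropy information is its surprisal, −log2 of its relative frequency in a reference text, and that each editorial decision is a (chosen, rejected) pair; the tokens and counts below are placeholders invented for illustration, not data from Lachmann's edition or the reference edition.

```python
import math
from collections import Counter


def surprisal_bits(word: str, counts: Counter, total: int) -> float:
    """Information content of a word in bits: -log2(relative frequency)."""
    return -math.log2(counts[word] / total)


def mean_information_gain(pairs, counts: Counter) -> float:
    """Average surprisal of the chosen word minus that of the rejected word."""
    total = sum(counts.values())
    diffs = [
        surprisal_bits(chosen, counts, total) - surprisal_bits(rejected, counts, total)
        for chosen, rejected in pairs
    ]
    return sum(diffs) / len(diffs)


def fraction_rarer_chosen(pairs, counts: Counter) -> float:
    """Share of pairs in which the editor's choice is the less frequent word."""
    return sum(counts[c] < counts[r] for c, r in pairs) / len(pairs)


if __name__ == "__main__":
    # Hypothetical word counts from a reference edition (placeholder tokens).
    counts = Counter({"w_rare": 1, "w_mid": 4, "w_common": 8, "w_other": 3})
    # Hypothetical (chosen, rejected) editorial decisions.
    pairs = [("w_rare", "w_mid"), ("w_common", "w_mid")]
    print(f"mean gain: {mean_information_gain(pairs, counts):.2f} bits/word")
    print(f"rarer word chosen in {fraction_rarer_chosen(pairs, counts):.0%} of pairs")
```

A positive mean gain indicates that the editor's choices lean toward the rarer, more information-rich readings, which is what DLP would recommend.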

Our results suggest that the difficilior lectio potior principle (DLP) can indeed be useful as an editorial rule of thumb. This is consistent with the notion that the early philologists who framed DLP had prescient understanding of information as a probabilistic phenomenon.


