Date Published: August 18, 2010
Publisher: Public Library of Science
Author(s): Regina Berretta, Pablo Moscato, William C. S. Cho. http://doi.org/10.1371/journal.pone.0012262
Abstract: It is a commonly accepted belief that cancer cells modify their transcriptional state during the progression of the disease. We propose that the progression of cancer cells towards malignant phenotypes can be efficiently tracked using high-throughput technologies that follow the gradual changes observed in the gene expression profiles by employing Shannon’s mathematical theory of communication. Methods based on Information Theory can then quantify the divergence of cancer cells’ transcriptional profiles from those of normally appearing cells of the originating tissues. The relevance of the proposed methods can be evaluated using microarray datasets available in the public domain but the method is in principle applicable to other high-throughput methods.
Using melanoma and prostate cancer datasets, we illustrate how Shannon Entropy and the Jensen-Shannon divergence can be employed to trace the transcriptional changes that accompany the progression of the disease. We establish how the variations of these two measures correlate with established biomarkers of cancer progression. The Information Theory measures allow us to identify novel biomarkers for both progressive and relatively more sudden transcriptional changes leading to malignant phenotypes. At the same time, the methodology was able to validate a large number of genes and processes that seem to be implicated in the progression of melanoma and prostate cancer.
We thus present a quantitative guiding rule, a new unifying hallmark of cancer: the cancer cell’s transcriptome changes lead to measurable observed transitions of Normalized Shannon Entropy values (as measured by high-throughput technologies). At the same time, tumor cells increment their divergence from the normal tissue profile, increasing their disorder via creation of states that we might not directly measure. This unifying hallmark allows us, via the Jensen-Shannon divergence, to identify the arrow of time of the processes from the gene expression profiles, and helps to map the phenotypical and molecular hallmarks of specific cancer subtypes. The deep mathematical basis of the approach allows us to suggest that this principle is, hopefully, of general applicability for other diseases.
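The two measures named in the abstract can be illustrated with a minimal sketch. It assumes that a sample's expression profile is a vector of non-negative values that is turned into a probability distribution by dividing by its sum, and that entropy is normalized by the logarithm of the number of probes; both choices, along with all function names and the toy profiles, are assumptions for illustration and are not taken from the paper's methods.

```python
import math


def to_distribution(expression):
    """Scale non-negative expression values so they sum to 1 (assumed normalization)."""
    total = sum(expression)
    if total <= 0 or any(x < 0 for x in expression):
        raise ValueError("expression values must be non-negative with a positive sum")
    return [x / total for x in expression]


def shannon_entropy(p):
    """Shannon entropy in bits; zero-probability terms contribute nothing."""
    return -sum(x * math.log2(x) for x in p if x > 0)


def normalized_entropy(p):
    """Entropy divided by its maximum, log2(n), so the value lies in [0, 1]."""
    return shannon_entropy(p) / math.log2(len(p))


def jensen_shannon_divergence(p, q):
    """JSD(p, q) = H((p + q) / 2) - (H(p) + H(q)) / 2, in bits."""
    if len(p) != len(q):
        raise ValueError("distributions must have the same length")
    mixture = [(a + b) / 2 for a, b in zip(p, q)]
    return shannon_entropy(mixture) - (shannon_entropy(p) + shannon_entropy(q)) / 2


# Hypothetical toy profiles over four probes (not real data).
normal_profile = to_distribution([10.0, 10.0, 10.0, 10.0])
tumor_profile = to_distribution([30.0, 5.0, 3.0, 2.0])

entropy_normal = normalized_entropy(normal_profile)
entropy_tumor = normalized_entropy(tumor_profile)
divergence = jensen_shannon_divergence(tumor_profile, normal_profile)
```

In this sketch a sample's divergence is computed against a single reference profile; in practice the reference would presumably be an average over normal-tissue samples, which is again an assumption here.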
Partial Text: In a seminal review paper published nine years ago, Hanahan and Weinberg introduced the “hallmarks of cancer”. They are six essential alterations of cell physiology that generally occur in cancer cells independently of the originating tissue type. They listed: “self-sufficiency in growth signals, insensitivity to growth-inhibitory signals, evasion of the normal programmed-cell mechanisms (apoptosis), limitless replicative potential, sustained angiogenesis, and finally, tissue invasion and metastasis”. More recently, several researchers have advocated including “stemness” as the seventh hallmark of cancer cells. This conclusion has been reached from the analysis of high-throughput gene expression datasets. The new role of stemness as a hallmark change of cancer cells is also supported by the observation that histologically poorly differentiated tumors show transcriptional profiles in which genes normally enriched in embryonic stem cells are overexpressed. For example, in breast cancer the activation targets of pluripotency markers such as NANOG, OCT4, SOX2 and c-MYC have been shown to be overexpressed in poorly differentiated tumors, in marked contrast with their expression in well-differentiated tumors.
We refer the reader to the original publications for details of methods for data collection, but we highlight here some aspects that are important to understand the data generation process for the purpose of our analysis.