Research Article: Investigation of somatic single nucleotide variations in human endogenous retrovirus elements and their potential association with cancer

Date Published: April 1, 2019

Publisher: Public Library of Science

Author(s): Ting-Chia Chang, Santosh Goud, John Torcivia-Rodriguez, Yu Hu, Qing Pan, Robel Kahsay, Jonas Blomberg, Raja Mazumder, Robert Belshaw.


Human endogenous retroviruses (HERVs) have been investigated for potential links with human cancer. However, the distribution of somatic nucleotide variations in HERV elements has not been explored in detail. This study aims to identify HERV elements with an over-representation of somatic mutations (hotspots) in cancer patients. Four HERV elements with mutation hotspots were identified that overlap with exons of four human protein coding genes. These hotspots were identified based on the significant over-representation (p<8.62e-4) of non-synonymous single-nucleotide variations (nsSNVs). These genes are TNN (HERV-9/LTR12), OR4K15 (HERV-IP10F/LTR10F), ZNF99 (HERV-W/HERV17/LTR17), and KIR2DL1 (MST/MaLR). In an effort to identify mutations that affect survival, all nsSNVs were further evaluated, and it was found that kidney cancer patients with mutation C2270G in ZNF99 have a significantly lower survival rate (hazard ratio = 2.6) compared to those without it. Among HERV elements in the human non-protein coding regions, we found 788 HERVs with significantly elevated numbers of somatic single-nucleotide variations (SNVs) (p<1.60e-5). In this category, the top three HERV elements with significantly over-represented SNVs are HERV-H/LTR7, HERV-9/LTR12 and HERV-L/MLT2. The majority of the SNVs in these 788 HERV elements are located in three DNA functional groups: long non-coding RNAs (lncRNAs) (60%), introns (22.2%) and transcription factor binding sites (TFBS) (14.8%). This study provides a list of mutational hotspots in HERVs, which could potentially be used as biomarkers and therapeutic targets.

Partial Text

Endogenous retroviruses (ERVs) have been embedded in primate genomes for over 30 million years [1, 2]. Typically, the genetic structure of an ERV contains the internal coding sequence of the four proviral genes (gag, pro, pol and env) flanked by two long terminal repeats (LTRs) [3]. Over time, the original genetic structure of most ERVs in the human genome has been severely damaged by the accumulation of mutations, insertions, deletions and translocations that have spliced out the original coding region of the proviral genes between the two flanking LTRs [4, 5]. Solitary LTRs are the most common ERVs within the human genome [4, 5].

In this study, we provide a large-scale and systematic analysis of somatic SNVs in different HERV classes and their subgroups. We identified four HERV elements with mutation hotspots that overlap with exons of four genes: TNN, KIR2DL1, ZNF99, and OR4K15. It is well known that LTRs have played a significant role in human gene evolution [76]. Yu et al. have suggested that the mutation hotspot located on the 3’-LTR of the HERV-W element may have a regulatory role and might be involved in the activation of neighboring genes and their abnormal expression [30]. A few studies support the involvement of TNN and KIR2DL1 in tumorigenesis. TNN is an extracellular matrix protein. It regulates TGF-beta to drive breast carcinogenesis and facilitates the migration of cancer cells into bone [77]. The KIR2DL1+-HLA-C2+ (human leukocyte antigen) genotype was found in oral cancer patients [78]. In Fig 2, multiple exons (exons 1–5) of KIR2DL1, which overlap with the MaLR/MST element, appear to be associated with oral cancer. A study indicated that the ZNF99 gene is involved in the transcription of HERV-W/HERV17/LTR17, which has been implicated in the pathogenesis of multiple sclerosis [79]. Interestingly, patients with multiple sclerosis have an increased risk of lung, liver, hematologic, and bladder cancer [80]. In Fig 2, the ZNF99 hotspot corresponding to HERV-W/HERV17/LTR17 is found in a few cancers, especially lung cancer. Importantly, patients with an alteration at amino acid position 757 (A to G substitution) in ZNF99 have a lower survival rate than patients without this variation, as shown in Fig 5. Overall, our data indicate that over-represented SNVs (hotspots) strengthen the relationship between these four possibly HERV-involved genes and multiple cancer types.

In this study, we explored the correlation between HERV elements in human protein coding and non-coding regions and multiple cancers based on SNV hotspots. Among HERV elements in human protein coding regions, we found four that had over-represented nsSNVs and also overlapped with exons of four genes. Additionally, these four HERV elements were associated with at least 14 cancer types, notably skin and lung cancer. Based on a survival analysis, we showed that kidney cancer patients with the specific amino acid mutation A757G in ZNF99 within the HERV-W/HERV17/LTR17 element had a lower survival rate. We believe that this key mutation could play an important role in kidney cancer.
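
The abstract names the ZNF99 variant at the nucleotide level (C2270G), while the text above names it at the amino acid level (position 757, A757G). The two are related by simple codon arithmetic, which the short Python sketch below illustrates. It assumes that 2270 is a 1-based position counted from the start of the coding sequence, and that "A to G" denotes alanine to glycine; neither assumption is stated explicitly in this excerpt. The function names are hypothetical and chosen only for illustration.

```python
def codon_index(cds_position: int) -> int:
    """Return the 1-based codon (amino acid) number for a 1-based CDS position.

    Assumption: the position is counted from the first base of the start codon.
    """
    if cds_position < 1:
        raise ValueError("CDS positions are 1-based")
    return (cds_position - 1) // 3 + 1


def position_in_codon(cds_position: int) -> int:
    """Return 1, 2 or 3: which base of its codon the CDS position falls on."""
    if cds_position < 1:
        raise ValueError("CDS positions are 1-based")
    return (cds_position - 1) % 3 + 1


# Nucleotide position taken from the abstract (C2270G).
print(codon_index(2270))        # 757
print(position_in_codon(2270))  # 2
```

Under these assumptions, nucleotide 2270 falls on the second base of codon 757. In the standard genetic code, alanine is encoded by GCN and glycine by GGN, so a C to G change at the second codon position is consistent with an alanine to glycine substitution at residue 757.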



