Research Article: The Peculiarities of Large Intron Splicing in Animals

Date Published: November 16, 2009

Publisher: Public Library of Science

Author(s): Samuel Shepard, Mark McCreary, Alexei Fedorov, Alan Christoffels.

Abstract: In mammals, 92% of genes contain introns, and hundreds of these introns exceed 50,000 nucleotides in length. These “large introns” must be spliced out of the pre-mRNA in a timely fashion, which involves bringing together distant 5′ donor and 3′ acceptor splice sites. In invertebrates, especially Drosophila, it has been shown that larger introns can be spliced efficiently through a process known as recursive splicing: consecutive splicing from the 5′-end at a series of combined donor-acceptor splice sites called RP-sites. Using a computational analysis of the genomic sequences, we show that vertebrates lack the proper enrichment of RP-sites in their large introns and therefore require some other method to aid splicing. We analyzed over 15,000 non-redundant large introns from six mammals, 1,600 from chicken and zebrafish, and 560 from five invertebrates. Our bioinformatic investigation demonstrates that, unlike the studied invertebrates, the studied vertebrate genomes consistently contain abundant interspersed repetitive elements (mainly SINEs and LINEs) on both the direct and complementary strands, which may form stems with each other in large introns. This examination showed that predicted stems are indeed abundant and stable in the large introns of mammals. We hypothesize that such stems with long loops within large introns allow intron splice sites to find each other more quickly by folding the intronic RNA upon itself at smaller intervals, thus reducing the distance between donor and acceptor sites.

Partial Text: Introns are found ubiquitously in eukaryotic genomes, yet their role is still poorly understood and underappreciated. A range of recent studies have suggested that introns may have existed even in what some regard as primordial eukaryotes [1]–[4], or even earlier [5]–[6]. Different aspects of the evolution of introns have been well reviewed in [7]–[9].

The sequences of non-redundant, large introns (>50 kb) were obtained from the Exon-Intron Database [16]. Our datasets are available upon request. Supplementary Figure S1, Figure 3, and Tables 1–4 used these datasets. For the human intergenic region RP-site analysis we used the same data set as in [17], which contains over 3.5 million nucleotides. For the fruit fly intergenic region RP-site analysis we used the complete set of intergenic regions from FlyBase release 5.10 [18]. FlyBase release 5.10 was the same release used to build our Exon-Intron Database, from which we obtained the sample of large introns. See Supplementary Figure S3 for the intergenic region RP-site analysis.
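As a minimal sketch of what an RP-site strand comparison could look like, the Python snippet below counts motif matches on each sequence and on its reverse complement and reports the ratio. The regular expression is an illustrative assumption (an acceptor-like `YAG` fused to a donor-like `GTRAGT`), not the consensus used by the authors, and all function names are hypothetical.

```python
import re

# ASSUMPTION: illustrative RP-site pattern, not the paper's definition.
# The lookahead makes matches zero-width so overlapping sites are counted.
RP_PATTERN = re.compile(r"(?=[CT]AGGT[AG]AGT)")

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def count_rp_sites(seq: str) -> int:
    """Count pattern matches on the given strand only."""
    return sum(1 for _ in RP_PATTERN.finditer(seq.upper()))


def strand_counts(seqs):
    """Total matches on the given strands and on their reverse complements."""
    sense = sum(count_rp_sites(s) for s in seqs)
    antisense = sum(count_rp_sites(reverse_complement(s.upper())) for s in seqs)
    return sense, antisense


def enrichment(sense: int, antisense: int):
    """Sense/antisense ratio, or None when the antisense count is zero."""
    return sense / antisense if antisense else None


# Toy sequences, invented for this example.
site = "AAACAGGTAAGTAAA"
toy = [site, site, reverse_complement(site)]
sense, antisense = strand_counts(toy)
ratio = enrichment(sense, antisense)
```

Running the same counts on intergenic sequences would give a background against which an intronic ratio can be judged; that is the role the intergenic data sets play here.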

The timely removal of large introns from pre-mRNA poses a challenging problem for the spliceosomal machinery. It has been shown experimentally and computationally that Drosophila melanogaster uses a special strategy, named recursive splicing, for the excision of large introns. Recursive splicing occurs via selective accumulation of combined donor-acceptor splicing sites called RP-sites [12]–[13]. In our research, we used the complete set of Drosophila large introns to confirm the previous computation by Burnette et al. [13], showing again that RP-sites in fruit fly large introns are more than 20 times as abundant as on the complementary strands. Similarly, all other studied Insecta species (mosquito, honey bee, and beetle), as well as a more distant invertebrate (sea urchin), had an accumulation of RP-sites within their large introns that was several times more abundant than on the complementary strand. We also showed that the accumulation of RP-sites depends on intron size class in fruit fly but not in human (Supplementary Figure S1). On the other hand, none of the studied vertebrates, including six mammals, showed significant accumulation of RP-sites (see Supplementary Figure S2 for a visual representation of this phenomenon). Moreover, vertebrate species have overwhelmingly more large introns than the examined invertebrates. Therefore, vertebrates must mobilize another molecular mechanism for the removal of their large introns from pre-mRNA. We have hypothesized that multiple hairpins with large loops could form compact spatial structures within large introns that could help put the donor and acceptor splice sites in close proximity in order to facilitate splicing.
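The hairpin hypothesis can be illustrated with a rough sketch in Python. It assumes repeat annotations for a single intron are already available as (family, strand, start, end) records, and it treats any two non-overlapping repeats of the same family in opposite orientations as a candidate stem, with the sequence between them as the loop. The `Repeat` class, the `candidate_stems` function, and the toy coordinates are all hypothetical; this is not the authors' stem-prediction procedure and it says nothing about stem stability.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Repeat:
    family: str  # e.g. "Alu" (a SINE) or "L1" (a LINE)
    strand: str  # "+" or "-"
    start: int   # 0-based, inclusive, relative to the intron
    end: int     # exclusive


def candidate_stems(repeats, min_loop=0):
    """Pair same-family repeats that lie in opposite orientations.

    Returns (upstream, downstream, loop_length) tuples, where loop_length
    is the number of nucleotides between the two repeats.
    """
    ordered = sorted(repeats, key=lambda r: r.start)
    stems = []
    for a, b in combinations(ordered, 2):
        loop = b.start - a.end  # negative if the two repeats overlap
        if a.family == b.family and a.strand != b.strand and loop >= min_loop:
            stems.append((a, b, loop))
    return stems


# Toy annotations, invented for this example.
toy_repeats = [
    Repeat("Alu", "+", 1_000, 1_300),
    Repeat("Alu", "-", 21_000, 21_300),
    Repeat("L1", "+", 30_000, 36_000),
    Repeat("Alu", "+", 40_000, 40_300),
]
stems = candidate_stems(toy_repeats)
loops = [loop for _, _, loop in stems]
```

In this toy intron the inverted Alu element can pair with either of the two direct Alu elements, giving two candidate stems with loops of roughly 19 kb each, while the lone L1 element has no partner.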