Phylogenetic analysis of mRNA polyadenylation sites reveals a role of transposable elements in evolution of the 3'-end of genes.

Lee JY, Ji Z, Tian B - Nucleic Acids Res. (2008)

Bottom Line: We found that the 3'-most poly(A) sites tend to be more conserved than upstream ones, whereas poly(A) sites located upstream of the 3'-most exon, also termed intronic poly(A) sites, tend to be much less conserved. We also found that nonconserved poly(A) sites are associated with transposable elements (TEs) to a much greater extent than conserved ones, albeit less frequently utilized. Our results establish a conservation pattern for alternative poly(A) sites in several vertebrate species, and indicate that the 3'-end of genes can be dynamically modified by TEs through evolution.

Affiliation: Graduate School of Biomedical Sciences and Department of Biochemistry and Molecular Biology, New Jersey Medical School, University of Medicine and Dentistry of New Jersey, Newark, NJ 07103, USA.

ABSTRACT
mRNA polyadenylation is an essential step for the maturation of almost all eukaryotic mRNAs, and is tightly coupled with termination of transcription in defining the 3'-end of genes. Large numbers of human and mouse genes harbor alternative polyadenylation sites [poly(A) sites] that lead to mRNA variants containing different 3'-untranslated regions (UTRs) and/or encoding distinct protein sequences. Here, we examined the conservation and divergence of different types of alternative poly(A) sites across human, mouse, rat and chicken. We found that the 3'-most poly(A) sites tend to be more conserved than upstream ones, whereas poly(A) sites located upstream of the 3'-most exon, also termed intronic poly(A) sites, tend to be much less conserved. Genes with longer evolutionary history are more likely to have alternative polyadenylation, suggesting gain of poly(A) sites through evolution. We also found that nonconserved poly(A) sites are associated with transposable elements (TEs) to a much greater extent than conserved ones, albeit less frequently utilized. Different classes of TEs have different characteristics in their association with poly(A) sites via exaptation of TE sequences into polyadenylation elements. Our results establish a conservation pattern for alternative poly(A) sites in several vertebrate species, and indicate that the 3'-end of genes can be dynamically modified by TEs through evolution.

Figure 6: Poly(A) sites and Alu. (A) Distribution of poly(A) sites in AluSx subfamily of Alu. See the legend of Figure 4B for description of the graph. (B) Schematic of mechanisms by which different regions of Alu give rise to cis-elements for polyadenylation.

Mentions: Alu has the highest copy number in the human genome among all TE families, and ranks second among SINEs in association with poly(A) sites, after MIR. Alu sequences are derived from 7SL RNA elements, and are composed of two related monomers separated by a middle A-rich region. An Alu sequence has an RNA polymerase III promoter at the 5′-end, and a poly(A) sequence at the 3′-end that is required for retrotransposition (55). For the top subfamily, AluSx, four hot spots can be discerned (Figure 6A).

The first hot spot is the 5′-end region of AluSx, which tends to be located downstream of poly(A) sites. This region is rich in CG. Further examination of the poly(A) sites associated with it indicated that the region tends to give rise to TG elements via C-to-T transitions. Interestingly, CG dinucleotides in Alu were found to have a mutation rate about 10 times higher than that of other dinucleotides in the sequence (56,57). Thus, although the consensus sequence of the 5′-end region has no apparent cis-elements for polyadenylation, it has a propensity to mutate into poly(A) site downstream elements.

The second hot spot is located in the middle region of the plus strand. This region contains the middle A-rich sequence followed by a CG-rich sequence that is highly similar to the 5′-end region described above. Further examination indicated that the middle A-rich sequence tends to mutate to PAS and the CG-rich sequence tends to mutate to TG elements. Consistent with these findings, poly(A) sites associated with this region are completely encoded by Alu sequences.

The third and fourth hot spots correspond to the plus strand and minus strand of the 3′-end poly(A) tail sequence, respectively. Not surprisingly, this poly(A) tail sequence can give rise to upstream PAS hexamers when in the sense orientation, or to downstream T-rich elements when in the antisense orientation.

Thus, although the Alu consensus lacks cis-elements for polyadenylation, Alu sequences provide a favorable breeding ground for new poly(A) sites through four mutation-based mechanisms, as illustrated in Figure 6B. Their contribution to the 3′-end definition of human genes can be highly significant, given how widespread Alu is in the human genome.
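The mutational routes described above can be illustrated with a small Python sketch. It is not the authors' analysis pipeline. The input strings are toy sequences chosen for illustration, not the AluSx consensus, and the function names are hypothetical. The only biological inputs assumed beyond the text are the DNA forms of the canonical PAS hexamer (AATAAA) and its common variant (ATTAAA).

```python
# Toy illustration of how Alu regions could mutate into polyadenylation elements.
# All sequences below are made-up examples, not real Alu consensus sequence.

PAS_HEXAMERS = ("AATAAA", "ATTAAA")  # canonical PAS and its common variant (DNA form)


def reverse_complement(seq):
    """Return the reverse complement, i.e. the sequence read on the opposite strand."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(comp[base] for base in reversed(seq))


def cpg_transitions(seq):
    """Apply a C-to-T transition at every CG dinucleotide, turning CG into TG."""
    return seq.replace("CG", "TG")


def single_mutants_with_pas(seq):
    """List (position, new_base, hexamer) for each single substitution
    that creates a PAS hexamer not already present in seq."""
    hits = []
    for i, old in enumerate(seq):
        for new in "ACGT":
            if new == old:
                continue
            mutant = seq[:i] + new + seq[i + 1:]
            for hexamer in PAS_HEXAMERS:
                if hexamer in mutant and hexamer not in seq:
                    hits.append((i, new, hexamer))
    return hits


# CG-rich stretch (5'-end or middle region): CG to TG gives TG elements.
cg_rich = "GGCCGGGCGCGG"
tg_rich = cpg_transitions(cg_rich)

# A-rich stretch or sense-strand poly(A) tail: one A-to-T change yields AATAAA.
poly_a = "A" * 10
pas_hits = single_mutants_with_pas(poly_a)

# Antisense poly(A) tail: reads as a T-rich run on the opposite strand.
t_rich = reverse_complement("A" * 8)

print(tg_rich, pas_hits, t_rich)
```

In this toy case a single A-to-T substitution at any of several positions in an uninterrupted A run is enough to produce AATAAA, which is one way to see why A-rich Alu regions sit close to a functional PAS in sequence space.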

