Genome wide survey, discovery and evolution of repetitive elements in three Entamoeba species.

Lorenzi H, Thiagarajan M, Haas B, Wortman J, Hall N, Caler E - BMC Genomics (2008)

Bottom Line: Additionally, we found a putative transposase-coding gene in E. histolytica and E. dispar related to the mariner transposon Hydargos from E. invadens. The mapping of all transposable elements found in these parasites shows that repeat coverage is up to three times higher than previously reported. Our work shows that transposable elements are organized in clusters, frequently found at syntenic break points, providing insights into their contribution to chromosome instability and, therefore, to genomic variation and speciation in these parasites.


Affiliation: J. Craig Venter Institute, 9704 Medical Center Drive, Rockville, MD 20850, USA. hlorenzi@jcvi.org

ABSTRACT

Background: Identification and mapping of repetitive elements is a key step for accurate gene prediction and overall structural annotation of genomes. During the assembly and annotation of three highly repetitive amoeba genomes, Entamoeba histolytica, Entamoeba dispar, and Entamoeba invadens, we performed comparative sequence analysis to identify and map all class I and class II transposable elements in their sequences.

Results: Here, we report the identification of two novel Entamoeba-specific repeats, ERE1 and ERE2. ERE1 is spread across the three genomes and associated with different repeats in a species-specific manner, while ERE2 is unique to E. histolytica. We also report the identification of two novel subfamilies of LINE and SINE retrotransposons in E. dispar and provide evidence for how the different LINE and SINE subfamilies evolved in these species. Additionally, we found a putative transposase-coding gene in E. histolytica and E. dispar related to the mariner transposon Hydargos from E. invadens. The distribution of transposable elements in these genomes is markedly skewed, with a tendency to form clusters. More than 70% of each of the three genomes has a repeat density below the corresponding average value, indicating that transposable elements are not evenly distributed. We show that repeats and repeat clusters are found at syntenic break points between E. histolytica and E. dispar and hence could act as recombination hot spots promoting genome rearrangements.

Conclusion: The mapping of all transposable elements found in these parasites shows that repeat coverage is up to three times higher than previously reported. LINE, ERE1 and mariner elements were present in the common ancestor of the three Entamoeba species, while ERE2 was likely acquired by E. histolytica after its separation from E. dispar. We demonstrate that E. histolytica and E. dispar share their entire repertoire of LINE and SINE retrotransposons and that Eh_SINE3/Ed_SINE1 originated as a chimeric SINE from Eh/Ed_SINE2 and Eh_SINE1/Ed_SINE3. Our work shows that transposable elements are organized in clusters, frequently found at syntenic break points, providing insights into their contribution to chromosome instability and, therefore, to genomic variation and speciation in these parasites.
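
The abstract's observation that most of each genome lies below the average repeat density is what one would expect when repeats cluster. The following is a minimal Python sketch of that reasoning, not the authors' method: it splits a sequence into fixed-size windows, computes the fraction of each window covered by repeats, and reports the share of windows below the mean. The function names, the window-based definition of density, and the toy coordinates are all assumptions made for illustration.

```python
from statistics import mean


def window_densities(repeat_intervals, genome_length, window):
    """Fraction of each fixed-size window covered by repeats.

    Assumptions (hypothetical, for illustration only): intervals are
    0-based, half-open (start, end) and do not overlap each other.
    """
    n_windows = (genome_length + window - 1) // window
    covered = [0] * n_windows
    for start, end in repeat_intervals:
        pos = start
        while pos < end:
            w = pos // window
            stop = min(end, (w + 1) * window)  # clip at the window boundary
            covered[w] += stop - pos
            pos = stop
    # The last window may be shorter than the others.
    sizes = [min(window, genome_length - i * window) for i in range(n_windows)]
    return [c / s for c, s in zip(covered, sizes)]


def fraction_below_mean(densities):
    """Share of windows whose repeat density is below the mean density."""
    avg = mean(densities)
    return sum(d < avg for d in densities) / len(densities)


if __name__ == "__main__":
    # Toy data: two adjacent repeat-rich windows and one small isolated repeat.
    toy_repeats = [(0, 900), (1000, 1800), (5200, 5300)]
    densities = window_densities(toy_repeats, genome_length=10_000, window=1_000)
    print(fraction_below_mean(densities))
```

In this toy layout, two of ten windows hold almost all of the repeat sequence, so the remaining eight fall below the mean. The same skew would not appear if the repeats were spread evenly across windows.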



Figure 4: Characterization and phylogenetic analysis of LINE elements in Entamoeba sp. A) Multiple sequence alignment of the 5' and 3' ends of Ei_LINE and insertion sites. The 5' and 3' termini are highlighted in bold. Target site duplications (TSD) are underlined. Ei_LINEs from contigs AANW02000355, AANW02001294, AANW02001046 and AANW02001402 are truncated and lack either the 5' or 3' end of the element. Genomic coordinates for Ei_LINEs excluding ISD are: AANW02000718 (41,801–46,844), AANW02000022 (38,839–43,869), AANW02000355 (5,491–6,819), AANW02001294 (486–1,665), AANW02001046 (2,758–1,949) and AANW02001402 (3,418–2,447). GenBank accessions of E. invadens contigs are indicated on the left. B) Phylogenetic analysis of the reverse transcriptase sequences from all identified Entamoeba LINEs compared to reverse transcriptases derived from different families of retroelements and retroviruses. Thin black lines, branches with bootstrap values below 500; bold grey lines, branches containing bootstrap values between 500 and 750; bold black lines, branches supported by bootstrap values above 750. Nodes containing Entamoeba LINEs are highlighted in blue.

Mentions: Characterization of the single LINE element previously identified in E. invadens [3] indicated that Ei_LINE is a 5,043 bp sequence flanked by TSDs (Fig. 4A). Only two complete copies of Ei_LINE were found in E. invadens and neither of them had a complete ORF coding for a reverse transcriptase protein.
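
The coordinates listed in the Figure 4 caption can be turned into element lengths with a few lines of Python, which makes it easy to see which copies are near full length and which are truncated. This is a rough sketch under two assumptions that the caption does not state: positions are 1-based and inclusive, and a start greater than the end indicates an element on the reverse strand. The names `EI_LINE_COORDS`, `span_length`, `strand` and `summarize` are hypothetical.

```python
# Coordinates copied from the Figure 4 caption: contig -> (start, end).
# Assumption: 1-based inclusive positions; start > end is read as reverse strand.
EI_LINE_COORDS = {
    "AANW02000718": (41801, 46844),
    "AANW02000022": (38839, 43869),
    "AANW02000355": (5491, 6819),
    "AANW02001294": (486, 1665),
    "AANW02001046": (2758, 1949),
    "AANW02001402": (3418, 2447),
}


def span_length(start, end):
    """Length in bp of an inclusive interval, regardless of orientation."""
    return abs(end - start) + 1


def strand(start, end):
    """'+' when coordinates ascend, '-' when they descend."""
    return "+" if start <= end else "-"


def summarize(coords):
    """Map each contig to (length in bp, inferred strand)."""
    return {
        contig: (span_length(s, e), strand(s, e))
        for contig, (s, e) in coords.items()
    }


if __name__ == "__main__":
    for contig, (length, orientation) in summarize(EI_LINE_COORDS).items():
        print(f"{contig}\t{length} bp\t{orientation}")
```

Under these assumptions the first two contigs give spans of 5,044 bp and 5,031 bp, close to the 5,043 bp reported for Ei_LINE, while the other four give spans between 810 bp and 1,329 bp, consistent with the caption's description of them as truncated.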

