Genome wide survey, discovery and evolution of repetitive elements in three Entamoeba species.

Lorenzi H, Thiagarajan M, Haas B, Wortman J, Hall N, Caler E - BMC Genomics (2008)

Bottom Line: Additionally, we found a putative transposase-coding gene in E. histolytica and E. dispar related to the mariner transposon Hydargos from E. invadens. The mapping of all transposable elements found in these parasites shows that repeat coverage is up to three times higher than previously reported. Our work shows that transposable elements are organized in clusters, frequently found at syntenic break points providing insights into their contribution to chromosome instability and therefore, to genomic variation and speciation in these parasites.

Affiliation: J. Craig Venter Institute, 9704 Medical Center Drive, Rockville, MD 20850, USA. hlorenzi@jcvi.org

ABSTRACT

Background: Identification and mapping of repetitive elements are key steps for accurate gene prediction and overall structural annotation of genomes. During the assembly and annotation of three highly repetitive amoeba genomes, Entamoeba histolytica, Entamoeba dispar, and Entamoeba invadens, we performed comparative sequence analysis to identify and map all class I and class II transposable elements in their sequences.

Results: Here, we report the identification of two novel Entamoeba-specific repeats, ERE1 and ERE2. ERE1 is spread across the three genomes and associated with different repeats in a species-specific manner, while ERE2 is unique to E. histolytica. We also report the identification of two novel subfamilies of LINE and SINE retrotransposons in E. dispar and provide evidence for how the different LINE and SINE subfamilies evolved in these species. Additionally, we found a putative transposase-coding gene in E. histolytica and E. dispar related to the mariner transposon Hydargos from E. invadens. The distribution of transposable elements in these genomes is markedly skewed, with a tendency to form clusters. More than 70% of each of the three genomes has a repeat density below the corresponding average value, indicating that transposable elements are not evenly distributed. We show that repeats and repeat clusters are found at syntenic break points between E. histolytica and E. dispar and hence could work as recombination hot spots promoting genome rearrangements.
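The skewed distribution described above can be illustrated with a minimal sketch. It assumes that repeat annotations are given as non-overlapping, half-open (start, end) intervals and that density is measured in fixed-size windows; the function names, the window size, and the toy coordinates are hypothetical and are not taken from the paper's methods.

```python
def window_repeat_density(genome_length, repeats, window):
    """Return, for each fixed-size window, the fraction of bases covered by repeats.

    repeats: list of non-overlapping, half-open (start, end) intervals (assumed input format).
    """
    n_windows = (genome_length + window - 1) // window
    covered = [0] * n_windows
    for start, end in repeats:
        end = min(end, genome_length)  # clamp intervals that run past the sequence end
        pos = start
        while pos < end:
            idx = pos // window
            win_end = min((idx + 1) * window, genome_length)
            step = min(end, win_end) - pos  # bases of this repeat inside the current window
            covered[idx] += step
            pos += step
    densities = []
    for i, bases in enumerate(covered):
        size = min(window, genome_length - i * window)  # the last window may be shorter
        densities.append(bases / size)
    return densities


def fraction_below_mean(densities):
    """Fraction of windows whose repeat density is below the mean window density."""
    mean = sum(densities) / len(densities)
    return sum(1 for d in densities if d < mean) / len(densities)


# Toy data (hypothetical): one dense cluster near the start and one small isolated repeat.
toy_repeats = [(0, 80), (90, 100), (450, 455)]
densities = window_repeat_density(1000, toy_repeats, 100)
skew = fraction_below_mean(densities)
```

In this toy case a single cluster pulls the mean density up, so nine of the ten windows fall below it. A large below-average fraction of this kind is what an uneven, clustered distribution looks like under this measure.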

Conclusion: The mapping of all transposable elements found in these parasites shows that repeat coverage is up to three times higher than previously reported. LINE, ERE1 and mariner elements were present in the common ancestor of the three Entamoeba species, while ERE2 was likely acquired by E. histolytica after its separation from E. dispar. We demonstrate that E. histolytica and E. dispar share their entire repertoire of LINE and SINE retrotransposons and that Eh_SINE3/Ed_SINE1 originated as a chimeric SINE from Eh/Ed_SINE2 and Eh_SINE1/Ed_SINE3. Our work shows that transposable elements are organized in clusters that are frequently found at syntenic break points, providing insights into their contribution to chromosome instability and, therefore, to genomic variation and speciation in these parasites.

Figure 3: Identification of mariner-related elements in E. histolytica and E. dispar. A) Phylogenetic position of Eh_mariner, Ed_mariner and Ei_Hydargos (highlighted in blue) in the IS630/Tc1/mariner superfamily. Mariner subfamilies and related transposons (Tc1, ItmD37E, and plant mariner-like elements) are shown. Elements are identified by host name and GeneInfo Identifier (gi). Branches supported by fewer than 500 bootstrap replicates are depicted as thin black lines; branches with bootstrap values between 500 and 750 are shown as bold grey lines; branches with values above 750 are represented as bold black lines. B) ClustalX alignment of the transposase domain found in Eh_mariner and Ed_mariner together with three closely related transposases. Amino acids conserved in at least 3 sequences are colored in black. Asterisks denote the three conserved glutamic acid residues typical of this type of transposase. Parentheses indicate the number of residues between conserved blocks.

Mentions: Pritham et al. [3] reported the existence of five different families of transposons belonging to the Tc1/mariner superfamily in the genomes of E. invadens and E. moshkovskii (Table 1), suggesting that these TEs were already present in the common ancestor of the three Entamoeba species in this study. However, no such transposons have been identified in E. histolytica and E. dispar, raising the question of whether E. invadens and E. moshkovskii acquired these mariner transposons by horizontal transfer or vertically from the common ancestor. To address this issue, we used Transposon-PSI, an analysis tool developed in-house to identify sequences homologous to large and diverse families of transposable elements (see Methods), to look for mariner-related sequences in the genomes of E. histolytica and E. dispar. The program identified two genomic regions, one from each organism, that gave a highly significant hit (e-value < 1 × 10⁻¹⁴) against a mariner transposase from Drosophila melanogaster (gi1006789 in Fig. 3A). Each region contained an ORF, coding for a 335 aa protein in E. histolytica (Eh_mariner) and a 336 aa protein in E. dispar (Ed_mariner). These putative proteins shared 95% identity throughout their entire sequence, suggesting that they could correspond to the same locus in both genomes. Further comparative analyses of a 20 kb genomic region encompassing these ORFs confirmed that Eh_mariner and Ed_mariner were syntenic (data not shown). Unfortunately, it was not possible to determine the precise boundaries of the elements because the mariner TIRs are short (less than 50 bp) [3].
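As a rough illustration of how a percent-identity figure like the 95% quoted above can be computed, here is a minimal sketch that scores two already-aligned protein sequences. It assumes gaps are written as "-" and that identity is the number of matching columns divided by all columns where at least one sequence has a residue. Other denominators are in common use, and the text does not say which definition was applied. The function name and the toy sequences are hypothetical; they are not the Eh_mariner or Ed_mariner sequences.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity between two aligned sequences of equal length.

    Columns where both sequences have a gap are ignored; a gap opposite
    a residue counts as a mismatch (one of several possible conventions).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    matches = 0
    compared = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" and b == "-":
            continue  # skip columns that are gaps in both sequences
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable alignment columns")
    return 100.0 * matches / compared


# Toy alignment (hypothetical sequences): 6 identical columns out of 8.
identity = percent_identity("MKT-LLVA", "MKTALLIA")
```

Because the two ORFs differ in length by one residue (335 aa versus 336 aa), any alignment between them contains at least one gap column, so the choice of denominator slightly affects the reported value.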

