The diversification of PHIS transposon superfamily in eukaryotes.

Han MJ, Xiong CL, Zhang HB, Zhang MQ, Zhang HH, Zhang Z - Mob DNA (2015)

Bottom Line: These three groups have similar DDE domain-containing transposases; however, their coding capacity, species distribution, and target site duplications (TSDs) are significantly different. Furthermore, three new types within the PHIS superfamily were identified. Our results not only expand the known transposon diversity but are also broadly relevant to improving genome sequence assembly and annotation in higher organisms.

Affiliation: School of Life Sciences, Chongqing University, Chongqing, 400044 China.

ABSTRACT

Background: The PHIS transposon superfamily belongs to the DNA transposons and includes the PIF/Harbinger, ISL2EU, and Spy transposon groups. These three groups have similar DDE domain-containing transposases; however, their coding capacity, species distribution, and target site duplications (TSDs) are significantly different.

Results: In this study, we systematically identified and analyzed PHIS transposons in 836 sequenced eukaryotic genomes using transposase homology searches and a structure-based approach. In total, 380 PHIS families were identified in 112 genomes, and 168 of the 380 families are reported here for the first time. In addition to the previously identified PIF/Harbinger, ISL2EU, and Spy groups, three new types of the PHIS superfamily (called Pangu, NuwaI, and NuwaII) were identified; each has its own distinctive characteristics, especially in its TSDs. Pangu and NuwaII transposons are characterized by 5'-ANT-3' and 5'-C/TNA/G-3' TSDs, respectively, and both are widely distributed in plants, fungi, and animals. NuwaI transposons are characterized by 5'-CWG-3' TSDs and are mainly distributed in animals.
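The TSD motifs above are written with IUPAC ambiguity codes (W = A or T, N = any base). As a minimal illustrative sketch, not the authors' pipeline, the following Python snippet lists which of the three new types a given 3-bp TSD is compatible with. Two points are assumptions: 5'-C/TNA/G-3' is read here as the IUPAC pattern YNR (C or T, any base, A or G), and the names TSD_PATTERNS and compatible_groups are hypothetical.

```python
import re

# IUPAC nucleotide codes used by the motifs in this abstract.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "W": "[AT]",    # weak: A or T
    "Y": "[CT]",    # pyrimidine: C or T
    "R": "[AG]",    # purine: A or G
    "N": "[ACGT]",  # any base
}

# TSD motifs reported for the three new types.
# Assumption: "C/TNA/G" is interpreted as Y-N-R.
TSD_PATTERNS = {
    "Pangu": "ANT",
    "NuwaI": "CWG",
    "NuwaII": "YNR",
}


def iupac_to_regex(pattern):
    """Compile an IUPAC motif into a regular expression."""
    return re.compile("".join(IUPAC[code] for code in pattern))


def compatible_groups(tsd):
    """Return every group whose TSD motif matches the given sequence."""
    tsd = tsd.upper()
    return [
        group
        for group, pattern in TSD_PATTERNS.items()
        if iupac_to_regex(pattern).fullmatch(tsd)
    ]


print(compatible_groups("AGT"))  # ['Pangu']
print(compatible_groups("CAG"))  # ['NuwaI', 'NuwaII']
```

Under this reading, CAG satisfies both the CWG and the YNR motif, so a TSD alone does not always single out one group. This is consistent with the classification described in the excerpt, which is based on TSDs together with coding capacity and transposase secondary structure.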

Conclusions: In total, 380 PHIS families were identified in eukaryotes, 168 of which are reported here for the first time. Furthermore, three new types within the PHIS superfamily were identified. Our results not only expand the known transposon diversity but are also broadly relevant to improving genome sequence assembly and annotation in higher organisms.


Fig. 2: Distribution, abundance, and potentially active families of PHIS transposons. (a) Taxonomic distribution of PHIS transposon groups across the eukaryotic tree of life. Differently colored boxes indicate the presence of the corresponding group, and flanking numbers give the number of species. (b) The potentially active families of each group in eukaryotes. (c) The number of copies of each group in eukaryotes.

Mentions: Based on the characteristics (TSDs, coding capacity, transposase secondary structure, etc.) of these 380 families, we found that 214 families belong to the PIF/Harbinger transposon group (Additional file 1: Table S1). Of these 214 families, 80 had previously been identified and cataloged in RepBase, and 134 were identified for the first time in this study. The 214 families share the following characteristics. (1) The TSD is the trinucleotide 5′-TWA-3′ (‘W’ represents A or T) (Fig. 1). (2) Most candidate autonomous elements contain two open reading frames (ORFs): one encodes the transposase, which contains the DDE and helix-turn-helix (HTH) motifs, and the other encodes a DNA-binding protein with a Myb/SANT domain. Potentially active families of the PIF/Harbinger group were defined as those in which both ORFs are intact; by this criterion, we identified 88 potentially active families in the eukaryotic genomes (Additional file 1: Table S1 and Fig. 2b). (3) TIR (terminal inverted repeat) lengths vary widely among PIF/Harbinger families (5–1042 bp), but most TIRs (~93 %) are shorter than 60 bp, and the first nucleotide of the TIR is usually A or G (Fig. 1). (4) The average length of the consensus sequences of candidate autonomous elements is ~4124 bp. (5) These families are distributed in 75 species, including plants, fungi, and animals. These characteristics of PIF/Harbinger transposons are consistent with previous reports [15, 16, 19].
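Two of the structural features listed above, the terminal inverted repeats and the 5′-TWA-3′ target site duplication, can be checked on a single candidate element with a few lines of Python. This is only a sketch under simplifying assumptions: it requires the TIRs to match perfectly (real TIRs often contain mismatches), it expects only A, C, G, and T in the sequence, and the function names are hypothetical rather than taken from the study.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence containing only A, C, G, T."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def perfect_tir_length(element):
    """Length of the longest perfectly matching terminal inverted repeat.

    An element has a TIR of length k when its first k bases equal the
    reverse complement of its last k bases. That is the same as the
    element and its own reverse complement sharing a prefix of length k.
    """
    element = element.upper()
    rc = reverse_complement(element)
    length = 0
    # Stop at the midpoint so the two arms cannot overlap.
    for left_base, rc_base in zip(element[: len(element) // 2], rc):
        if left_base != rc_base:
            break
        length += 1
    return length


def has_twa_tsd(left_flank, right_flank):
    """True if both 3-bp flanks are identical and match 5'-TWA-3'."""
    left, right = left_flank.upper(), right_flank.upper()
    return (
        len(left) == 3
        and left == right
        and left[0] == "T"
        and left[1] in "AT"
        and left[2] == "A"
    )


# Toy element: a 10-bp TIR on each side of a 5-bp internal region.
toy = "GGCCAGTCAC" + "TTTTT" + "GTGACTGGCC"
print(perfect_tir_length(toy))    # 10
print(has_twa_tsd("TTA", "TTA"))  # True
print(has_twa_tsd("TGA", "TGA"))  # False
```

A practical search would tolerate mismatches in the TIR alignment and would also test the coding-capacity criterion (two intact ORFs), which this sketch does not attempt.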

