The diversification of PHIS transposon superfamily in eukaryotes.

Han MJ, Xiong CL, Zhang HB, Zhang MQ, Zhang HH, Zhang Z - Mob DNA (2015)

Bottom Line: These three groups have similar DDE domain-containing transposases; however, their coding capacity, species distribution, and target site duplications (TSDs) are significantly different. Furthermore, three new types of PHIS superfamily were identified. Our results not only enrich the transposon diversity but also have extensive significance for improving genome sequence assembly and annotation of higher organisms.


Affiliation: School of Life Sciences, Chongqing University, Chongqing, 400044 China.

ABSTRACT

Background: The PHIS transposon superfamily belongs to the DNA transposons and includes the PIF/Harbinger, ISL2EU, and Spy transposon groups. These three groups have similar DDE domain-containing transposases; however, their coding capacity, species distribution, and target site duplications (TSDs) are significantly different.

Results: In this study, we systematically identified and analyzed PHIS transposons in 836 sequenced eukaryotic genomes using transposase homology searches and a structure-based approach. In total, 380 PHIS families were identified in 112 genomes, and 168 of these 380 families are reported here for the first time. Besides the previously identified PIF/Harbinger, ISL2EU, and Spy groups, three new types of PHIS superfamily (called Pangu, NuwaI, and NuwaII) were identified; each has its own distinctive characteristics, especially in its TSDs. Pangu and NuwaII transposons are characterized by 5'-ANT-3' and 5'-C/TNA/G-3' TSDs, respectively, and both are widely distributed in plants, fungi, and animals. The NuwaI transposons are characterized by 5'-CWG-3' TSDs and are mainly distributed in animals.
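The three TSD consensus motifs quoted above can be made concrete with a small sketch that expands IUPAC codes into regular expressions and reports which motifs a given 3-bp TSD fits. This is an illustration only, not the authors' pipeline: the function names are hypothetical, and writing 5'-C/TNA/G-3' as the IUPAC string YNR is an assumed reading of the notation.

```python
import re

# Standard IUPAC nucleotide codes needed for the motifs in the abstract.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "W": "[AT]", "Y": "[CT]", "R": "[AG]", "N": "[ACGT]"}

# TSD consensus per new group, as stated in the abstract.
# Assumption: 5'-C/TNA/G-3' is read as (C or T), any base, (A or G) = YNR.
TSD_MOTIFS = {"Pangu": "ANT", "NuwaI": "CWG", "NuwaII": "YNR"}


def motif_to_regex(motif):
    """Translate an IUPAC motif such as 'CWG' into a compiled regex."""
    return re.compile("".join(IUPAC[code] for code in motif))


def matching_groups(tsd):
    """Return the sorted names of all groups whose TSD motif fits `tsd`."""
    tsd = tsd.upper()
    return sorted(group for group, motif in TSD_MOTIFS.items()
                  if motif_to_regex(motif).fullmatch(tsd))


print(matching_groups("AGT"))
print(matching_groups("CAG"))
```

Under this reading, every CWG trinucleotide also fits YNR, so the function returns a list of candidates rather than a single label; the TSD alone would not separate NuwaI from NuwaII in such cases.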

Conclusions: In total, 380 PHIS families were identified in eukaryotes, of which 168 are reported for the first time in this study. Furthermore, three new types of PHIS superfamily were identified. Our results not only enrich the transposon diversity but also have extensive significance for improving genome sequence assembly and annotation of higher organisms.


Fig. 4: Characteristics of NuwaI transposons. a Sequence alignments for the NuwaI-4_DRer family. The terminal inverted repeats (TIRs) and flanking sequences (10 bp) are shown. b Two examples of alignments of the flanking sequences of NuwaI-4_DRer insertions with paralogous sequences found within the same genome but devoid of the transposon. The TIRs of the element are underlined. c Structure of NuwaI-4_DRer. Black triangles and solid black boxes represent the TIRs and ORFs, respectively, and the position of the DDE triad is shown. d Predicted secondary structure of the DDE motif-containing transposase of NuwaI-4_DRer. The DDE triad is marked with red triangles below the sequence.

Mentions: Twenty-three NuwaI families were identified in this study (Additional file 1: Table S5). Analysis of paralogous empty sites confirmed that the TSDs of these families are 5′-CWG-3′ (‘W’ represents an A or T nucleotide) (Fig. 4). This characteristic is significantly different from the previously identified PIF/Harbinger, ISL2EU, and Spy transposons, which have AT-rich TSDs. Most autonomous candidates of NuwaI transposons contain two ORFs: one encodes the DDE motif-containing transposase, with no other domain, and the other encodes a DNA-binding protein with a Myb/SANT domain. We identified 11 potentially active families in the eukaryotic genomes, because these TEs contain both ORFs intact (Additional file 1: Table S5 and Fig. 2b). The secondary structure of the NuwaI transposase is very similar to that of the PIF/Harbinger, ISL2EU, and Pangu transposases. For instance, the first D is located between two beta-sheets, the second D is typically between a beta-sheet and an alpha-helix, and the last E occurs within an alpha-helix (Fig. 4). The TIR lengths of NuwaI families range from 12 to 61 bp, and the first three nucleotides of the TIRs are usually the tri-nucleotide ‘GGG’ (Fig. 1). The average length of the consensus sequences of autonomous candidates is ~4462 bp. These NuwaI transposons are distributed in 16 animal genomes: 12 bony fish, 1 coleopteran, 1 crustacean, 1 mollusc, and 1 anthozoan (Fig. 2a). All of these species belong to the animal kingdom; thus, the NuwaI transposons could be relatively young elements in the eukaryotes. Finally, 3845 copies of the NuwaI group were identified in the eukaryotic genomes. The genomic abundance and copy number of each NuwaI family in each species are shown in Fig. 2c, Additional file 1: Table S5, and Additional file 3: Table S6.
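The structural hallmarks described for NuwaI insertions (a duplicated 3-bp 5′-CWG-3′ TSD and TIRs that usually begin with ‘GGG’) can be checked on a candidate insertion with a short sketch. It is a toy under stated assumptions, not the method used in the study: `check_nuwai_like` and its parameters are hypothetical, the default `tir_len=12` is simply the shortest TIR length reported above, and requiring perfectly inverted TIRs is a simplification that real elements may not satisfy.

```python
import re

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an unambiguous DNA string."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def check_nuwai_like(left_flank, element, right_flank, tir_len=12):
    """Report NuwaI-like hallmarks for one candidate insertion.

    left_flank / right_flank: genomic sequence immediately 5' and 3' of
    the element; element: the candidate transposon from TIR to TIR.
    """
    left_flank, element, right_flank = (
        s.upper() for s in (left_flank, element, right_flank))
    tsd_left, tsd_right = left_flank[-3:], right_flank[:3]
    five_tir, three_tir = element[:tir_len], element[-tir_len:]
    return {
        # The 3 bp on each side of the element should be identical copies.
        "tsd_duplicated": tsd_left == tsd_right,
        # 5'-CWG-3', where W is A or T.
        "tsd_is_CWG": re.fullmatch("C[AT]G", tsd_left) is not None,
        "tir_starts_GGG": five_tir.startswith("GGG"),
        # Simplification: demands exact inverted repeats of length tir_len.
        "tirs_perfectly_inverted": three_tir == reverse_complement(five_tir),
    }


# Made-up sequences for illustration only.
tir = "GGGTACCATTGA"
element = tir + "ACGTACGTAA" + reverse_complement(tir)
print(check_nuwai_like("AAACTG", element, "CTGTTT"))
```

In practice the TSD would be confirmed against a paralogous empty site, as in Fig. 4b, rather than inferred from the flanks of a single copy.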

