The diversification of PHIS transposon superfamily in eukaryotes.

Han MJ, Xiong CL, Zhang HB, Zhang MQ, Zhang HH, Zhang Z - Mob DNA (2015)

Bottom Line: The three known groups of the PHIS superfamily (PIF/Harbinger, ISL2EU, and Spy) have similar DDE domain-containing transposases; however, their coding capacity, species distribution, and target site duplications (TSDs) differ significantly. Furthermore, three new types of the PHIS superfamily were identified. Our results expand the known transposon diversity and are of considerable value for improving genome sequence assembly and annotation of higher organisms.


Affiliation: School of Life Sciences, Chongqing University, Chongqing, 400044 China.

ABSTRACT

Background: The PHIS superfamily of DNA transposons includes the PIF/Harbinger, ISL2EU, and Spy groups. These three groups have similar DDE domain-containing transposases; however, their coding capacity, species distribution, and target site duplications (TSDs) differ significantly.

Results: In this study, we systematically identified and analyzed PHIS transposons in 836 sequenced eukaryotic genomes using transposase homology searches and a structure-based approach. In total, 380 PHIS families were identified in 112 genomes, and 168 of these 380 families are reported here for the first time. Besides the previously identified PIF/Harbinger, ISL2EU, and Spy groups, three new types of the PHIS superfamily (called Pangu, NuwaI, and NuwaII) were identified; each has its own distinctive characteristics, especially in its TSDs. Pangu and NuwaII transposons are characterized by 5'-ANT-3' and 5'-C/TNA/G-3' TSDs, respectively, and both are widely distributed in plants, fungi, and animals. NuwaI transposons are characterized by 5'-CWG-3' TSDs and are mainly distributed in animals.
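The three TSD motifs quoted above can be written as simple patterns over a 3-bp sequence. The following is a minimal sketch for illustration only, not the authors' pipeline; the names `TSD_MOTIFS` and `compatible_groups` are hypothetical. It assumes the usual reading of the notation (N = any base, W = A or T, C/T and A/G = either base). Note that 5'-CWG-3' is a subset of 5'-C/TNA/G-3', so a TSD alone cannot always separate NuwaI from NuwaII, and the function therefore returns every compatible group.

```python
import re

# Patterns for the 3-bp TSD motifs quoted in the abstract.
# Assumption: N = any base, W = A or T, "C/T" and "A/G" = either base.
TSD_MOTIFS = {
    "Pangu": re.compile(r"A[ACGT]T"),         # 5'-ANT-3'
    "NuwaI": re.compile(r"C[AT]G"),           # 5'-CWG-3'
    "NuwaII": re.compile(r"[CT][ACGT][AG]"),  # 5'-C/TNA/G-3'
}


def compatible_groups(tsd: str) -> list[str]:
    """Return every group whose TSD motif matches the whole sequence."""
    tsd = tsd.upper()
    return [name for name, pattern in TSD_MOTIFS.items()
            if pattern.fullmatch(tsd)]


if __name__ == "__main__":
    for example in ("AGT", "CAG", "TCA", "GGG"):
        print(example, compatible_groups(example))
```

A TSD such as CAG is reported as compatible with both NuwaI and NuwaII, which is why group assignment in practice would also need the transposase sequence and element structure.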

Conclusions: In total, 380 PHIS families were identified in eukaryotes, 168 of which are reported here for the first time. Furthermore, three new types of the PHIS superfamily were identified. Our results expand the known transposon diversity and are of considerable value for improving genome sequence assembly and annotation of higher organisms.



Fig3: Characteristics of Pangu transposons. a Sequence alignments for the Pangu_CGig family. The terminal inverted repeats (TIRs) and flanking sequences (10 bp) are shown. b Two examples of alignments of the flanking sequences of Pangu_CGig insertions with paralogous sequences found within the same genome but devoid of the transposon. The TIRs of the element are underlined. c Structure of Pangu_CGig. Black triangles and solid black boxes represent the TIRs and ORFs, respectively, and the position of the DDE triad is shown. d Predicted secondary structure of the DDE motif-containing transposase of Pangu_CGig. The DDE triad is marked with red triangles below the sequence.

Mentions: Thirty-four Pangu families were identified in this study (Additional file 1: Table S3). The TIRs in these families are 11 to 40 bp long, and their first two nucleotides are usually the dinucleotide “AG” or “GG” (Fig. 1). The average consensus sequence length of autonomous candidates is ~3487 bp. Most autonomous Pangu candidates contain two ORFs. One ORF encodes the DDE motif-containing transposase and has no other domains; we did not detect any known motifs in the other ORF. Assuming that potentially active families must contain both ORFs intact, we identified two potentially active families of the Pangu group in the eukaryotic genomes (Additional file 1: Table S3 and Fig. 2b). Secondary structure prediction of Pangu DDE-containing transposases suggests that the first D is located between two beta-sheets, the second D is located between a beta-sheet and an alpha-helix, and the last E lies within an alpha-helix (Fig. 3). This result is consistent with the eukaryotic PIF/Harbinger and ISL2EU transposons [15]. Analysis of paralogous empty sites confirmed that the TSDs of these families are 5′-ANT-3′ (where ‘N’ represents A, T, C, or G) (Fig. 3). This TSD is significantly different from those of eukaryotic PIF/Harbinger, ISL2EU, and Spy transposons but consistent with that of the bacterial IS5 transposons. Thus, Pangu and IS5 transposons may belong to the same group or may derive from the same ancient element.
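The two structural checks described here, terminal inverted repeats at the element ends and a matching 5′-ANT-3′ duplication in the flanks, can be sketched as follows. This is an illustrative sketch under simplifying assumptions, not the method used in the study: it accepts only perfect TIRs (real TIRs may contain mismatches), the function names are hypothetical, and the sequences in the demo are invented toy data rather than Pangu_CGig sequence.

```python
import re

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def perfect_tir_length(element: str) -> int:
    """Length of the longest exact terminal inverted repeat."""
    element = element.upper()
    rc = reverse_complement(element)
    n = 0
    # The element and its reverse complement agree position by position
    # for as long as the 5' end mirrors the 3' end.
    while n < len(element) // 2 and element[n] == rc[n]:
        n += 1
    return n


def pangu_tsd(left_flank: str, right_flank: str):
    """Return the 3-bp TSD if both flanks carry the same 5'-ANT-3' copy."""
    left = left_flank.upper()[-3:]   # 3 bp immediately upstream
    right = right_flank.upper()[:3]  # 3 bp immediately downstream
    if left == right and re.fullmatch(r"A[ACGT]T", left):
        return left
    return None


if __name__ == "__main__":
    # Invented toy element: 11-bp TIR, short internal region, inverted TIR.
    toy = "AGGCTTACGGA" + "CATGAAAC" + "TCCGTAAGCCT"
    print(perfect_tir_length(toy))
    print(pangu_tsd("CCGTAACAGT", "AGTTTGACCA"))
```

Comparing an insertion against a paralogous empty site, as in panel b of the figure, would additionally require that the empty site carry a single copy of the same 3-bp sequence; that step is not covered by this sketch.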

