Prediction of piRNAs using transposon interaction and a support vector machine.

Wang K, Liang C, Liu J, Xiao H, Huang S, Xu J, Li F - BMC Bioinformatics (2014)

Bottom Line: Accurate prediction of piRNAs remains a significant challenge. As a result, 82,639 piRNAs were predicted in C. suppressalis. Piano demonstrates excellent piRNA prediction performance by using both structure and sequence features of transposon-piRNA interactions.


Affiliation: Department of Entomology, College of Plant Protection, Nanjing Agricultural University, Nanjing, 210095, China. wangk4@miamioh.edu.

ABSTRACT

Background: Piwi-interacting RNAs (piRNAs) are a class of small non-coding RNA primarily expressed in germ cells that can silence transposons at the post-transcriptional level. Accurate prediction of piRNAs remains a significant challenge.

Results: We developed a program for piRNA annotation (Piano) using piRNA-transposon interaction information. We downloaded 13,848 Drosophila piRNAs and 261,500 Drosophila transposons. The piRNAs were aligned to transposons with a maximum of three mismatches. Then, piRNA-transposon interactions were predicted by RNAplex. Triplet elements combining structure and sequence information were extracted from piRNA-transposon matching/pairing duplexes. A support vector machine (SVM) was used on these triplet elements to classify real and pseudo piRNAs, achieving 95.3 ± 0.33% accuracy and 96.0 ± 0.5% sensitivity. The SVM classifier can be used to correctly predict human, mouse and rat piRNAs, with overall accuracy of 90.6%. We used Piano to predict piRNAs for the rice stem borer, Chilo suppressalis, an important rice insect pest that causes huge yield loss. As a result, 82,639 piRNAs were predicted in C. suppressalis.

Conclusions: Piano demonstrates excellent piRNA prediction performance by using both structure and sequence features of transposon-piRNA interactions. Piano is freely available to the academic community at http://ento.njau.edu.cn/Piano.html.
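
The triplet elements mentioned in the Results can be illustrated with a minimal sketch. It assumes that each piRNA nucleotide has a pairing state taken from the predicted duplex ("(" or ")" for paired, "." for unpaired), and that each triplet is labelled with the nucleotide at its middle position. The abstract does not state that labelling convention, so it is an assumption here. The function name `triplet_frequencies` is hypothetical and is not part of Piano.

```python
from collections import Counter
from itertools import product

# All 32 structure-sequence triplet labels, e.g. "(((G" or "((.A":
# 8 pairing patterns over three adjacent positions x 4 nucleotides.
TRIPLETS = ["".join(s) + n for s in product("(.", repeat=3) for n in "ACGU"]


def triplet_frequencies(seq, structure):
    """Return the relative frequency of each of the 32 triplet elements.

    seq:       piRNA sequence (a DNA "T" is treated as RNA "U").
    structure: per-nucleotide pairing state, same length as seq.
    Assumption: the nucleotide in the label is the middle one of the triplet.
    """
    seq = seq.upper().replace("T", "U")
    if len(seq) != len(structure):
        raise ValueError("sequence and structure must have the same length")
    # Only paired vs. unpaired matters, so fold ")" into "(".
    state = structure.replace(")", "(")
    counts = Counter()
    for i in range(1, len(seq) - 1):
        counts[state[i - 1:i + 2] + seq[i]] += 1
    # Windows containing other symbols (e.g. "N") are ignored in the total.
    total = sum(counts[t] for t in TRIPLETS)
    return {t: (counts[t] / total if total else 0.0) for t in TRIPLETS}


freqs = triplet_frequencies("UGCAU", "(((..")
```

Each sequence then yields a fixed-length vector of 32 frequencies, which is the kind of input an SVM classifier expects.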


Figure 3: The distribution of triplet elements in two datasets (pseudo piRNA vs. real piRNA).

Mentions: We calculated the average frequencies of the 32 structure-sequence triplet elements in the real and pseudo piRNAs. The analysis indicated that "(((G" and "(((C" appear at higher frequencies in real piRNAs than in pseudo piRNAs, whereas triplets with two paired nucleotides and one unpaired nucleotide (e.g., "((.A") appear more often in pseudo piRNAs than in real piRNAs (Figure 3). We calculated the F-value to estimate the discriminative power of the different triplet elements [31,32].
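
The excerpt does not give the formula for the F-value. As a rough sketch, the code below assumes it follows the F-score commonly used for feature selection with SVMs: the squared distances of the two class means from the overall mean, divided by the sum of the two within-class sample variances. The function name `f_score` and the example frequencies are hypothetical and are not taken from the paper.

```python
from statistics import mean, variance


def f_score(pos, neg):
    """F-score of one feature from its values in the positive and negative sets.

    pos: feature values (e.g. one triplet's frequency) in real piRNAs.
    neg: the same feature's values in pseudo piRNAs.
    Each list needs at least two values for the sample variance.
    """
    overall = mean(list(pos) + list(neg))
    # Between-class separation: distance of each class mean from the overall mean.
    between = (mean(pos) - overall) ** 2 + (mean(neg) - overall) ** 2
    # Within-class spread: sample variances (n - 1 in the denominator).
    within = variance(pos) + variance(neg)
    if within == 0:
        return float("inf") if between else 0.0
    return between / within


# Hypothetical frequencies of a single triplet element in the two datasets.
score = f_score([0.2, 0.4], [0.0, 0.2])
```

Under this definition, a larger score means the triplet element separates real from pseudo piRNAs more clearly, so the 32 elements can be ranked by it.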

