Integration of mapped RNA-Seq reads into automatic training of eukaryotic gene finding algorithm.
Bottom Line:
Use of 'assembled' RNA-Seq transcripts is far from trivial; recent assessments revealed a significant assembly error rate. We demonstrated in computational experiments that the proposed method of incorporating 'unassembled' RNA-Seq reads improves the accuracy of gene prediction; in particular, for the 1.3 Gb genome of Aedes aegypti, the mean of prediction Sensitivity and Specificity at the gene level increased over GeneMark-ES by 24.5%. In the current surge of genomic data, when the need for accurate sequence annotation is higher than ever, GeneMark-ET will be a valuable addition to the narrow arsenal of automatic gene prediction tools.
View Article:
PubMed Central - PubMed
Affiliation: Joint Georgia Tech and Emory Wallace H. Coulter Department of Biomedical Engineering, Atlanta, GA, USA 30332.
Figure 4: Observed change over iterations in the mean of internal exon prediction sensitivity (Sn) and specificity (Sp) for the GeneMark-ET and GeneMark-ES algorithms on the Drosophila melanogaster (A) and Aedes aegypti (B) genomes.
Mentions: We analyzed the dependence of the mean of internal exon Sn and Sp on the iteration index for the D. melanogaster and A. aegypti genomes for both GeneMark-ES and GeneMark-ET (Figure 4). The GeneMark-ET initial parameterization, which integrates information from mapped RNA-Seq reads, improved the accuracy of predictions in the first iteration by 55–60% in comparison with GeneMark-ES. For D. melanogaster, further iterations reduced this large initial gap in accuracy to 4%. In contrast, for the large A. aegypti genome, although the gap narrowed with iterations, the accuracy of GeneMark-ET at convergence remained almost 20% higher than that of GeneMark-ES. GeneMark-ET also reached convergence 2–3 iterations earlier (Figure 4). The reduction in the number of iterations was observed for the other three genomes as well (data not shown).
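The accuracy measure tracked above, the mean of exon-level Sn and Sp, can be illustrated with a minimal Python sketch. It is not the authors' evaluation code. It assumes the usual gene-prediction convention (Sn = TP / annotated exons, Sp = TP / predicted exons) and counts a predicted exon as correct only when both boundaries and the strand match an annotated exon exactly. The function name `exon_accuracy`, the `Exon` tuple layout, and the coordinates are hypothetical.

```python
from typing import Set, Tuple

# Hypothetical exon representation: (sequence id, start, end, strand).
Exon = Tuple[str, int, int, str]


def exon_accuracy(predicted: Set[Exon], annotated: Set[Exon]) -> Tuple[float, float, float]:
    """Return (Sn, Sp, mean of Sn and Sp) for exact-match exon prediction.

    Assumption: a true positive is a predicted exon identical to an
    annotated exon (same sequence, start, end, and strand).
    """
    true_positives = len(predicted & annotated)
    # Sn: fraction of annotated exons that were predicted exactly.
    sn = true_positives / len(annotated) if annotated else 0.0
    # Sp (gene-finding usage): fraction of predicted exons that are correct.
    sp = true_positives / len(predicted) if predicted else 0.0
    return sn, sp, (sn + sp) / 2


if __name__ == "__main__":
    # Made-up coordinates, for illustration only.
    annotated = {
        ("chr1", 100, 200, "+"),
        ("chr1", 300, 400, "+"),
        ("chr1", 500, 600, "+"),
        ("chr1", 700, 800, "+"),
    }
    predicted = {
        ("chr1", 100, 200, "+"),
        ("chr1", 300, 400, "+"),
        ("chr1", 500, 600, "+"),
        ("chr1", 705, 800, "+"),  # wrong start boundary: not a match
        ("chr1", 900, 950, "+"),  # not annotated at all
    }
    sn, sp, mean = exon_accuracy(predicted, annotated)
    print(f"Sn={sn:.3f} Sp={sp:.3f} mean={mean:.3f}")
```

Calling such a function on the predictions from each training iteration would give a curve of the kind plotted in Figure 4, under the matching assumption stated above.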