Integration of mapped RNA-Seq reads into automatic training of eukaryotic gene finding algorithm.

Lomsadze A, Burns PD, Borodovsky M - Nucleic Acids Res. (2014)

Bottom Line: Use of 'assembled' RNA-Seq transcripts is far from trivial; recent assessments revealed a significant assembly error rate. We demonstrated in computational experiments that the proposed method of incorporating 'unassembled' RNA-Seq reads improves the accuracy of gene prediction; in particular, for the 1.3 Gb genome of Aedes aegypti, the mean value of prediction Sensitivity and Specificity at the gene level increased over GeneMark-ES by 24.5%. In the current surge of genomic data, when the need for accurate sequence annotation is higher than ever, GeneMark-ET will be a valuable addition to the narrow arsenal of automatic gene prediction tools.

Affiliation: Joint Georgia Tech and Emory Wallace H. Coulter Department of Biomedical Engineering, Atlanta, GA, USA 30332.

Figure 2: Diagram of the iterative semi-supervised training of GeneMark-ET.

Mentions: The input data include assembled genomic sequences and RNA-Seq reads, as shown in the diagram of the GeneMark-ET algorithm (Figure 2). In effect, the mapped RNA-Seq reads serve as external (extrinsic) evidence that turns the unsupervised training algorithm GeneMark-ES into an algorithm with semi-supervised training, GeneMark-ET.
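
The idea of iterative semi-supervised training can be illustrated with a toy sketch. It rests on assumptions that the text above does not spell out: the extrinsic evidence is reduced to a set of intron coordinates taken from spliced read alignments, each iteration keeps only the predicted introns confirmed by that set, and training stops when the confirmed set no longer changes. All names (Intron, anchored_introns, semi_supervised_training, and the toy predict/retrain functions) are hypothetical and are not part of GeneMark-ET.

```python
from typing import Callable, Set, Tuple, TypeVar

# Hypothetical representation: (sequence id, intron start, intron end).
Intron = Tuple[str, int, int]
Model = TypeVar("Model")


def anchored_introns(predicted: Set[Intron], mapped: Set[Intron]) -> Set[Intron]:
    """Keep only the predicted introns that the read alignments also support."""
    return predicted & mapped


def semi_supervised_training(
    initial_model: Model,
    predict: Callable[[Model], Set[Intron]],
    retrain: Callable[[Model, Set[Intron]], Model],
    mapped: Set[Intron],
    max_iterations: int = 10,
) -> Tuple[Model, Set[Intron]]:
    """Alternate prediction and re-training until the anchored set is stable."""
    model = initial_model
    anchored: Set[Intron] = set()
    previous = None
    for _ in range(max_iterations):
        predicted = predict(model)
        anchored = anchored_introns(predicted, mapped)
        if anchored == previous:
            break  # extrinsic evidence adds nothing new: stop iterating
        model = retrain(model, anchored)
        previous = anchored
    return model, anchored


def run_toy_example() -> Tuple[int, Set[Intron]]:
    """Toy stand-ins for a gene finder; the 'model' is just an integer."""
    candidates = [("chr1", 100, 200), ("chr1", 300, 400), ("chr1", 500, 600)]
    mapped = {("chr1", 100, 200), ("chr1", 300, 400)}

    def predict(model: int) -> Set[Intron]:
        # A better-trained toy model predicts more of the candidate introns.
        return set(candidates[: model + 1])

    def retrain(model: int, anchored: Set[Intron]) -> int:
        # The toy model improves with the number of anchored introns.
        return len(anchored)

    return semi_supervised_training(0, predict, retrain, mapped)


if __name__ == "__main__":
    print(run_toy_example())
```

In this toy run the third predicted intron is never confirmed by the mapped set, so it never enters the training set and the loop stops once the two confirmed introns are stable. A real gene finder would re-estimate model parameters rather than count introns.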

