Improving RNA-Seq Precision with MapAl.

Labaj PP, Linggi BE, Wiley HS, Kreil DP - Front Genet (2012)

Bottom Line: We here introduce MapAl, a tool for RNA-Seq expression profiling that builds on the established programs Bowtie and Cufflinks. In the post-processing of RNA-Seq reads, it incorporates gene models already at the stage of read alignment, increasing the number of reliably measured known transcripts consistently by 50%. Adding genes identified de novo then allows a reliable assessment of double the total number of transcripts compared to other available pipelines.


Affiliation: Department of Biotechnology, Boku University Vienna, Vienna, Austria.

ABSTRACT
With currently available RNA-Seq pipelines, expression estimates for most genes are very noisy. We here introduce MapAl, a tool for RNA-Seq expression profiling that builds on the established programs Bowtie and Cufflinks. In the post-processing of RNA-Seq reads, it incorporates gene models already at the stage of read alignment, increasing the number of reliably measured known transcripts consistently by 50%. Adding genes identified de novo then allows a reliable assessment of double the total number of transcripts compared to other available pipelines. This substantial improvement is of general relevance: Measurement precision determines the power of any analysis to reliably identify significant signals, such as in screens for differential expression, independent of whether the experimental design incorporates replicates or not.


Figure A3: Exemplary gene models. Schematic diagrams present three example gene models. The top row displays a hypothetical coverage assuming a uniform distribution of reads falling entirely within the exons. The models (A–C) exhibit increasing complexity. In the first model, it is still possible to assess the expression of alternative splice forms even without reads covering exon junctions. Additional junction-spanning reads will moderately affect expression level estimates. In the next model, adding read alignments that fall on splice junctions can already considerably affect presence calls and estimates of specific splice-form expression levels. In the most complex model, adding read alignments that fall on splice junctions plays a critical role in assessing the specific splice-form expression levels. In this scenario, additional evidence will boost the expression level estimate for the dominant splice form, while depressing the expression level estimates for the others.


Mentions: First, consider the simple gene models of Figure A3A. At sufficient coverage, it is possible to assess the expression of both splice forms even without reads spanning exon junctions. Adding read alignments that fall on splice junctions will therefore slightly increase the coverage at exon boundaries and thus increase the respective expression levels. This contributes to the observation in the scatter plots (Figure A2) that the expression levels for MapAl are in general higher (densities below the diagonal).
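The boundary effect described above can be illustrated with a toy calculation. The following is a minimal sketch, not MapAl's actual procedure: the two-exon transcript, the exon length, the read length, and the assumption of exactly one read per possible start position are all hypothetical values chosen for illustration. It compares per-base coverage when only reads falling entirely within an exon are counted with coverage when junction-spanning reads are counted as well.

```python
# Toy model (all numbers are illustrative assumptions, not from the paper):
# a spliced transcript made of two exons of equal length, with one read
# starting at every possible position along the transcript.
EXON_LEN = 10
READ_LEN = 4
TRANSCRIPT_LEN = 2 * EXON_LEN
JUNCTION = EXON_LEN  # first base of exon 2 in transcript coordinates


def spans_junction(start):
    """True if a read starting at `start` covers bases on both sides of the junction."""
    return start < JUNCTION < start + READ_LEN


def coverage(read_starts):
    """Per-base coverage of the transcript for the given read start positions."""
    cov = [0] * TRANSCRIPT_LEN
    for start in read_starts:
        for pos in range(start, start + READ_LEN):
            cov[pos] += 1
    return cov


# Every start position for which the read fits on the transcript.
all_starts = list(range(TRANSCRIPT_LEN - READ_LEN + 1))
# Reads that lie entirely within one exon (what a junction-unaware aligner keeps).
exon_only_starts = [s for s in all_starts if not spans_junction(s)]

cov_exon_only = coverage(exon_only_starts)
cov_all = coverage(all_starts)

print("exon-only reads:", cov_exon_only)
print("with junction reads:", cov_all)
print("mean coverage, exon-only:", sum(cov_exon_only) / TRANSCRIPT_LEN)
print("mean coverage, with junction reads:", sum(cov_all) / TRANSCRIPT_LEN)
```

In this toy setting the junction-spanning reads only add coverage to the bases next to the exon boundary, so the mean coverage, and with it any coverage-based expression estimate, rises slightly. This matches the direction of the effect described for the simple gene models, though not necessarily its size in real data.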

