Preferred analysis methods for single genomic regions in RNA sequencing revealed by processing the shape of coverage.

Okoniewski MJ, Leśniewska A, Szabelska A, Zyprych-Walczak J, Ryan M, Wachtel M, Morzy T, Schäfer B, Schlapbach R - Nucleic Acids Res. (2011)

Bottom Line: Then, 160 pipelines (5 types of generator × 4 normalizations × 8 difference measures) are compared. As a result, the best analysis pipelines are selected based on linearity of the differential expression estimation and the area under the ROC curve. They point out the exons with differential expression or internal splicing, even if the counts of reads may not show this.


Affiliation: Functional Genomics Center Zurich, UNI ETH Zurich, Winterthurerstrasse 190, CH-8057 Zurich, Switzerland. michal@fgcz.ethz.ch

ABSTRACT
The informational content of RNA sequencing is currently far from being completely explored. Most analyses focus on processing tables of counts or on deconvolving isoforms via exon junctions. This article presents a comparison of several techniques that can be used to estimate differential expression of exons or small genomic regions of expression, based on the shapes of their coverage functions. The problem is defined as finding the differentially expressed exons between two samples, using local expression profile normalization and statistical measures to spot the differences between two profile shapes. Initial experiments were done using synthetic data, and real data modified with synthetically created differential patterns. Then, 160 pipelines (5 types of generator × 4 normalizations × 8 difference measures) are compared. As a result, the best analysis pipelines are selected based on linearity of the differential expression estimation and the area under the ROC curve. These platform-independent techniques have been implemented in the Bioconductor package rnaSeqMap. They point out the exons with differential expression or internal splicing, even when the read counts do not show this. The areas of application include significant difference searches, splicing identification algorithms and finding suitable regions for qPCR primers.
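The idea that two profiles can share a read count yet differ in shape can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the unit-area normalization and the Kolmogorov-Smirnov-style distance are generic choices, not necessarily among the 4 normalizations or 8 difference measures compared in the article, and the function names are hypothetical rather than part of rnaSeqMap.

```python
from itertools import accumulate


def normalize_to_unit_area(coverage):
    """Scale a per-base coverage vector so that its values sum to 1.

    Illustrative normalization only; not taken from the article.
    """
    total = sum(coverage)
    if total == 0:
        raise ValueError("coverage is all zeros; its shape is undefined")
    return [c / total for c in coverage]


def max_cumulative_gap(profile_a, profile_b):
    """Largest absolute gap between the cumulative normalized profiles.

    A Kolmogorov-Smirnov-style distance, used here as a stand-in
    for a shape difference measure.
    """
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must cover the same region")
    cum_a = accumulate(normalize_to_unit_area(profile_a))
    cum_b = accumulate(normalize_to_unit_area(profile_b))
    return max(abs(a - b) for a, b in zip(cum_a, cum_b))


# Toy per-base coverage for one exon in two samples.
flat = [5, 5, 5, 5, 5, 5, 5, 5]          # total coverage 40
skewed = [10, 10, 10, 10, 0, 0, 0, 0]    # total coverage 40, different shape
doubled = [10, 10, 10, 10, 10, 10, 10, 10]  # total coverage 80, same shape as flat

print(max_cumulative_gap(flat, skewed))
print(max_cumulative_gap(flat, doubled))
```

In this toy case `flat` and `skewed` have identical totals, so a table of counts would not separate them, while the shape distance does. Conversely, `flat` and `doubled` differ in total coverage but have the same shape, so a purely shape-based measure of this kind reports no difference between them.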


Figure 1: RNA-seq coverage profiles for a single exon, transformed by data generators with the degeneration coefficient d = 0.4. The red profile is the original one, while blue (partially overlapping with the red) is the modified profile. (a) Original coverage function; (b) synthetic data of the same domain length; (c) peak generator, s = 0.5, rl = 50; (d) additive generator, s = 0.5; (e) truncation generator; (f) multiplicative generator, s = 0.5.

Mentions: Generators are, in this context, functions that convert a coverage on a given region into another coverage function by imposing a specific type of degeneration, measured by the level of degeneration d (see Figure 1).
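As a rough illustration of what such generators might look like, the sketch below implements three hypothetical ones in Python. The formulas are assumptions chosen only so that d = 0 leaves the profile unchanged and larger d degrades it more; they are not the definitions used in the article or in rnaSeqMap, and the parameters s and rl from the figure caption are deliberately left out because their meaning is not given here.

```python
def _check_level(d):
    """Validate the degeneration level, assumed here to lie in [0, 1]."""
    if not 0.0 <= d <= 1.0:
        raise ValueError("degeneration level d must be between 0 and 1")


def multiplicative_generator(coverage, d):
    """Scale every position by (1 - d). Assumed form, for illustration."""
    _check_level(d)
    return [c * (1.0 - d) for c in coverage]


def additive_generator(coverage, d):
    """Subtract d * max(coverage) everywhere, clipping at zero. Assumed form."""
    _check_level(d)
    shift = d * max(coverage, default=0)
    return [max(c - shift, 0.0) for c in coverage]


def truncation_generator(coverage, d):
    """Zero out the trailing fraction d of the region. Assumed form."""
    _check_level(d)
    keep = len(coverage) - round(d * len(coverage))
    return list(coverage[:keep]) + [0] * (len(coverage) - keep)


# Toy coverage for one exon, degenerated with d = 0.4 as in the figure.
original = [2, 4, 6, 8, 10, 8, 6, 4, 2, 0]
for generator in (multiplicative_generator, additive_generator, truncation_generator):
    print(generator.__name__, generator(original, 0.4))
```

Each function takes a coverage vector and returns a new one of the same length, so an original and a degenerated profile can be passed directly to any pairwise difference measure.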

