Inference of alternative splicing from RNA-Seq data with probabilistic splice graphs.

LeGault LH, Dewey CN - Bioinformatics (2013)

Bottom Line: Alternative splicing and other processes that allow for different transcripts to be derived from the same gene are significant forces in the eukaryotic cell. RNA-Seq is a promising technology for analyzing alternative transcripts, as it does not require prior knowledge of transcript structures or genome sequences. We present RNA-Seq models and associated inference algorithms based on the concept of probabilistic splice graphs, which alleviate the efficiency, identifiability and representation issues that arise for genes with large numbers of alternative transcripts.


Affiliation: Department of Computer Sciences, University of Wisconsin, Madison, WI 53706, USA.

ABSTRACT

Motivation: Alternative splicing and other processes that allow for different transcripts to be derived from the same gene are significant forces in the eukaryotic cell. RNA-Seq is a promising technology for analyzing alternative transcripts, as it does not require prior knowledge of transcript structures or genome sequences. However, analysis of RNA-Seq data in the presence of genes with large numbers of alternative transcripts is currently challenging due to efficiency, identifiability and representation issues.

Results: We present RNA-Seq models and associated inference algorithms based on the concept of probabilistic splice graphs, which alleviate these issues. We prove that our models are often identifiable and demonstrate that our inference methods for quantification and differential processing detection are efficient and accurate.

Availability: Software implementing our methods is available at http://deweylab.biostat.wisc.edu/psginfer.

Contact: cdewey@biostat.wisc.edu

Supplementary information: Supplementary data are available at Bioinformatics online.

Figure 2. Example probabilistic splice graph (PSG) representations for the mouse gene Gfra4. (A) A UCSC Genome Browser visualization of the seven annotated isoforms of this gene. (B) The line graph PSG. (C) The first-order exon graph. (D) A higher-order exon graph. In this graph, the alternative processing (AP) events immediately following the longest exon are allowed to depend on the AP event directly preceding the exon. In the first-order exon graph, by contrast, these AP events are independent of each other, given that the longest exon is included in the transcript. (E) An unfactorized PSG.
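
The contrast between panels (C) and (D) can be read as a conditional-independence statement. The following is only a sketch in notation that does not appear in the source: v stands for the longest exon, a for the AP event directly preceding it, and b for an AP event immediately following it.

```latex
\begin{align*}
\text{first-order exon graph (C):}\quad
  & P(b \mid v, a) = P(b \mid v) \quad \text{for every } a \\
\text{higher-order exon graph (D):}\quad
  & P(b \mid v, a) \ \text{may take a different value for each } a
\end{align*}
```

Under this reading, the higher-order graph needs additional parameters for the events after v, one set per preceding event, in exchange for being able to capture dependence between the two sides of the exon.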

Mentions: Many PSG structures can be used to model the set of isoforms for a gene. For example, Figure 2 gives four PSGs that all represent the mouse gene Gfra4, which has seven possible isoforms according to the UCSC Genes annotation (Hsu et al., 2006). These different PSGs are closely related to the various forms of splice graphs that have been used in splice graph databases (Bollina et al., 2006). Like the splice graphs in these databases, PSGs can vary in the number of isoforms they allow. In addition, PSGs can vary in the family of probability distributions they define over the set of isoforms.
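
To make the last point concrete, here is a minimal Python sketch of two PSGs for the same isoform set. Everything in it is an assumption made for illustration: the five-exon gene, the exon names `e1` to `e5`, the edge weights, and the helper functions `isoform_probabilities` and `unfactorized_psg` are hypothetical and are not taken from the paper, from Gfra4, or from the psginfer software. The sketch treats a PSG as a directed acyclic graph whose edges carry transition probabilities, and takes the probability of an isoform to be the product of the weights along its path.

```python
def isoform_probabilities(edges, start="start", end="end"):
    """Enumerate every start-to-end path and multiply its edge weights.

    `edges` maps a vertex to {successor: transition probability}.
    A vertex is either an exon name or a (copy_id, exon) tuple.
    """
    def exon_name(vertex):
        return vertex[1] if isinstance(vertex, tuple) else vertex

    result = {}
    stack = [((start,), 1.0)]
    while stack:
        path, prob = stack.pop()
        if path[-1] == end:
            exons = tuple(exon_name(v) for v in path[1:-1])
            result[exons] = result.get(exons, 0.0) + prob
            continue
        for successor, weight in edges.get(path[-1], {}).items():
            stack.append((path + (successor,), prob * weight))
    return result


def unfactorized_psg(isoform_probs):
    """Build a PSG with one private path per isoform.

    The only branching happens at the start vertex, so each isoform
    probability is a separate parameter.
    """
    edges = {"start": {}}
    for copy_id, (exons, prob) in enumerate(isoform_probs.items()):
        previous = "start"
        for exon in exons:
            vertex = (copy_id, exon)
            weight = prob if previous == "start" else 1.0
            edges.setdefault(previous, {})[vertex] = weight
            previous = vertex
        edges.setdefault(previous, {})["end"] = 1.0
    return edges


# Hypothetical first-order exon graph: e2 and e4 are cassette exons
# whose inclusion is decided independently.
FIRST_ORDER = {
    "start": {"e1": 1.0},
    "e1": {"e2": 0.6, "e3": 0.4},
    "e2": {"e3": 1.0},
    "e3": {"e4": 0.5, "e5": 0.5},
    "e4": {"e5": 1.0},
    "e5": {"end": 1.0},
}

# A distribution in which e2 and e4 are always included or skipped together.
CORRELATED = {
    ("e1", "e2", "e3", "e4", "e5"): 0.5,
    ("e1", "e3", "e5"): 0.5,
}

first_order = isoform_probabilities(FIRST_ORDER)
unfactorized = isoform_probabilities(unfactorized_psg(CORRELATED))

if __name__ == "__main__":
    for name, dist in (("first-order", first_order), ("unfactorized", unfactorized)):
        print(name)
        for exons, prob in sorted(dist.items()):
            print("  ", "-".join(exons), round(prob, 3))
```

In this toy setting the first-order graph allows four isoforms using two free parameters, whereas an unfactorized PSG over the same four isoforms would need three. The `CORRELATED` distribution cannot be produced by any choice of weights on the first-order graph: giving zero probability to the isoform that includes only `e2` would force either the `e1` to `e2` weight to 0 or the `e3` to `e4` weight to 1, and either choice also removes one of the two isoforms that should have probability 0.5.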

