A hierarchical Bayesian model for comparing transcriptomes at the individual transcript isoform level.

Zheng S, Chen L - Nucleic Acids Res. (2009)

Bottom Line: Model parameters were inferred based on an ergodic Markov chain generated by our Gibbs sampler. We applied BASIS to a human tiling-array data set and a mouse RNA-seq data set. Some of the predictions were validated by quantitative real-time RT-PCR experiments.


Affiliation: Howard Hughes Medical Institute, University of California, Los Angeles, Los Angeles, CA 90095, USA.

ABSTRACT
The complexity of mammalian transcriptomes is compounded by alternative splicing, which allows one gene to produce multiple transcript isoforms. However, transcriptome comparison has been limited to differential analysis at the gene level instead of the individual transcript isoform level. High-throughput sequencing technologies and high-resolution tiling arrays provide an unprecedented opportunity to compare transcriptomes at the level of individual splice variants. However, sequence read coverage or probe intensity at each position may represent a family of splice variants instead of one single isoform. Here we propose a hierarchical Bayesian model, BASIS (Bayesian Analysis of Splicing IsoformS), to infer the differential expression level of each transcript isoform between two conditions. A latent variable was introduced to perform direct statistical selection of differentially expressed isoforms. Model parameters were inferred based on an ergodic Markov chain generated by our Gibbs sampler. BASIS can borrow information across different probes (or positions), both within the same gene and across different genes. BASIS can also handle the heteroskedasticity of probe intensity or sequence read coverage. We applied BASIS to a human tiling-array data set and a mouse RNA-seq data set. Some of the predictions were validated by quantitative real-time RT-PCR experiments.
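
The point that the signal at one position may represent a family of splice variants can be illustrated with a toy calculation. The sketch below is not BASIS: the gene structure, the numbers, and the function names `position_signal` and `least_squares_2` are invented for illustration, and ordinary least squares stands in for the hierarchical Bayesian inference. It omits noise, heteroskedasticity, and the latent selection variable.

```python
# Toy illustration only: the gene structure and all numbers are invented.
# Rows are positions (probes or read-coverage bins); columns are isoforms.
# A 1 means the isoform includes that position.
membership = [
    [1, 1],  # shared exon: covered by both isoforms
    [1, 0],  # exon specific to isoform 0
    [0, 1],  # exon specific to isoform 1
]


def position_signal(membership, isoform_diff):
    """Sum the per-isoform differences over the isoforms covering each position."""
    return [
        sum(m * d for m, d in zip(row, isoform_diff))
        for row in membership
    ]


def least_squares_2(membership, observed):
    """Ordinary least squares for exactly two isoforms via the normal equations."""
    a = sum(r[0] * r[0] for r in membership)
    b = sum(r[0] * r[1] for r in membership)
    d = sum(r[1] * r[1] for r in membership)
    u = sum(r[0] * y for r, y in zip(membership, observed))
    v = sum(r[1] * y for r, y in zip(membership, observed))
    det = a * d - b * b
    if det == 0:
        raise ValueError("isoforms are not distinguishable from these positions")
    return [(d * u - b * v) / det, (a * v - b * u) / det]


# Hypothetical between-condition differences for the two isoforms.
true_diff = [2.0, -1.0]
observed = position_signal(membership, true_diff)
estimate = least_squares_2(membership, observed)
print("position-level signal:", observed)
print("recovered isoform-level differences:", estimate)
```

In this toy, the shared exon shows a net difference that matches neither isoform, which is why position-level signal alone can mislead. The isoform-specific positions are what make the two isoforms separable here.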


© Copyright Policy - creative-commons

Figure 6: Experimental validation of BASIS predictions. Real-time RT–PCR bar plots of the tested transcripts’ relative expression levels between mouse brain and liver (A), between mouse brain and muscle (B), and between HeLa and HepG2 cells (C). A relative expression ratio (condition 1/condition 2) of 1 means no differential expression between the two conditions; a ratio >1 means higher expression in condition 1; a ratio <1 means higher expression in condition 2. Black bars are transcripts predicted by BASIS to have higher expression levels in condition 1, and white bars are transcripts predicted to have higher expression levels in condition 2. Gray bars are transcripts predicted not to be differentially expressed between the two conditions. Values represent mean ± SEM, N = 3.
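
The caption's summary statistic and its reading of the ratio can be sketched as follows. This is a minimal illustration, not the authors' analysis: the replicate values are invented, and the helper names `mean_and_sem` and `direction` are hypothetical.

```python
import math
import statistics


def mean_and_sem(values):
    """Return the mean and the standard error of the mean (sample SD / sqrt(N))."""
    mean = statistics.mean(values)
    sem = statistics.stdev(values) / math.sqrt(len(values))
    return mean, sem


def direction(ratio):
    """Read a condition 1 / condition 2 ratio the way the caption describes."""
    if ratio > 1:
        return "higher in condition 1"
    if ratio < 1:
        return "higher in condition 2"
    return "no differential expression"


# Invented relative expression ratios from N = 3 replicates.
replicates = [2.0, 2.5, 3.0]
mean, sem = mean_and_sem(replicates)
print(f"{mean:.2f} ± {sem:.2f} ({direction(mean)})")
```

A measured ratio is rarely exactly 1, so in practice a call of "no differential expression" would rest on a statistical test or a tolerance around 1 rather than on strict equality.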

Mentions: To further examine the predictive power of BASIS, we performed real-time RT–PCR experiments to assay the relative expression levels of transcript isoforms between adult mouse brain and liver, between adult mouse brain and muscle, and between HeLa and HepG2 cells. We were particularly interested in genes whose isoforms show distinct differential expression patterns between the two conditions: for example, one transcript isoform is up-regulated in brain relative to liver, whereas another transcript isoform of the same gene is down-regulated or not differentially expressed. For each tested transcript isoform, we designed one of the two PCR primers from the isoform-specific exonic region or exon junction that exclusively represents the isoform. For the RNA-seq data, we tested the relative expression levels of 14 transcript isoforms between mouse brain and liver (Figure 6A). The isoforms were chosen at random, subject to two constraints: each was required to have an isoform-specific exonic region or exon junction, and the selection was biased toward genes whose isoforms show distinct expression patterns. Transcripts TRAN00000157032 (Slc25a25), ENSMUST00000115599 (Pcdh1), TRAN00000139600 (Mrps12), TRAN00000123912 (M6prbp1) and TRAN00000143381 (Clu) were predicted by BASIS to be up-regulated in brain relative to liver (black bars in Figure 6A). Transcripts TRAN00000157033 (Slc25a25), ENSMUST00000057185 (Pcdh1), ENSMUST00000019726 (M6prbp1), TRAN00000161590 (Esd), TRAN00000143382 (Clu) and ENSMUST00000000335 (Comt) were predicted to be down-regulated in brain relative to liver (white bars). Transcripts TRAN00000139599 (Mrps12), TRAN00000161592 (Esd) and ENSMUST00000115609 (Comt) were predicted not to be differentially expressed between the two tissues (gray bars). As shown in Figure 6A, all of the transcripts except TRAN00000143381 (Clu) and ENSMUST00000115609 (Comt) show the predicted differential expression patterns.
We also tested these transcripts’ relative expression ratios between mouse brain and muscle (Figure 6B). All transcripts except TRAN00000157033 (Slc25a25), TRAN00000161592 (Esd), ENSMUST00000115609 (Comt) and ENSMUST00000000335 (Comt) show the predicted differential expression patterns. More importantly, for most genes (all except Clu in Figure 6A and B, and Pcdh1 and Esd in Figure 6B), the two transcript isoforms show significantly different relative expression ratios (Student's t-test, P ≤ 0.05). This shows that transcript isoforms of the same gene can have distinct expression patterns, which standard gene-level differential expression analysis cannot detect.
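
The comparison of two isoforms' relative expression ratios by Student's t-test can be sketched as below. This is a minimal illustration under stated assumptions: the replicate values are invented, the test is applied to raw ratios (the text does not say whether a transformation was used), and the function name `pooled_t_statistic` is hypothetical. With N = 3 per isoform the test has 4 degrees of freedom, for which the two-sided 0.05 critical value is 2.776, so the sketch compares against that constant instead of computing a P-value.

```python
import math
import statistics

# Two-sided 0.05 critical value of Student's t with 4 degrees of freedom
# (N = 3 replicates per isoform, so 3 + 3 - 2 = 4).
T_CRIT_DF4 = 2.776


def pooled_t_statistic(a, b):
    """Two-sample Student's t statistic with a pooled (equal-variance) estimate."""
    na, nb = len(a), len(b)
    va, vb = statistics.variance(a), statistics.variance(b)
    pooled = ((na - 1) * va + (nb - 1) * vb) / (na + nb - 2)
    return (statistics.mean(a) - statistics.mean(b)) / math.sqrt(
        pooled * (1 / na + 1 / nb)
    )


# Invented relative expression ratios for two isoforms of one gene.
isoform_a = [2.0, 2.5, 3.0]
isoform_b = [0.4, 0.5, 0.6]

t_stat = pooled_t_statistic(isoform_a, isoform_b)
significant = abs(t_stat) >= T_CRIT_DF4
print(f"t = {t_stat:.2f}, significant at 0.05: {significant}")
```

For other replicate counts the critical value changes with the degrees of freedom, so a statistics library that returns a P-value directly would be the more general choice.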

