RNAseq by Total RNA Library Identifies Additional RNAs Compared to Poly(A) RNA Library.

Guo Y, Zhao S, Sheng Q, Guo M, Lehmann B, Pietenpol J, Samuels DC, Shyr Y - Biomed Res Int (2015)

Bottom Line: We found that the RNA expression values captured by both RNA libraries were highly correlated. However, the number of RNAs captured was significantly higher for the total RNA library. Among the protein coding RNAs not captured efficiently by the poly(A) library, one of the most noticeable subsets is the histone-encoding genes, which lack a poly(A) tail.


Affiliation: Center for Quantitative Sciences, Vanderbilt University, Nashville, TN 37232, USA.

ABSTRACT
The most popular RNA library used for RNA sequencing is the poly(A) captured RNA library. This library captures RNA based on the presence of poly(A) tails at the 3′ end. Another type of RNA library for RNA sequencing is the total RNA library, which differs from the poly(A) library in capture method and price. The total RNA library costs more, and its capture of RNA does not depend on the presence of poly(A) tails. In practice, only ribosomal RNAs and small RNAs are washed out in the total RNA library preparation. To evaluate the ability of both RNA libraries to detect RNA, we designed a study using RNA sequencing data from the same two breast cancer cell lines prepared with both libraries. We found that the RNA expression values captured by both RNA libraries were highly correlated. However, the number of RNAs captured was significantly higher for the total RNA library. Furthermore, we identified several subsets of protein coding RNAs that were not captured efficiently by the poly(A) library. One of the most noticeable subsets is the histone-encoding genes, which lack a poly(A) tail.


Figure 5: (a) Enrichment plot of histone-encoding genes from GSEA. Based on the gene list ranked by fold change (total RNA versus poly(A)), histone-encoding genes were highly enriched (adjusted P < 0.0001). (b) Normalized read count distribution of the 38 histone-encoding genes in the poly(A) and total RNA libraries.

Mentions: It has been shown that not all mRNAs contain a poly(A) tail at their 3′ ends [35]. For example, the mRNAs that encode histone proteins are nonpolyadenylated [36]. Another study has shown that a significant portion of mRNA transcripts have no poly(A) tail [37]. This could explain why we detected more protein coding RNAs with the total RNA method than with the poly(A) method. To test this hypothesis, we searched the ENSEMBL database and found 38 histone-encoding genes. We conducted enrichment analysis in GSEA using the DESeq2 results against the histone-encoding gene set and found that these genes were highly enriched (FDR < 0.0001) (Figure 5(a)). The expression values of the histone-encoding genes were clearly higher for the total RNA library samples (Figure 5(b)). The GSEA showed that the total RNA library samples captured histone-encoding genes at a much higher efficiency than the poly(A) library samples. Based on the fold change results from DESeq2, 737 protein coding RNAs had a log2 fold change greater than 2 (overexpressed in total RNA samples), which suggests that additional subsets of protein coding RNAs may be better captured using total RNA methods. To categorize these potential subsets of protein coding RNAs, we conducted GO analysis using WebGestalt (Figure S2) (Table 1). The top 10 subcategories were identified within each of the three main GO categories: biological process, molecular function, and cellular component. Eleven of the 30 subcategories consisted primarily of histone-encoding genes. The other 19 subcategories included protein-DNA complex, chromatin, and so forth. No obvious pattern was recognizable. There were also 592 protein coding genes that were captured better by the poly(A) library samples (log2 fold change < −2). We also performed GO analysis on these genes (Figure S3) (Table 2). No clear gene pattern was detected.
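The two computational steps described above, splitting genes by a log2 fold change cutoff of ±2 and checking whether a gene set clusters at the top of a fold-change-ranked list, can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the study used DESeq2 and the GSEA software, while the sketch below assumes the log2 fold changes (total RNA versus poly(A)) have already been computed and uses a simplified, unweighted running-sum score in place of GSEA's weighted statistic and permutation-based FDR. The function names and the toy gene identifiers and values are hypothetical.

```python
from typing import Dict, List, Set, Tuple


def split_by_log2fc(
    log2fc: Dict[str, float], cutoff: float = 2.0
) -> Tuple[List[str], List[str]]:
    """Split genes by log2 fold change (total RNA versus poly(A)).

    Returns (higher_in_total, higher_in_polya): genes with log2FC > cutoff
    and genes with log2FC < -cutoff, each sorted by name.
    """
    higher_in_total = sorted(g for g, v in log2fc.items() if v > cutoff)
    higher_in_polya = sorted(g for g, v in log2fc.items() if v < -cutoff)
    return higher_in_total, higher_in_polya


def enrichment_score(log2fc: Dict[str, float], gene_set: Set[str]) -> float:
    """Unweighted running-sum enrichment score over a ranked gene list.

    Genes are ranked by decreasing log2FC. Walking down the list, the
    running sum rises by 1/hits at each gene in the set and falls by
    1/misses otherwise; the score is the largest deviation from zero.
    This is a simplified stand-in for the weighted GSEA statistic.
    """
    ranked = sorted(log2fc, key=lambda g: log2fc[g], reverse=True)
    hits = sum(1 for g in ranked if g in gene_set)
    misses = len(ranked) - hits
    if hits == 0 or misses == 0:
        raise ValueError("gene set must overlap the ranked list only partially")

    running = 0.0
    best = 0.0
    for gene in ranked:
        running += 1.0 / hits if gene in gene_set else -1.0 / misses
        if abs(running) > abs(best):
            best = running
    return best


if __name__ == "__main__":
    # Hypothetical toy values, not data from the study.
    toy = {
        "histone_1": 4.1, "histone_2": 3.2, "gene_a": 0.3,
        "gene_b": -0.5, "gene_c": -2.7, "gene_d": 1.0,
    }
    print(split_by_log2fc(toy))
    print(enrichment_score(toy, {"histone_1", "histone_2"}))
```

A score near +1 means the gene set sits at the top of the ranking (higher in total RNA), and a score near −1 means it sits at the bottom. Assessing significance would still require a permutation test or the GSEA software itself.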

