RNA-seq analysis for detecting quantitative trait-associated genes.

Seo M, Kim K, Yoon J, Jeong JY, Lee HJ, Cho S, Kim H - Sci Rep (2016)

Bottom Line: Many recent RNA-seq studies have focused mainly on detecting differentially expressed genes (DEGs) between two or more conditions. In contrast, only a few attempts have been made to detect genes associated with quantitative traits, such as obesity index and milk yield, in RNA-seq experiments with a large number of biological replicates. This study illustrates the application of linear models to the detection of trait-associated genes (TAGs) in two real RNA-seq datasets: human obesity-related data with 89 replicates and Holstein milk production-related data with 21 replicates.


Affiliation: Interdisciplinary Program in Bioinformatics, Seoul National University, Kwan-ak St. 599, Kwan-ak Gu, Seoul, South Korea 151-741, Republic of Korea.

ABSTRACT
Many recent RNA-seq studies have focused mainly on detecting differentially expressed genes (DEGs) between two or more conditions. In contrast, only a few attempts have been made to detect genes associated with quantitative traits, such as obesity index and milk yield, in RNA-seq experiments with a large number of biological replicates. This study illustrates the application of linear models to the detection of trait-associated genes (TAGs) in two real RNA-seq datasets: human obesity-related data with 89 replicates and Holstein milk production-related data with 21 replicates. Based on these two datasets, we compared the performance of the proposed methods (ordinary regression and robust regression) with that of existing methods (DESeq2 and Voom). The results indicate that the proposed methods yield far fewer false discoveries than the earlier approaches based on two-group comparisons, both in our simulation study and in a qRT-PCR experiment. In particular, robust regression outperforms the existing DEG-finding methods as well as ordinary regression in terms of precision. Given the current trend in RNA-seq pricing, we expect our methods to be successfully applied in various RNA-seq studies with numerous biological replicates that handle continuous response traits.
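As a rough illustration of the difference between ordinary and robust regression for one gene, the sketch below fits expression against a continuous trait in two ways: ordinary least squares, and a Huber-type M-estimator computed by iteratively reweighted least squares. This is not the authors' implementation. The function names, the Huber tuning constant 1.345, the median-absolute-residual scale estimate, and the toy data are all assumptions made for illustration, and no p-values are computed.

```python
from statistics import median


def weighted_line_fit(x, y, w):
    """Weighted least-squares fit of y = intercept + slope * x."""
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def huber_line_fit(x, y, k=1.345, n_iter=50, tol=1e-8):
    """Huber-type robust line fit by iteratively reweighted least squares.

    Assumed choices: residual scale = median(|residual|) / 0.6745,
    Huber weights with tuning constant k, starting from the OLS fit.
    """
    intercept, slope = weighted_line_fit(x, y, [1.0] * len(x))
    for _ in range(n_iter):
        resid = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
        scale = median(abs(r) for r in resid) / 0.6745
        if scale == 0.0:
            break  # at least half of the points are fitted exactly
        cutoff = k * scale
        # Points with large residuals get weights below 1.
        w = [1.0 if abs(r) <= cutoff else cutoff / abs(r) for r in resid]
        new_intercept, new_slope = weighted_line_fit(x, y, w)
        converged = (abs(new_slope - slope) < tol
                     and abs(new_intercept - intercept) < tol)
        intercept, slope = new_intercept, new_slope
        if converged:
            break
    return intercept, slope


# Toy data (hypothetical): expression rises linearly with the trait,
# except for one sample with an aberrantly high expression value.
trait = [float(i) for i in range(10)]
expression = [1.0 + 2.0 * t for t in trait]
expression[9] = 100.0

_, ols_slope = weighted_line_fit(trait, expression, [1.0] * len(trait))
_, robust_slope = huber_line_fit(trait, expression)
print("OLS slope:", ols_slope)
print("Huber slope:", robust_slope)
```

In this toy case the single aberrant sample pulls the ordinary least-squares slope well away from the trend of the other nine samples, whereas the reweighted fit stays close to it. This is the kind of outlier sensitivity that motivates a robust estimator.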


Figure 4: Calculation of the proportion of false discoveries using mock comparison. Because the least-squares estimator (LSE) requires at least 3 samples, the mock comparisons used 3 to 37 replicates per group in the human RNA-seq data and 3 to 9 replicates per group in the bovine RNA-seq data. The x-axis shows the number of biological replicates in each group; the y-axis shows the proportion of false discoveries (a,c) or the number of significantly detected genes (b,d) for each method at the 5% significance level. The proposed approaches show a smaller proportion of false discoveries than the existing methods, especially DESeq2.

Mentions: To measure the proportion of false discoveries in the proposed methods (ordinary and robust regression) and the existing methods (DESeq2 and limma voom), we performed a mock comparison. The simulation study using human data suggests that ordinary regression and Voom have a smaller proportion of false discoveries than robust regression and DESeq2 (Fig. 4a). The estimated proportions of false discoveries were similar across three methods (robust regression, ordinary regression, and Voom) as the number of biological replicates in each group increased. Fig. 4b shows that, for these three methods, the number of significantly detected genes increased with the number of biological replicates. Above all, DESeq2 and robust regression detect a larger number of significant genes than ordinary regression and Voom under the same conditions. Taken together (Fig. 4a,b), DESeq2 detects more significant genes than the other methods, but with a larger proportion of false discoveries. Voom and ordinary regression, on the other hand, detect comparatively fewer significant genes, but with smaller proportions of false discoveries. Notably, robust regression shows not only a number of significantly detected genes similar to that of DESeq2, but also a proportion of false discoveries as low as that of ordinary regression and Voom.
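The idea behind a mock comparison can be sketched as follows, assuming the usual meaning of the term: samples drawn from a single condition are randomly split into two mock groups, so any gene called significant is a false discovery. The code is a simplified stand-in, not the procedure used in the paper. The per-gene test is a normal-approximation test on the difference in means rather than regression, DESeq2, or Voom. The function names and the simulated data are hypothetical, and no multiple-testing correction is applied.

```python
import random
from statistics import NormalDist, mean, variance


def mean_difference_pvalue(a, b):
    """Two-sided p-value for a difference in means (normal approximation).

    Assumption: a z-type test stands in for the methods compared in the
    text; it is anti-conservative when groups are very small.
    """
    se = (variance(a) / len(a) + variance(b) / len(b)) ** 0.5
    if se == 0.0:
        return 1.0
    z = (mean(a) - mean(b)) / se
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


def mock_false_discovery_proportion(expr, n_per_group, alpha=0.05, seed=0):
    """Proportion of genes called significant between two mock groups.

    expr: one list of expression values per gene, with every sample
    taken from the SAME condition, so each call is a false discovery.
    """
    n_samples = len(expr[0])
    if n_per_group < 2 or 2 * n_per_group > n_samples:
        raise ValueError("need 2 <= n_per_group <= n_samples / 2")
    rng = random.Random(seed)
    chosen = rng.sample(range(n_samples), 2 * n_per_group)
    group1, group2 = chosen[:n_per_group], chosen[n_per_group:]
    hits = sum(
        mean_difference_pvalue([row[i] for i in group1],
                               [row[i] for i in group2]) < alpha
        for row in expr
    )
    return hits / len(expr)


# Simulated null data (hypothetical): 200 genes, 20 samples, no group effect.
data_rng = random.Random(42)
null_expr = [[data_rng.gauss(8.0, 1.0) for _ in range(20)] for _ in range(200)]
for n in (3, 5, 10):
    print(n, mock_false_discovery_proportion(null_expr, n_per_group=n))
```

Repeating the random split many times for each group size and averaging the proportions would give a curve comparable in spirit to Fig. 4a,c. A single split, as shown here, is noisy.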

