RNA-seq analysis for detecting quantitative trait-associated genes.

Seo M, Kim K, Yoon J, Jeong JY, Lee HJ, Cho S, Kim H - Sci Rep (2016)

Bottom Line: Many recent RNA-seq studies have focused mainly on detecting differentially expressed genes (DEGs) between two or more conditions. In contrast, only a few attempts have been made to detect genes associated with quantitative traits, such as obesity index and milk yield, in RNA-seq experiments with a large number of biological replicates. This study illustrates the application of linear models to the detection of trait-associated genes (TAGs) in two real RNA-seq datasets: human obesity-related data with 89 replicates and Holstein milk production-related data with 21 replicates.


Affiliation: Interdisciplinary Program in Bioinformatics, Seoul National University, Kwan-ak St. 599, Kwan-ak Gu, Seoul, South Korea 151-741, Republic of Korea.

ABSTRACT
Many recent RNA-seq studies have focused mainly on detecting differentially expressed genes (DEGs) between two or more conditions. In contrast, only a few attempts have been made to detect genes associated with quantitative traits, such as obesity index and milk yield, in RNA-seq experiments with a large number of biological replicates. This study illustrates the application of linear models to the detection of trait-associated genes (TAGs) in two real RNA-seq datasets: human obesity-related data with 89 replicates and Holstein milk production-related data with 21 replicates. Based on these two datasets, the performance of the proposed methods (ordinary regression and robust regression) was compared with that of two existing methods, DESeq2 and Voom. The results indicate that the proposed methods yield far fewer false discoveries than the earlier approaches based on two-group comparisons, in both our simulation study and our qRT-PCR experiment. In particular, robust regression outperforms the existing DEG-finding methods as well as ordinary regression in terms of precision. Given the current trend in RNA-seq pricing, we expect our methods to be successfully applied in various RNA-seq studies with numerous biological replicates that handle continuous response traits.
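
The contrast between ordinary and robust regression for a single gene can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not say which robust estimator was used, so a Huber M-estimator fitted by iteratively reweighted least squares is assumed here, and the trait and expression values are invented toy numbers, not data from the study.

```python
from statistics import median

def wls_line(x, y, w):
    """Weighted least-squares fit of y = a + b*x; returns (a, b)."""
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    b = sxy / sxx
    return my - b * mx, b

def huber_line(x, y, k=1.345, iters=50, tol=1e-10):
    """Huber M-estimate of y = a + b*x by iteratively reweighted least squares.

    Assumption: Huber weights with a MAD-based residual scale. The source
    does not specify the robust estimator actually used.
    """
    a, b = wls_line(x, y, [1.0] * len(x))  # start from the ordinary fit
    for _ in range(iters):
        resid = [yi - (a + b * xi) for xi, yi in zip(x, y)]
        scale = median(abs(r) for r in resid) / 0.6745  # robust scale (MAD)
        if scale == 0:
            break  # at least half the points lie exactly on the line
        # Observations with large residuals are down-weighted.
        w = [1.0 if abs(r) <= k * scale else k * scale / abs(r) for r in resid]
        a_new, b_new = wls_line(x, y, w)
        converged = abs(a_new - a) + abs(b_new - b) < tol
        a, b = a_new, b_new
        if converged:
            break
    return a, b

# Hypothetical toy data: a quantitative trait (e.g. BMI) and the expression
# of one gene. The underlying relation is expr = 1 + 0.5 * trait, and the
# last sample is an outlier.
trait = [20, 22, 24, 26, 28, 30, 32, 34]
expr = [11, 12, 13, 14, 15, 16, 17, 30]

ols_intercept, ols_slope = wls_line(trait, expr, [1.0] * len(trait))
huber_intercept, huber_slope = huber_line(trait, expr)
print("ordinary slope:", ols_slope)
print("robust slope:  ", huber_slope)
```

In this toy case the single outlier pulls the ordinary slope well away from 0.5, while the reweighting step reduces its influence on the robust fit. A real analysis would repeat the fit for every gene and attach a test statistic to each slope.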


Figure 2: Identification of trait-associated genes using robust regression adjusting for outlier effects. (a) Fitted plots of the top 10 genes with a large difference between ordinary and robust regression in the analysis of the human BMI-related RNA-seq data. (b) Fitted plots of the top 10 genes with a large difference in the bovine RNA-seq data. In (a,b), blue and red lines represent the fitted lines from ordinary and robust regression, respectively, and the standard errors are drawn in the same color as the corresponding fitted line. (c) Venn diagram comparing the TAG lists from each model for the human RNA-seq data. (d) Venn diagram comparing the lists of significantly detected genes in the bovine RNA-seq data analysis. In (c,d), a raw P-value < 0.05 is used as the cutoff for comparison among the different methods.

Mentions: In the human RNA-seq data analysis, 932 genes were detected as significantly BMI-associated (FDR-adjusted P-value < 0.1) (Supplementary file 3). Compared with ordinary linear regression, robust regression detected a relatively larger number of BMI-associated genes. To visually check the differences between the two methods, the top 10 genes (ranked by the P-value difference between the methods) were visualized as fitted plots in Fig. 2a. A small discrepancy was observed between the two methods in terms of estimated slopes and standard errors. In Fig. 2c, the numbers of significantly detected genes were compared among the three statistical models based on their raw P-values (P-value < 0.05). As illustrated, a large number of the detected genes were identified by all three methods. In addition, we observed that robust regression detects a larger number of significant genes than ordinary regression by adjusting for outlier effects.
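
The FDR-adjusted cutoff mentioned above can be illustrated with a short sketch. The text does not state which FDR procedure was used; the Benjamini-Hochberg adjustment is assumed here because it is the most common choice. The gene names and P-values are hypothetical.

```python
def bh_adjust(pvals):
    """Benjamini-Hochberg adjusted P-values, returned in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])  # indices by ascending P
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down so that adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical per-gene raw P-values from a trait-association test.
genes = ["geneA", "geneB", "geneC", "geneD", "geneE"]
raw_p = [0.001, 0.02, 0.03, 0.2, 0.8]

adj_p = bh_adjust(raw_p)
# Genes passing the FDR-adjusted cutoff of 0.1 used in the text.
significant = [g for g, q in zip(genes, adj_p) if q < 0.1]
print(significant)
```

With these toy values three of the five genes pass the adjusted cutoff of 0.1, and the same three also pass the raw cutoff of 0.05 used for the Venn-diagram comparison. On a genome-wide gene list the adjusted cutoff is usually the stricter of the two, because each raw P-value is scaled by the number of tests divided by its rank.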

