Quantitative set analysis for gene expression: a method to quantify gene set differential expression including gene-gene correlations.

Yaari G, Bolen CR, Thakar J, Kleinstein SH - Nucleic Acids Res. (2013)

Bottom Line: Existing tests are affected by inter-gene correlations, resulting in a high Type I error. From this probability density function, P-values and confidence intervals can be extracted and post hoc analysis can be carried out while maintaining statistical traceability. QuSAGE is available as an R package, which includes the core functions for the method as well as functions to plot and visualize the results.


Affiliation: Department of Pathology, Yale University School of Medicine, New Haven, CT 06511, USA; Bioengineering Program, Faculty of Engineering, Bar Ilan University, 5290002, Ramat Gan, Israel; and Interdepartmental Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT 06511, USA.

ABSTRACT
Enrichment analysis of gene sets is a popular approach that provides a functional interpretation of genome-wide expression data. Existing tests are affected by inter-gene correlations, resulting in a high Type I error. The most widely used test, Gene Set Enrichment Analysis, relies on computationally intensive permutations of sample labels to generate a distribution that preserves gene-gene correlations. A more recent approach, CAMERA, attempts to correct for these correlations by estimating a variance inflation factor directly from the data. Although these methods generate P-values for detecting gene set activity, they are unable to produce confidence intervals or allow for post hoc comparisons. We have developed a new computational framework for Quantitative Set Analysis of Gene Expression (QuSAGE). QuSAGE accounts for inter-gene correlations, improves the estimation of the variance inflation factor and, rather than evaluating the deviation from a hypothesis with a P-value, it quantifies gene-set activity with a complete probability density function. From this probability density function, P-values and confidence intervals can be extracted and post hoc analysis can be carried out while maintaining statistical traceability. Compared with Gene Set Enrichment Analysis and CAMERA, QuSAGE exhibits better sensitivity and specificity on real data profiling the response to interferon therapy (in chronic Hepatitis C virus patients) and Influenza A virus infection. QuSAGE is available as an R package, which includes the core functions for the method as well as functions to plot and visualize the results.
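The effect of inter-gene correlation on a gene-set test can be illustrated with a minimal sketch. It assumes equally weighted, unit-variance gene scores, for which the variance of the set mean is inflated by 1 + (m - 1) * rho (m genes, average pairwise correlation rho), and it approximates the gene-set activity PDF with a normal distribution. The normal approximation, the function names and the input values are assumptions made for illustration; this is not the QuSAGE procedure or the API of the QuSAGE R package.

```python
import math
from statistics import NormalDist


def variance_inflation_factor(n_genes: int, mean_corr: float) -> float:
    """Inflation of the variance of a gene-set mean caused by correlation.

    Assumes equally weighted, unit-variance gene scores whose average
    pairwise correlation is mean_corr.
    """
    return 1.0 + (n_genes - 1) * mean_corr


def summarize_activity(mean_log_fc: float, naive_se: float, vif: float,
                       level: float = 0.95):
    """Extract a P-value and confidence interval from an activity PDF.

    The PDF is approximated by a normal distribution (an assumption of
    this sketch). naive_se is the standard error computed as if genes
    were independent; scaling the variance by vif widens the PDF.
    Returns (two-sided P-value against zero activity, (ci_low, ci_high)).
    """
    se = naive_se * math.sqrt(vif)
    pdf = NormalDist(mu=mean_log_fc, sigma=se)
    # Two-sided P-value: twice the smaller tail mass on either side of zero.
    tail = pdf.cdf(0.0)
    p_value = 2.0 * min(tail, 1.0 - tail)
    # Central interval covering `level` of the probability mass.
    alpha = 1.0 - level
    ci = (pdf.inv_cdf(alpha / 2.0), pdf.inv_cdf(1.0 - alpha / 2.0))
    return p_value, ci


if __name__ == "__main__":
    # Hypothetical inputs chosen only for illustration.
    vif = variance_inflation_factor(n_genes=50, mean_corr=0.1)
    for label, factor in (("ignoring correlation", 1.0),
                          ("with VIF", vif)):
        p_value, (low, high) = summarize_activity(0.3, 0.1, factor)
        print(f"{label}: P = {p_value:.4f}, 95% CI = ({low:.3f}, {high:.3f})")
```

With the same mean activity, the VIF-corrected PDF is wider, so the P-value is larger and the confidence interval broader than when correlations are ignored. This is the mechanism by which uncorrected tests inflate the Type I error.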


Figure 7. QuSAGE detects earlier and more significant ISG activity in symptomatic (versus asymptomatic) human subjects following influenza exposure. ISG activity was quantified at each time point using (A) QuSAGE and (B) GSEA. Color-coding indicates the P-values for detecting activity in asymptomatic (circles) and symptomatic (squares) subjects relative to pre-exposure levels. (C) ISG activity was compared directly between the asymptomatic and symptomatic subject groups using QuSAGE, GSEA and CAMERA. Color-coding indicates the P-values using the same color key as panels (A) and (B). QuSAGE and CAMERA both estimate the average activity using the same statistic, although the P-values can differ.

Mentions: QuSAGE was also used to quantify ISG activity in asymptomatic and symptomatic subjects following influenza infection. In a previous study, these genes were specifically associated with symptomatic infections (20). Briefly, 17 healthy human subjects were exposed to live influenza and classified as asymptomatic or symptomatic based on the severity of symptoms. Peripheral blood was collected at approximately 8 h intervals up to 108 h post-exposure for gene expression analysis. To compare the sensitivity of QuSAGE with existing methods (GSEA and CAMERA), we quantified the activity of ISGs at each time point relative to the pre-exposure levels (Figure 7A). As expected, all approaches generally showed stronger ISG activity in symptomatic patients. However, while the qualitative activity patterns were similar, QuSAGE was able to detect statistically significant activity (P < 0.05) at earlier time points (36 h post-exposure for QuSAGE and CAMERA versus 45 h for GSEA). The P-values produced by QuSAGE were also consistently smaller. QuSAGE was also able to detect stronger and earlier differences in ISG activity when comparing asymptomatic and symptomatic subjects directly to each other (36 versus 45 and 53 h post-exposure, for QuSAGE, CAMERA and GSEA, respectively) (Figure 7C). Although none of the approaches detected significant activity in asymptomatic subjects, the activity estimated by QuSAGE was much smoother and closer to zero compared with GSEA (compare Figure 7A and B). Thus, QuSAGE exhibits increased sensitivity compared with both GSEA and CAMERA.

