A shortcut for multiple testing on the directed acyclic graph of gene ontology.

Saunders G, Stevens JR, Isom SC - BMC Bioinformatics (2014)

Bottom Line: The large number of gene sets tested simultaneously often requires a multiplicity correction. The computational and power differences of the Short Focus Level procedure as compared to the original Focus Level procedure are demonstrated both through simulation and using real data. The Short Focus Level procedure shows a substantial increase in computation speed over the original Focus Level procedure (as much as ~15,000 times faster).


Affiliation: Utah State University, Department of Mathematics & Statistics, Logan, Utah, USA. saundersg@byui.edu.

ABSTRACT

Background: Gene set testing has become an important analysis technique in high-throughput microarray and next-generation sequencing studies for uncovering patterns of differential expression of various biological processes. Often, the large number of gene sets that are tested simultaneously requires a multiplicity correction. This work provides a substantial computational improvement, which we call the Short Focus Level, to an existing familywise error rate controlling multiplicity approach (the Focus Level method) for gene set testing in high-throughput microarray and next-generation sequencing studies using Gene Ontology graphs.
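
As a point of reference for what a familywise error rate controlling multiplicity correction does, the following is a minimal sketch of the standard Holm step-down adjustment. It is not the Focus Level or Short Focus Level method, and it ignores the Gene Ontology graph structure entirely; the function name holm_adjust and the raw p-values are hypothetical and chosen only for illustration.

```python
from typing import List


def holm_adjust(p_values: List[float]) -> List[float]:
    """Holm step-down adjusted p-values (controls the familywise error rate)."""
    m = len(p_values)
    # Indices of the p-values, ordered from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The (rank + 1)-th smallest p-value is multiplied by (m - rank), capped at 1.
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Adjusted p-values must be non-decreasing along the sorted order.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


# Hypothetical raw p-values for four gene sets (not taken from the paper).
raw = [0.001, 0.04, 0.03, 0.5]
adj = holm_adjust(raw)
# A gene set is called significant when its adjusted p-value is at most alpha.
significant = [p <= 0.05 for p in adj]
```

Only the first gene set remains significant at the 0.05 level here, although three of the four raw p-values fall below 0.05.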

Results: The Short Focus Level procedure, which performs a shortcut of the full Focus Level procedure, is achieved by extending the reach of graphical weighted Bonferroni testing to closed testing situations where restricted hypotheses are present, such as in the Gene Ontology graphs. The Short Focus Level multiplicity adjustment can perform the full top-down approach of the original Focus Level procedure, overcoming a significant disadvantage of the otherwise powerful Focus Level multiplicity adjustment. The computational and power differences of the Short Focus Level procedure as compared to the original Focus Level procedure are demonstrated both through simulation and using real data.
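
To make the closed testing idea concrete, the sketch below enumerates every intersection hypothesis for a handful of elementary hypotheses and tests each one with a weighted Bonferroni test. It is a brute-force illustration under stated assumptions, not the authors' shortcut: the weights are assumed equal within each intersection, the logical restrictions of the Gene Ontology graph are ignored, and the names term_A, term_B, term_C, weighted_bonferroni_p, and closed_testing_adjust are hypothetical.

```python
from itertools import combinations
from typing import Dict


def weighted_bonferroni_p(p: Dict[str, float], weights: Dict[str, float]) -> float:
    """P-value of a weighted Bonferroni test of one intersection hypothesis."""
    # Hypotheses with zero weight cannot contribute to a rejection.
    ratios = [p[h] / w for h, w in weights.items() if w > 0]
    return min(1.0, min(ratios))


def closed_testing_adjust(p: Dict[str, float]) -> Dict[str, float]:
    """Adjusted p-values from brute-force closed testing over all 2^n - 1 subsets."""
    names = sorted(p)
    adjusted = {h: 0.0 for h in names}
    for size in range(1, len(names) + 1):
        for subset in combinations(names, size):
            # Assumption: equal weights inside each intersection hypothesis.
            weights = {h: 1.0 / size for h in subset}
            p_subset = weighted_bonferroni_p(p, weights)
            # An elementary hypothesis is rejected only if every intersection
            # containing it is rejected, so take the maximum over those.
            for h in subset:
                adjusted[h] = max(adjusted[h], p_subset)
    return adjusted


# Hypothetical raw p-values for three elementary hypotheses.
raw = {"term_A": 0.001, "term_B": 0.03, "term_C": 0.04}
adjusted = closed_testing_adjust(raw)
```

The loop visits 2^n - 1 intersection hypotheses, which is feasible for three terms but not for hundreds. That exponential growth is the kind of cost a shortcut procedure is designed to avoid. With equal weights and no restrictions, this brute-force version reproduces the Holm adjustment.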

Conclusions: The Short Focus Level procedure shows a significant increase in computation speed over the original Focus Level procedure (as much as ~15,000 times faster). The Short Focus Level should be used in place of the Focus Level procedure whenever the logical assumptions of the Gene Ontology graph structure are appropriate for the study objectives and when either no a priori focus level of interest can be specified or the focus level is selected at a higher level of the graph, where the Focus Level procedure is computationally intractable.


Figure 10: Adjusted p-values for each of the 249 biological processes considered in the Golub example. The Focus Level (FL) method with its default focus level is compared to the Short Focus Level (SFL) method using (a) the same focus level as the FL method and (b) the root node focus level (which is computationally intractable for the FL method). Red dashed lines correspond to the familywise error rate of 0.01, and the solid black line represents the line of equality. All axes are on the log scale.

Mentions: Figure 10 compares the resulting adjusted p-values for each of the 249 biological processes considered. Figure 10a shows that, when using the same focus level, the Focus Level (FL) and Short Focus Level (SFL) methods can result in different (though largely overlapping in this case) sets of GO terms called significant. This results from the previously discussed key difference between the FL and SFL methods, namely that the SFL method allows any test (not just the Global Test) for the elementary hypotheses (the individual GO ID hypotheses) and then performs weighted Bonferroni tests for all intersection hypotheses.
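
For readers unfamiliar with the terminology, the standard textbook definitions of a weighted Bonferroni test and of closed-testing adjusted p-values can be written as follows. The notation (the index set $J$, weights $w_j(J)$, and $\tilde{p}_i$) is assumed here for illustration; the specific weights used by the SFL method are not given in this excerpt.

```latex
% Intersection hypothesis over an index set J of elementary hypotheses,
% with nonnegative weights that sum to at most 1:
H_J = \bigcap_{j \in J} H_j, \qquad w_j(J) \ge 0, \qquad \sum_{j \in J} w_j(J) \le 1 .

% Weighted Bonferroni test: reject H_J at level \alpha if, for some j in J,
p_j \le w_j(J)\,\alpha ,
% which is equivalent to the intersection p-value
p_J = \min\Bigl(1,\ \min_{j \in J,\ w_j(J) > 0} \frac{p_j}{w_j(J)}\Bigr) \le \alpha .

% Closed testing: H_i is rejected only if every H_J with i in J is rejected,
% so the adjusted p-value of H_i is
\tilde{p}_i = \max_{J \ni i} p_J .
```

Under these definitions, a GO term falls below the red dashed line in Figure 10 when $\tilde{p}_i \le 0.01$.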

