Directing experimental biology: a case study in mitochondrial biogenesis.

Hibbs MA, Myers CL, Huttenhower C, Hess DC, Li K, Caudy AA, Troyanskaya OG - PLoS Comput. Biol. (2009)

Bottom Line: Here we analyze and explore the results of this study that are broadly applicable for computationalists applying gene function prediction techniques, including a new experimental comparison with 48 genes representing the genomic background. Our study leads to several conclusions that are important to consider when driving laboratory investigations using computational prediction approaches. While this study focused on a specific functional area in yeast, many of these observations may be useful in the contexts of other processes and organisms.


Affiliation: Lewis-Sigler Institute for Integrative Genomics, Princeton University, Carl Icahn Laboratory, Princeton, New Jersey, United States of America.

ABSTRACT
Computational approaches have promised to organize collections of functional genomics data into testable predictions of gene and protein involvement in biological processes and pathways. However, few such predictions have been experimentally validated on a large scale, leaving many bioinformatic methods unproven and underutilized in the biology community. Further, it remains unclear what biological concerns should be taken into account when using computational methods to drive real-world experimental efforts. To investigate these concerns and to establish the utility of computational predictions of gene function, we experimentally tested hundreds of predictions generated from an ensemble of three complementary methods for the process of mitochondrial organization and biogenesis in Saccharomyces cerevisiae. The biological data with respect to the mitochondria are presented in a companion manuscript published in PLoS Genetics (doi:10.1371/journal.pgen.1000407). Here we analyze and explore the results of this study that are broadly applicable for computationalists applying gene function prediction techniques, including a new experimental comparison with 48 genes representing the genomic background. Our study leads to several conclusions that are important to consider when driving laboratory investigations using computational prediction approaches. While most genes in yeast are already known to participate in at least one biological process, we confirm that genes with known functions can still be strong candidates for annotation of additional gene functions. We find that different analysis techniques and different underlying data can both greatly affect the types of functional predictions produced by computational methods. This diversity allows an ensemble of techniques to substantially broaden the biological scope and breadth of predictions. We also find that performing prediction and validation steps iteratively allows us to more completely characterize a biological area of interest. While this study focused on a specific functional area in yeast, many of these observations may be useful in the contexts of other processes and organisms.

pcbi-1000322-g002: Annotations and phenotypic results for mitochondrion organization and biogenesis. (A) The number of genes involved in ‘mitochondrial organization and biogenesis’ after each stage of this study. Our study began with the 106 genes annotated to the GO term ‘mitochondrion organization and biogenesis.’ In the first round of our iterative computational prediction and laboratory experimentation, we confirmed 123 additional genes. Of these confirmations, 40 had previously existing literature evidence for involvement in mitochondrial biogenesis, leaving 83 entirely novel discoveries from the first iteration. Based on further literature searches, we found an additional 95 genes with literature evidence for inclusion in this term (including 2 tested genes that did not exhibit a significant phenotype). During our second iteration of testing, we confirmed an additional 17 predictions. (B) The results of our petite frequency assay for genes with previous literature evidence (positive controls), our novel first iteration predictions, novel second iteration predictions, and a random selection of genes. Note that the majority of novel confirmations exhibited the more modest phenotype of “altered mitochondrial inheritance,” whereas the majority of previously known genes are “respiratory deficient,” a more extreme phenotype more easily discovered by high-throughput screens.


Mentions: Our second iteration of prediction and validation used a set of 324 genes as input to the computational methods (106 original annotations, 83 newly confirmed genes with no prior literature evidence, and 135 “under-annotated” genes). We evaluated the 52 most confident predictions that were not previously tested, and 17 (33%) were validated. While this confirmation rate is still high, the reduction suggests that we may be nearing the edge of genes that can be confidently identified using our assays (details below). Altogether, our study identified 235 new annotations to the process of mitochondrial organization and biogenesis, which more than triples the number of genes previously annotated to this area (Figure 2A). A summary of these results is shown in Figure 1B, and a full catalog of predictions and experimental results is available in Table S1. While these biological results are striking and important, they also have significant ramifications in the application of computational techniques as a whole and in their integration with experimental biology, which we discuss in detail below.
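The counts quoted above fit together arithmetically, and a short tally makes the bookkeeping explicit. The following is only a sketch: the variable names are ours, and reading the 135 “under-annotated” genes as the 40 confirmed genes with prior literature evidence plus the 95 genes found by literature search is our assumption, which the text does not state explicitly.

```python
# Gene counts as quoted in the figure caption and the paragraph above.
original_annotations = 106   # genes annotated to the GO term at the start
confirmed_round_one = 123    # predictions confirmed in the first iteration
with_prior_literature = 40   # of those, already supported by literature
literature_only = 95         # further genes found by literature search
tested_round_two = 52        # predictions evaluated in the second iteration
confirmed_round_two = 17     # predictions validated in the second iteration

# Novel discoveries from the first iteration.
novel_round_one = confirmed_round_one - with_prior_literature

# ASSUMPTION: "under-annotated" genes = confirmed genes with prior
# literature evidence + genes found only by literature search.
under_annotated = with_prior_literature + literature_only

# Input set for the second iteration of prediction.
round_two_input = original_annotations + novel_round_one + under_annotated

# New annotations across the whole study, and the resulting total.
new_annotations = confirmed_round_one + literature_only + confirmed_round_two
total_annotations = original_annotations + new_annotations

# Second-iteration confirmation rate and overall fold increase.
round_two_rate = confirmed_round_two / tested_round_two
fold_increase = total_annotations / original_annotations

print(f"novel first-iteration genes: {novel_round_one}")
print(f"second-iteration input set:  {round_two_input}")
print(f"new annotations overall:     {new_annotations}")
print(f"second-iteration rate:       {round_two_rate:.0%}")
print(f"fold increase in annotation: {fold_increase:.2f}")
```

Under this reading, the input set of 324 genes, the 235 new annotations, and the 33% second-iteration confirmation rate all follow from the figures in the caption, and the final total is a little over three times the original 106 annotations.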

