Directing experimental biology: a case study in mitochondrial biogenesis.

Hibbs MA, Myers CL, Huttenhower C, Hess DC, Li K, Caudy AA, Troyanskaya OG - PLoS Comput. Biol. (2009)

Bottom Line: Here we analyze and explore the results of this study that are broadly applicable for computationalists applying gene function prediction techniques, including a new experimental comparison with 48 genes representing the genomic background. Our study leads to several conclusions that are important to consider when driving laboratory investigations using computational prediction approaches. While this study focused on a specific functional area in yeast, many of these observations may be useful in the contexts of other processes and organisms.


Affiliation: Lewis-Sigler Institute for Integrative Genomics, Princeton University, Carl Icahn Laboratory, Princeton, New Jersey, United States of America.

ABSTRACT
Computational approaches have promised to organize collections of functional genomics data into testable predictions of gene and protein involvement in biological processes and pathways. However, few such predictions have been experimentally validated on a large scale, leaving many bioinformatic methods unproven and underutilized in the biology community. Further, it remains unclear what biological concerns should be taken into account when using computational methods to drive real-world experimental efforts. To investigate these concerns and to establish the utility of computational predictions of gene function, we experimentally tested hundreds of predictions generated from an ensemble of three complementary methods for the process of mitochondrial organization and biogenesis in Saccharomyces cerevisiae. The biological data with respect to the mitochondria are presented in a companion manuscript published in PLoS Genetics (doi:10.1371/journal.pgen.1000407). Here we analyze and explore the results of this study that are broadly applicable for computationalists applying gene function prediction techniques, including a new experimental comparison with 48 genes representing the genomic background. Our study leads to several conclusions that are important to consider when driving laboratory investigations using computational prediction approaches. While most genes in yeast are already known to participate in at least one biological process, we confirm that genes with known functions can still be strong candidates for annotation of additional gene functions. We find that different analysis techniques and different underlying data can both greatly affect the types of functional predictions produced by computational methods. This diversity allows an ensemble of techniques to substantially broaden the biological scope and breadth of predictions. 
We also find that performing prediction and validation steps iteratively allows us to more completely characterize a biological area of interest. While this study focused on a specific functional area in yeast, many of these observations may be useful in the contexts of other processes and organisms.

Figure 3 (pcbi-1000322-g003): Historical progression of gene function discovery. We examined the historical context of SGD annotations to GO based on the dates of publications used to assign genes to biological processes. Here we define a “known function” as an annotation to a GO term within the GO functional slim mapping [37] for S. cerevisiae. Function annotation accelerated after the publication of the yeast genome in 1996, but annotation of multiple functions did not accelerate accordingly.

Mentions: These results are particularly striking within the historical context of the rates at which gene functions have been characterized. Since the full sequence of S. cerevisiae was published in 1996 [32], nearly 3,000 genes have had their first known function characterized, while only ∼1,700 genes have had a second function characterized (Figure 3). It remains unknown how many genes are truly involved in multiple processes, but it is clear that even if single functions were known for all yeast genes, we would still be far from a complete understanding of the complex network that supports most cellular processes. This further underscores the importance of developing approaches for fast and accurate discovery of protein function.
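
The tally behind Figure 3 (how many genes had a first, and how many a second, known function by a given year) can be sketched roughly as follows. This is only an illustration, assuming the annotations are available as (gene, GO slim term, publication year) records; the function names, the record layout, and the toy genes are hypothetical and do not reproduce the authors' actual pipeline or data.

```python
from collections import Counter, defaultdict


def year_of_nth_function(annotations, n):
    """Map each gene to the year its n-th distinct function was first published.

    `annotations` is an iterable of (gene, slim_term, year) tuples
    (a hypothetical layout chosen for this sketch).
    """
    first_seen = defaultdict(dict)  # gene -> {slim_term: earliest year}
    for gene, term, year in annotations:
        previous = first_seen[gene].get(term)
        if previous is None or year < previous:
            first_seen[gene][term] = year

    result = {}
    for gene, terms in first_seen.items():
        years = sorted(terms.values())
        if len(years) >= n:
            result[gene] = years[n - 1]
    return result


def cumulative_counts(nth_year, checkpoints):
    """Count, for each checkpoint year, the genes whose n-th function was known by then."""
    per_year = Counter(nth_year.values())
    return {
        checkpoint: sum(count for year, count in per_year.items() if year <= checkpoint)
        for checkpoint in checkpoints
    }


# Toy records (invented for illustration only).
records = [
    ("GENE_A", "term1", 1990),
    ("GENE_A", "term2", 1999),
    ("GENE_B", "term1", 1997),
    ("GENE_B", "term1", 2001),  # same term again: not a second function
    ("GENE_C", "term3", 2003),
    ("GENE_C", "term1", 2005),
]

first_function = cumulative_counts(year_of_nth_function(records, 1), [1996, 2000, 2006])
second_function = cumulative_counts(year_of_nth_function(records, 2), [1996, 2000, 2006])
```

Counting distinct slim terms per gene, rather than raw annotation records, keeps a repeated report of the same process from being mistaken for a second function.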

