Functionally enigmatic genes: a case study of the brain ignorome.

Pandey AK, Lu L, Wang X, Homayouni R, Williams RW - PLoS ONE (2014)

Bottom Line: Surprisingly, ignorome genes do not differ from well-studied genes in terms of connectivity in coexpression networks. The major distinguishing characteristic between these sets of genes is date of discovery, early discovery being associated with greater research momentum--a genomic bandwagon effect. In a majority of cases we have been able to extract and add significant information for these neglected genes.


Affiliation: UT Center for Integrative and Translational Genomics and Department of Anatomy and Neurobiology, University of Tennessee Health Science Center, Memphis, Tennessee, United States of America.

ABSTRACT
What proportion of genes with intense and selective expression in specific tissues, cells, or systems are still almost completely uncharacterized with respect to biological function? In what ways do these functionally enigmatic genes differ from well-studied genes? To address these two questions, we devised a computational approach that defines so-called ignoromes. As proof of principle, we extracted and analyzed a large subset of genes with intense and selective expression in brain. We find that publications associated with this set are highly skewed--the top 5% of genes absorb 70% of the relevant literature. In contrast, approximately 20% of genes have essentially no neuroscience literature. Analysis of the ignorome over the past decade demonstrates that it is stubbornly persistent, and the rapid expansion of the neuroscience literature has not had the expected effect on numbers of these genes. Surprisingly, ignorome genes do not differ from well-studied genes in terms of connectivity in coexpression networks. Nor do they differ with respect to numbers of orthologs, paralogs, or protein domains. The major distinguishing characteristic between these sets of genes is date of discovery, early discovery being associated with greater research momentum--a genomic bandwagon effect. Finally we ask to what extent massive genomic, imaging, and phenotype data sets can be used to provide high-throughput functional annotation for an entire ignorome. In a majority of cases we have been able to extract and add significant information for these neglected genes. In several cases--ELMOD1, TMEM88B, and DZANK1--we have exploited sequence polymorphisms, large phenome data sets, and reverse genetic methods to evaluate the function of ignorome genes.

Figure 5: Shrinkage of the ignorome. The x-axis represents the timeline. The solid line represents the percentage of ignorome genes in the ISE brain set. The dotted line represents the number of neuroscience-specific publications (in thousands).

Mentions: Will untargeted and semi-random community research effectively remove the ignorome in the next few years? To address this question, we calculated the rate at which the ignorome has shrunk over the past two decades. Our starting point for this analysis was 1991. At this early stage of genomics, two-thirds of our reference set of 648 ISE genes had no literature at all. This number has been reduced by 90%, and only 67 genes are still part of an absolute ignorome with no neuroscience literature and almost no literature in any area of research. While the average rate of decrease was rapid between 1991 and 2000 (−25 genes/year), the rate has been lethargic over the past five years (−6.4 genes/yr, Figure 5). This trend is surprising given the sharp increase in the rate of addition to the neuroscience literature. As a result, the number of neuroscience articles associated with the elimination of a single ignorome gene has gone up by a factor of three between 1991 and 2012 (Figure 5). The rate at which the ignorome is shrinking is approaching an asymptote, and without a focused effort to functionally annotate the ignorome, it will likely still comprise 40–50 functionally important genes for more than a decade.
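
The rates quoted above are average changes in gene count per year. As a rough illustration only, the Python sketch below computes such an average for the whole 1991 to 2012 span. It assumes the 1991 starting count can be approximated as two-thirds of 648 genes, which the text gives only in prose; the function name and the variable names are hypothetical and are not taken from the paper.

```python
def average_rate(count_start, count_end, year_start, year_end):
    """Average change in gene count per year; negative values mean shrinkage."""
    return (count_end - count_start) / (year_end - year_start)


# Assumption: "two-thirds of 648 ISE genes" is read as an exact fraction.
ISE_GENES = 648
unstudied_1991 = round(ISE_GENES * 2 / 3)  # 432 genes
unstudied_2012 = 67                        # absolute ignorome size quoted in the text

# Average rate over the full period, for comparison with the
# per-interval rates quoted in the paragraph above.
overall = average_rate(unstudied_1991, unstudied_2012, 1991, 2012)
print(f"{overall:.1f} genes/year")
```

Under these assumptions the whole-period average falls between the early rate (−25 genes/year) and the recent rate (−6.4 genes/yr), which fits the slowdown described in the text.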

