OrthoList: a compendium of C. elegans genes with human orthologs.

Shaye DD, Greenwald I - PLoS ONE (2011)

Bottom Line: We performed a meta-analysis of results from four orthology prediction programs and generated a compendium, "OrthoList", containing 7,663 C. elegans protein-coding genes. We compiled OrthoList by InterPro domains and Gene Ontology annotation, making it easy to identify C. elegans orthologs of human disease genes for potential functional analysis. Moreover, we find that OrthoList provides a useful basis for annotating orthology and reveals more C. elegans orthologs of human genes in various functional groups, such as transcription factors, than previously described.


Affiliation: Howard Hughes Medical Institute, Columbia University, College of Physicians and Surgeons, New York, New York, United States of America. ds451@columbia.edu

ABSTRACT

Background: C. elegans is an important model for genetic studies relevant to human biology and disease. We sought to assess the orthology between C. elegans and human genes to understand better the relationship between their genomes and to generate a compelling list of candidates to streamline RNAi-based screens in this model.

Results: We performed a meta-analysis of results from four orthology prediction programs and generated a compendium, "OrthoList", containing 7,663 C. elegans protein-coding genes. Various assessments indicate that OrthoList has extensive coverage with low false-positive and false-negative rates. Part of this evaluation examined the conservation of components of the receptor tyrosine kinase, Notch, Wnt, TGF-β and insulin signaling pathways, and led us to update compendia of conserved C. elegans kinases, nuclear hormone receptors, F-box proteins, and transcription factors. Comparison with two published genome-wide RNAi screens indicated that virtually all of the conserved hits would have been obtained had just the OrthoList set (∼38% of the genome) been targeted. We compiled OrthoList by InterPro domains and Gene Ontology annotation, making it easy to identify C. elegans orthologs of human disease genes for potential functional analysis.

Conclusions: We anticipate that OrthoList will be of considerable utility to C. elegans researchers for streamlining RNAi screens, by focusing on genes with apparent human orthologs, thus reducing screening effort by ∼60%. Moreover, we find that OrthoList provides a useful basis for annotating orthology and reveals more C. elegans orthologs of human genes in various functional groups, such as transcription factors, than previously described.

Figure 1: Comparison of four orthology prediction programs queried for C. elegans orthologs of human proteins. This diagram is modified from VENNY (see Materials and Methods). Each program is named above the oval representing its results, along with the number of C. elegans orthologs and in-paralogs it found. The table gives an overall measure of how many genes were found by one or more programs (regardless of which one(s) found them). The numbers in the overlapping and non-overlapping areas of the Venn diagram indicate how many genes were found by overlapping or unique sets of programs. The font size of each number indicates how many programs found that set of genes: numbers for genes found by a single program are shown smallest, whereas the largest font denotes the number of genes found by all programs. The data underlying this diagram can be seen in Table S1. A measure of the similarity and divergence between programs can be found in Table S2.
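
The counts that such a four-way Venn diagram displays can be derived from the per-program gene sets. The following is a minimal sketch, not the authors' procedure: the program names (program_A to program_D) and gene identifiers are hypothetical placeholders rather than data from Table S1, and the functions are illustrative only.

```python
from collections import Counter

# Hypothetical toy input: each program maps to the set of genes it predicted.
# Neither the program names nor the gene IDs come from the paper.
predictions = {
    "program_A": {"gene1", "gene2", "gene3"},
    "program_B": {"gene2", "gene3", "gene4"},
    "program_C": {"gene3", "gene5"},
    "program_D": {"gene3", "gene4"},
}


def support_counts(predictions):
    """Map each gene to the number of programs that predicted it."""
    counts = Counter()
    for genes in predictions.values():
        counts.update(genes)  # each set contributes at most 1 per gene
    return counts


def venn_regions(predictions):
    """Count genes per exclusive Venn region.

    A region is keyed by the exact set of programs that found its genes.
    """
    membership = {}
    for program, genes in predictions.items():
        for gene in genes:
            membership.setdefault(gene, set()).add(program)
    return Counter(frozenset(programs) for programs in membership.values())


support = support_counts(predictions)
regions = venn_regions(predictions)
# Tally of genes by how many programs support them (the quantity the
# caption says is encoded by font size).
by_support = Counter(support.values())
```

Keying each region by a frozenset of program names keeps the regions mutually exclusive, so the region counts sum to the number of unique genes.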

Mentions: When assayed for C. elegans-human orthologs, the four methods analyzed yielded different but overlapping results (see Figure 1 and Tables S1, S2). Comparison of these results (see Materials and Methods) yielded a list of 7,663 unique protein-coding genes, which we call OrthoList. This list represents ∼38% of the 20,250 protein-coding genes predicted in C. elegans (WormBase referential release WS210).
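
The ∼38% figure follows directly from the two counts quoted above, and its complement is consistent with the ∼60% reduction in screening effort mentioned in the Conclusions. The sketch below checks that arithmetic and also shows one way a list of unique genes could be merged from several programs. It assumes a plain set union, which this excerpt does not confirm (the actual procedure is in Materials and Methods), and the toy program names and gene IDs are hypothetical.

```python
def merge_predictions(predictions):
    """Union of all per-program gene sets; a set keeps each gene once.

    Assumption: a simple union. The paper's actual merging rules may differ.
    """
    merged = set()
    for genes in predictions.values():
        merged |= genes
    return merged


def coverage(n_listed, n_total):
    """Fraction of the total gene count that appears in the list."""
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    return n_listed / n_total


# Hypothetical toy input, not data from the paper.
toy_predictions = {
    "program_A": {"gene1", "gene2"},
    "program_B": {"gene2", "gene3"},
}
unique_genes = merge_predictions(toy_predictions)

# Counts quoted in the text (WormBase release WS210).
ORTHOLIST_SIZE = 7_663
PROTEIN_CODING_TOTAL = 20_250

share = coverage(ORTHOLIST_SIZE, PROTEIN_CODING_TOTAL)
outside = 1 - share  # genes a screen restricted to OrthoList would skip

print(f"OrthoList share of protein-coding genes: {share:.1%}")
print(f"Genes outside OrthoList: {outside:.1%}")
```

The share comes to roughly 38% and the remainder to roughly 62%. Reading that remainder as the saving in screening effort assumes effort scales with the number of genes targeted.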

