Gene Ontology annotation of sequence-specific DNA binding transcription factors: setting the stage for a large-scale curation effort.

Tripathi S, Christie KR, Balakrishnan R, Huntley R, Hill DP, Thommesen L, Blake JA, Kuiper M, Lægreid A - Database (Oxford) (2013)

Bottom Line: However, existing transcription factor knowledge bases are still lacking in well-documented functional information. The completion of this task will significantly enrich Gene Ontology-based information resources for the research community. Database URL: www.tfcheckpoint.org.

Affiliation: Department of Cancer Research and Molecular Medicine, Norwegian University of Science and Technology, NTNU, N-7489 Trondheim, Norway.

ABSTRACT
Transcription factors control which information in a genome becomes transcribed to produce RNAs that function in the biological systems of cells and organisms. Reliable and comprehensive information about transcription factors is invaluable for large-scale network-based studies. However, existing transcription factor knowledge bases are still lacking in well-documented functional information. Here, we provide guidelines for a curation strategy, which constitutes a robust framework for using the controlled vocabularies defined by the Gene Ontology Consortium to annotate specific DNA binding transcription factors (DbTFs) based on experimental evidence reported in the literature. Our standardized protocol and workflow for annotating specific DNA binding RNA polymerase II transcription factors are designed to document high-quality and decisive evidence from valid experimental methods. Within a collaborative biocuration effort involving the user community, we are now in the process of exhaustively annotating the full repertoire of human, mouse and rat proteins that qualify as DbTFs inasmuch as they are experimentally documented in the biomedical literature today. The completion of this task will significantly enrich Gene Ontology-based information resources for the research community. Database URL: www.tfcheckpoint.org.

Figure 3 (bat062-F3): UniProt-GOA screenshot of some of the DbTF annotations. The annotations generated using the DbTF curation guidelines discussed here can be accessed from the GO database using the QuickGO tool.

Mentions: The annotation workflow is depicted in Figure 2. An annotation effort typically starts with one of the scientific papers suggested in databases such as TFCat and JASPAR to document a candidate DbTF, or by searching for adequate literature in one of the following resources: UniProt (http://www.uniprot.org/), NCBI’s Entrez Gene (25), iHOP (26), GeneCards (27) or NCBI’s PubMed (http://www.ncbi.nlm.nih.gov/pubmed/). Each scientific paper is first checked for information that correctly identifies the species of origin of the TF studied. Because we are focusing on DbTFs from human, mouse and rat studies, only papers allowing identification of a DbTF from one of these species proceed to further curation. Thus, a number of papers that fail to clearly identify the species of the gene(s) used in their construct(s) have to be omitted from the curation process. Then, the paper is searched for adequate experimental evidence to support one or several DbTF annotations. If either the TF species of origin or sufficient experimental evidence cannot be identified, the curator returns to the scientific literature corpus to search for other suitable papers. When both criteria are fulfilled, the individual GO annotations (i.e. specific DNA binding and/or TF binding and transcription regulation) are assigned together with a supporting evidence code. Finally, the composite TF activity MF GO term(s) is inferred. TF annotation data are submitted to UniProt-GOA in the form of a gene association file (GAF2.0; http://www.geneontology.org/GO.format.gaf-2_0.shtml) and subsequently appear in the GOC database via tools such as AmiGO (http://amigo.geneontology.org/) and QuickGO (http://www.ebi.ac.uk/QuickGO/; Figure 3).
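
The two checks in this workflow (species of origin, then experimental evidence) and the final GAF 2.0 submission step can be illustrated with a minimal Python sketch. It is not the authors' tooling: the Paper class, the function names, the accession "P00000", the symbol "TF1", the reference "PMID:0000000" and the "EXAMPLE" assigner are all hypothetical placeholders, and the step that infers the composite TF activity term is not modelled. The sketch assumes only the 17 tab-separated columns of the GAF 2.0 format, the NCBI taxon identifiers for human, mouse and rat, and the GO term GO:0043565 (sequence-specific DNA binding) with evidence code IDA.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

# NCBI taxonomy identifiers for the three species in scope.
TARGET_TAXA = {"human": 9606, "mouse": 10090, "rat": 10116}
GAF_HEADER = "!gaf-version: 2.0"


@dataclass
class Paper:
    """Hypothetical record of what a curator extracted from one paper."""
    reference: str                      # e.g. "PMID:..." (placeholder below)
    species: Optional[str]              # None if the paper does not identify it
    # (GO ID, evidence code) pairs supported by experiments in the paper.
    evidence: List[Tuple[str, str]] = field(default_factory=list)


def passes_triage(paper: Paper) -> bool:
    """Both criteria must hold: an in-scope species and some evidence."""
    return paper.species in TARGET_TAXA and len(paper.evidence) > 0


def gaf_line(object_id: str, symbol: str, go_id: str, reference: str,
             evidence_code: str, taxon_id: int, date: str,
             assigned_by: str) -> str:
    """Build one GAF 2.0 row (17 tab-separated columns)."""
    columns = [
        "UniProtKB",          # 1  DB
        object_id,            # 2  DB Object ID
        symbol,               # 3  DB Object Symbol
        "",                   # 4  Qualifier (optional)
        go_id,                # 5  GO ID
        reference,            # 6  DB:Reference
        evidence_code,        # 7  Evidence Code
        "",                   # 8  With/From (optional)
        "F",                  # 9  Aspect; "F" = molecular function (assumed here)
        "",                   # 10 DB Object Name (optional)
        "",                   # 11 DB Object Synonym (optional)
        "protein",            # 12 DB Object Type
        f"taxon:{taxon_id}",  # 13 Taxon
        date,                 # 14 Date, YYYYMMDD
        assigned_by,          # 15 Assigned By
        "",                   # 16 Annotation Extension (optional)
        "",                   # 17 Gene Product Form ID (optional)
    ]
    return "\t".join(columns)


def annotate(paper: Paper, object_id: str, symbol: str,
             date: str, assigned_by: str) -> List[str]:
    """Return GAF rows for a paper, or an empty list if triage fails."""
    if not passes_triage(paper):
        return []  # the curator would go back to the literature here
    taxon_id = TARGET_TAXA[paper.species]
    return [
        gaf_line(object_id, symbol, go_id, paper.reference,
                 code, taxon_id, date, assigned_by)
        for go_id, code in paper.evidence
    ]


# Placeholder values only; none of these identify a real protein or paper.
example = Paper("PMID:0000000", "human", [("GO:0043565", "IDA")])
lines = annotate(example, "P00000", "TF1", "20130101", "EXAMPLE")
print(GAF_HEADER)
print("\n".join(lines))
```

Returning an empty list for a paper that fails either check mirrors the loop in the workflow, where the curator discards the paper and searches for another rather than recording a partial annotation.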

