Gene Ontology annotation of sequence-specific DNA binding transcription factors: setting the stage for a large-scale curation effort.

Tripathi S, Christie KR, Balakrishnan R, Huntley R, Hill DP, Thommesen L, Blake JA, Kuiper M, Lægreid A - Database (Oxford) (2013)

Bottom Line: Existing transcription factor knowledge bases are still lacking in well-documented functional information. The completion of this task will significantly enrich Gene Ontology-based information resources for the research community. Database URL: www.tfcheckpoint.org.


Affiliation: Department of Cancer Research and Molecular Medicine, Norwegian University of Science and Technology, NTNU, N-7489 Trondheim, Norway.

ABSTRACT
Transcription factors control which information in a genome becomes transcribed to produce RNAs that function in the biological systems of cells and organisms. Reliable and comprehensive information about transcription factors is invaluable for large-scale network-based studies. However, existing transcription factor knowledge bases are still lacking in well-documented functional information. Here, we provide guidelines for a curation strategy, which constitutes a robust framework for using the controlled vocabularies defined by the Gene Ontology Consortium to annotate specific DNA binding transcription factors (DbTFs) based on experimental evidence reported in literature. Our standardized protocol and workflow for annotating specific DNA binding RNA polymerase II transcription factors is designed to document high-quality and decisive evidence from valid experimental methods. Within a collaborative biocuration effort involving the user community, we are now in the process of exhaustively annotating the full repertoire of human, mouse and rat proteins that qualify as DbTFs in as much as they are experimentally documented in the biomedical literature today. The completion of this task will significantly enrich Gene Ontology-based information resources for the research community. Database URL: www.tfcheckpoint.org.

Primary GO terms/subgraphs used for DbTF annotation. (A) GO subgraph used for sequence-specific DbTF. In this graph, sequence-specific DNA binding MF terms (yellow), sequence-specific DNA binding TF activity MF terms (green) and transcription regulation BP (blue) are shown along with the relationships between terms in the graph structure. (B) GO subgraph used for transcription factor binding transcription factors. In this graph, the different color coding represents the following: TF binding MF terms (yellow), transcription regulation BP (blue) and TF binding TF activity MF terms (green). I, P and H on top of the lines stand for relationships ‘is_a’, ‘part_of’ and ‘has_part’.
© Copyright Policy - creative-commons


Mentions: For example, nucleic acid-binding transcription factors must have nucleic acid-binding activity to function and also must regulate transcription. Thus, the MF terms for types of ‘nucleic acid binding transcription factor activity’ are required to have ‘has_part’ relationships to the appropriate MF terms for ‘nucleic acid binding’ [e.g. ‘sequence-specific DNA binding RNA polymerase II transcription factor activity’ (GO:0000981) has_part ‘RNA polymerase II regulatory region sequence-specific DNA binding’ (GO:0000977)] (see Figure 1). Equally important, MF ‘transcription factor activity’ terms [e.g. ‘sequence-specific DNA binding RNA polymerase II transcription factor activity’ (GO:0000981)] are also required to have ‘part_of’ relationships to appropriate BP terms for ‘regulation of transcription’ [e.g. ‘regulation of transcription from RNA polymerase II promoter’ (GO:0006357)], as the overall biological objective of the function of the molecule is to take part in regulating transcription. These ‘part_of’ relationships between a specific MF term and a BP term represent a previous advance in the use of relationships within the GO structure to provide more contextually dependent MF terms, e.g. when the same enzymatic activities are used in more than one process. In the course of revising the transcription section of GO, we incorporated these ‘part_of’ links from MF to BP terms to provide more complete representation of the ‘transcription factor activity’ terms, which are located within the MF aspect of GO. Examples of these ‘has_part’ and ‘part_of’ relationships for these MF terms are shown in Figure 1. Retention of a generic ‘transcription factor activity’ does not make sense in the MF ontology because from an MF viewpoint it is equivalent to an otherwise unknown MF that regulates transcription. However, the BP term ‘transcription, DNA dependent’ can be used to annotate all gene products that regulate transcription, even when the mechanism of action is not known.
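
The two relationships named in this passage can be written down as a small set of subject/relation/object triples. The following Python sketch is illustrative only: the GO identifiers, term names and the two edges are taken from the text above, while the `Term` type, the `RELATIONS` list and the `related` helper are hypothetical names introduced here, not part of any GO tool or file format. It only looks up the edges as stated and does not attempt to reproduce the inference rules applied by GO software.

```python
from typing import NamedTuple


class Term(NamedTuple):
    go_id: str
    name: str
    aspect: str  # "MF" (molecular function) or "BP" (biological process)


# Terms quoted in the text; nothing beyond these three is assumed.
TERMS = {
    "GO:0000981": Term(
        "GO:0000981",
        "sequence-specific DNA binding RNA polymerase II "
        "transcription factor activity",
        "MF",
    ),
    "GO:0000977": Term(
        "GO:0000977",
        "RNA polymerase II regulatory region sequence-specific DNA binding",
        "MF",
    ),
    "GO:0006357": Term(
        "GO:0006357",
        "regulation of transcription from RNA polymerase II promoter",
        "BP",
    ),
}

# (subject, relation, object) triples, exactly as described in the text.
RELATIONS = [
    ("GO:0000981", "has_part", "GO:0000977"),  # MF term has_part MF binding term
    ("GO:0000981", "part_of", "GO:0006357"),   # MF term part_of BP term
]


def related(go_id: str, relation: str) -> list[Term]:
    """Return the terms that go_id points to via the given relation."""
    return [TERMS[obj] for subj, rel, obj in RELATIONS
            if subj == go_id and rel == relation]


if __name__ == "__main__":
    for relation in ("has_part", "part_of"):
        for term in related("GO:0000981", relation):
            print(f"GO:0000981 {relation} {term.go_id} [{term.aspect}] {term.name}")
```

Read this way, the ‘has_part’ edge stays within the MF aspect (the transcription factor activity includes a DNA binding function), whereas the ‘part_of’ edge crosses from MF to BP (the activity contributes to regulating transcription).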

