The 20 years of PROSITE.

Hulo N, Bairoch A, Bulliard V, Cerutti L, Cuche BA, de Castro E, Lachaize C, Langendijk-Genevaux PS, Sigrist CJ - Nucleic Acids Res. (2007)

Bottom Line: PROSITE consists of documentation entries describing protein domains, families and functional sites, as well as associated patterns and profiles to identify them. In this article, we describe the implementation of a new method to assign a status to pattern matches, the new PROSITE web page and a new approach to improve the specificity and sensitivity of PROSITE methods. Over the past 2 years, about 200 domains have been added, and now 53% of UniProtKB/Swiss-Prot entries (release 54.2 of 11 September 2007) have a PROSITE match.

Affiliation: Swiss Institute of Bioinformatics (SIB), Centre Medical Universitaire, Structural Biology and Bioinformatics Department, University of Geneva, 1 rue Michel Servet, CH-1211 Geneva, Switzerland.

ABSTRACT
PROSITE consists of documentation entries describing protein domains, families and functional sites, as well as associated patterns and profiles to identify them. It is complemented by ProRule, a collection of rules based on profiles and patterns, which increases the discriminatory power of profiles and patterns by providing additional information about functionally and/or structurally critical amino acids. In this article, we describe the implementation of a new method to assign a status to pattern matches, the new PROSITE web page and a new approach to improve the specificity and sensitivity of PROSITE methods. The latest version of PROSITE (release 20.19 of 11 September 2007) contains 1319 patterns, 745 profiles and 764 ProRules. Over the past 2 years, about 200 domains have been added, and now 53% of UniProtKB/Swiss-Prot entries (release 54.2 of 11 September 2007) have a PROSITE match. PROSITE is available on the web at: http://www.expasy.org/prosite/.

Figure 1: An example of a ProRule in ‘niceview’ format from the PROSITE web page, showing the different types of annotation that can be generated by ProRule. The rule PRU00298 is used to annotate proteins matched by the animal peroxidase profile (PS50292) on the ScanProsite web page. It can annotate comment lines, KW, GO terms and various FT lines. In this rule, all types of annotation are conditional. For example, the comment line ‘catalytic activity’ is generated only if the condition FTGroup(2) is fulfilled (an H in the sequence must align with position 96 of the profile and an R with position 235). The numbers in the ‘From’ and ‘To’ columns of the feature tables correspond to specific columns in the profile (for more details on the ProRule format see: ftp://ftp.expasy.org/databases/prosite/unirule.pdf).
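The positional condition described in the caption can be illustrated with a small Python sketch. It is not the ProRule implementation: the alignment encoding (uppercase for a residue aligned to a profile column, `-` for a profile column with no residue, lowercase for an inserted residue), the function names and the toy sequence are all assumptions made for illustration. Only the two required residues (H at profile position 96, R at position 235) come from the caption.

```python
from typing import Dict


def residues_by_profile_column(aligned: str, first_column: int = 1) -> Dict[int, str]:
    """Map each profile column to the residue aligned with it.

    Assumed (hypothetical) encoding of a sequence-to-profile alignment:
      uppercase letter -> residue aligned to the current profile column
      '-'              -> profile column with no residue (deletion)
      lowercase letter -> inserted residue that consumes no profile column
    """
    column = first_column
    mapping: Dict[int, str] = {}
    for ch in aligned:
        if ch == "-":
            column += 1      # column is skipped, nothing to record
        elif ch.islower():
            continue         # insertion: the column counter does not advance
        else:
            mapping[column] = ch
            column += 1
    return mapping


def group_fulfilled(mapping: Dict[int, str], required: Dict[int, str]) -> bool:
    """True only if every required profile column carries the required residue."""
    return all(mapping.get(col) == res for col, res in required.items())


# Synthetic toy alignment, not a real peroxidase sequence:
# columns 1-95 are 'A', column 96 is 'H', one inserted residue 'g',
# columns 97-234 are 'A', column 235 is 'R'.
toy_alignment = "A" * 95 + "H" + "g" + "A" * 138 + "R"
toy_mapping = residues_by_profile_column(toy_alignment)

# Condition from the caption: H at profile position 96 and R at position 235.
catalytic_condition = {96: "H", 235: "R"}
print(group_fulfilled(toy_mapping, catalytic_condition))
```

Because the check is made against profile columns rather than raw sequence offsets, the inserted residue in the toy alignment does not shift the positions being tested.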

Mentions: Since its creation, PROSITE has provided extensive documentation and detailed annotation of domains, families and functional sites. This information was mainly stored in free text and used by biologists who read the various documents and made their own decisions on the function of their protein according to the PROSITE matches. But with the rapid growth of sequence databases during the last 10 years, there was an increasing need for a reliable tool that could automatically generate precise and accurate functional annotation in a standard format. In 2005, we decided to group some functional information stored in PROSITE in a database of rules that can easily be read by a program and applied to proteins that are recognized by PROSITE profiles (6). We named this complementary database ProRule, for PROSITE Rules. ProRule generates various types of annotation in Swiss-Prot format. The main characteristic of ProRule is that it generates conditional annotation: the annotation is dependent on the presence of given amino acids at precise positions, on the occurrence of other domains or on taxonomic specificity. This information is only transferred if all the conditions are fulfilled. For example, an enzymatic active site is annotated only if the correct amino acid is found at the required position (for an example of ProRule, see Figure 1). As ProRule uses PROSITE profiles that are mainly directed against protein domains, it is well adapted to annotate modular proteins. The Swiss-Prot group has also developed a complementary database of rules (HAMAP), which uses the same format of rules but which is specific for well-conserved bacterial protein families (7).
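The all-conditions-must-hold behaviour described above can be sketched as a small data model in Python. This is an illustrative assumption, not the ProRule format or its software: the class names, field names, annotation texts, domain label and taxon label are all hypothetical. The only thing taken from the text is that an annotation may depend on residues at given positions, on other domains and on taxonomy, and is transferred only when every condition is met.

```python
from dataclasses import dataclass, field
from typing import Dict, FrozenSet, List, Optional


@dataclass
class ProteinMatch:
    """What is known about one protein matched by a profile (hypothetical model)."""
    residues_at_columns: Dict[int, str]   # profile column -> aligned residue
    domains: FrozenSet[str]               # other domains found in the protein
    lineage: FrozenSet[str]               # taxonomic lineage of the source organism


@dataclass
class ConditionalAnnotation:
    """One annotation plus the conditions that must all hold to transfer it."""
    text: str
    required_residues: Dict[int, str] = field(default_factory=dict)
    required_domains: FrozenSet[str] = frozenset()
    required_taxon: Optional[str] = None

    def applies_to(self, match: ProteinMatch) -> bool:
        residues_ok = all(
            match.residues_at_columns.get(col) == res
            for col, res in self.required_residues.items()
        )
        domains_ok = self.required_domains <= match.domains
        taxon_ok = self.required_taxon is None or self.required_taxon in match.lineage
        # The annotation is transferred only if every condition is fulfilled.
        return residues_ok and domains_ok and taxon_ok


def annotate(match: ProteinMatch, rule: List[ConditionalAnnotation]) -> List[str]:
    """Return the annotation texts whose conditions the match satisfies."""
    return [a.text for a in rule if a.applies_to(match)]


# Toy rule with made-up annotation texts and labels.
toy_rule = [
    ConditionalAnnotation("active site (toy)", required_residues={96: "H"}),
    ConditionalAnnotation("needs partner domain (toy)", required_domains=frozenset({"DOMAIN_X"})),
    ConditionalAnnotation("taxon-specific note (toy)", required_taxon="Bacteria"),
]

toy_match = ProteinMatch(
    residues_at_columns={96: "H", 235: "R"},
    domains=frozenset(),
    lineage=frozenset({"Eukaryota", "Metazoa"}),
)

print(annotate(toy_match, toy_rule))
```

Keeping each annotation paired with its own conditions means a protein can receive some annotations from a rule and not others, which matches the description of an active site being annotated only when the correct amino acid is present.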

