Determinants of protein function revealed by combinatorial entropy optimization.

Reva B, Antipin Y, Sander C - Genome Biol. (2007)

Bottom Line: Specificity residues are conserved within a subfamily but differ between subfamilies, and they typically encode functional diversity. We obtain good agreement between predicted specificity residues and experimentally known functional residues in protein interfaces. Such predicted functional determinants are useful for interpreting the functional consequences of mutations in natural evolution and disease.

Affiliation: Computational Biology Center, Memorial Sloan-Kettering Cancer Center, 1275 York Avenue, New York, NY 10065, USA. borisr@mskcc.org

ABSTRACT
We use a new algorithm (combinatorial entropy optimization [CEO]) to identify specificity residues and functional subfamilies in sets of proteins related by evolution. Specificity residues are conserved within a subfamily but differ between subfamilies, and they typically encode functional diversity. We obtain good agreement between predicted specificity residues and experimentally known functional residues in protein interfaces. Such predicted functional determinants are useful for interpreting the functional consequences of mutations in natural evolution and disease.
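
The idea that specificity residues are conserved within a subfamily but differ between subfamilies can be illustrated with a small scoring sketch. This is not the authors' CEO implementation: it assumes the subfamilies are already given (CEO also searches for them), and it assumes a per-column combinatorial entropy of the form ln(N! / prod_a N_a!). The function names and toy sequences are hypothetical.

```python
from collections import Counter
from math import lgamma


def log_factorial(n: int) -> float:
    """Return ln(n!) via the log-gamma function."""
    return lgamma(n + 1)


def column_entropy(residues: list[str]) -> float:
    """Assumed combinatorial entropy ln(N! / prod_a N_a!) of one column."""
    counts = Counter(residues)
    return log_factorial(len(residues)) - sum(
        log_factorial(c) for c in counts.values()
    )


def specificity_scores(subfamilies: dict[str, list[str]]) -> list[float]:
    """Score each alignment column as pooled entropy minus within-subfamily entropy.

    A column scores high when residues are uniform inside each subfamily
    but differ between subfamilies; a fully conserved column scores 0.
    """
    pooled = [seq for seqs in subfamilies.values() for seq in seqs]
    n_cols = len(pooled[0])
    scores = []
    for i in range(n_cols):
        within = sum(
            column_entropy([seq[i] for seq in seqs])
            for seqs in subfamilies.values()
        )
        total = column_entropy([seq[i] for seq in pooled])
        scores.append(total - within)
    return scores


# Toy aligned sequences (hypothetical, three columns each).
toy = {
    "ras": ["DGK", "DGR", "DGK"],
    "rho": ["AGK", "AGR", "AGR"],
}
for column, score in enumerate(specificity_scores(toy)):
    print(column, round(score, 3))
```

In the toy data, the first column (D in one group, A in the other) receives the highest score, the invariant second column scores zero, and the third column, which varies inside both groups, scores in between.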

Figure 2: Typical results and predictive power of the CEO method illustrated in the family of small GTPases (G-domains). The analysis used 126 distinct human sequences of the Ras superfamily of GTPase domains, obtained after removing redundant identical copies and gappy sequences (>30% gaps relative to RasH) from the 284 protein domain sequences in the PFAM Protein Family Database (version 20), which includes ras, rab, and rho subfamilies. (a) Alignments of 22 specificity residues (numbered as in RasH) in the two largest subfamilies, ras and rho; these residues (out of a total of about 190) carry most of the information for the distinction between functional subfamilies; note the conservation of residue type within each subfamily and nonconservation between subfamilies. (b) Presence of the computed specificity residues in known molecular interfaces (marked '#') of three GTPases (RasH, RhoA, and CDC42). Seventeen of the 22 specificity residues are in these interfaces (yellow numbers). Nine of the specificity residues are in the functionally important switch I (magenta numbers) and switch II (orange numbers) regions, which are involved in sensing and/or communicating the differences between the GTP and GDP states. CEO, combinatorial entropy optimization.
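
The preprocessing described in the caption (dropping identical copies and sequences with more than 30% gaps relative to a reference) might look like the following minimal sketch. It assumes that "gaps relative to RasH" means gap characters in columns where the reference sequence has a residue; the function names and toy alignment are hypothetical and not taken from the paper.

```python
def gap_fraction(seq: str, ref: str) -> float:
    """Fraction of the reference's residue columns where seq has a gap."""
    ref_columns = [i for i, residue in enumerate(ref) if residue != "-"]
    gaps = sum(1 for i in ref_columns if seq[i] == "-")
    return gaps / len(ref_columns)


def filter_alignment(
    aligned: dict[str, str], ref_id: str, max_gap: float = 0.30
) -> dict[str, str]:
    """Keep the first copy of each distinct sequence that is not too gappy."""
    ref = aligned[ref_id]
    seen: set[str] = set()
    kept: dict[str, str] = {}
    for name, seq in aligned.items():
        if seq in seen:
            continue  # redundant identical copy
        if gap_fraction(seq, ref) > max_gap:
            continue  # too many gaps relative to the reference
        seen.add(seq)
        kept[name] = seq
    return kept


# Toy alignment (hypothetical sequences, ten columns each).
toy_alignment = {
    "ref": "ACDEFGHIKL",
    "copy": "ACDEFGHIKL",   # identical to ref, dropped
    "gappy": "A----GHIKL",  # 40% gaps, dropped
    "ok": "AC--FGHIKL",     # 20% gaps, kept
}
print(list(filter_alignment(toy_alignment, "ref")))
```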

Mentions: Our analysis of 126 unique human sequences in the Protein Families (PFAM) Ras family defines 18 subfamilies, with 2 to 15 proteins per subfamily, and 22 specificity residues that optimally discriminate between these subfamilies (Figure 2). Remarkably, a relatively small number of residues (22 out of about 200) captures the essence of subfamily discrimination, presumably as a result of functional fine-tuning of interaction sites in evolution. For example (Figure 2), the following residues are characteristic of the ras/rho discrimination (amino acid numbers as in ras): D33A, E37F, S65D, A66R, D69P, and Q70L.
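
The residue notation above can be unpacked programmatically. The sketch below assumes each token reads as "ras residue, ras position, rho residue" (so D33A would mean D in ras and A in rho at position 33); that reading, the `Substitution` type, and the function name are illustrative assumptions rather than definitions from the paper.

```python
import re
from typing import NamedTuple


class Substitution(NamedTuple):
    ras_residue: str   # one-letter amino acid code assumed for the ras subfamily
    position: int      # residue number, as in ras
    rho_residue: str   # one-letter amino acid code assumed for the rho subfamily


TOKEN = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_substitution(token: str) -> Substitution:
    """Split a token such as 'D33A' into its residue letters and position."""
    match = TOKEN.match(token)
    if match is None:
        raise ValueError(f"not a substitution token: {token!r}")
    first, position, second = match.groups()
    return Substitution(first, int(position), second)


# The six ras/rho examples quoted in the text.
tokens = ["D33A", "E37F", "S65D", "A66R", "D69P", "Q70L"]
for sub in map(parse_substitution, tokens):
    print(sub.position, sub.ras_residue, "->", sub.rho_residue)
```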

