Optimizing structural modeling for a specific protein scaffold: knottins or inhibitor cystine knots.

Gracy J, Chiche L - BMC Bioinformatics (2010)

Bottom Line: This important variability is likely to arise from the highly diverse loops which connect the successive knotted cysteines. These average model deviations represent an improvement varying between 0.74 and 1.17 Å over a basic homology modeling derived from a unique template. In particular, we have shown that the accuracy of the models constructed at a low level of sequence identity can be improved by 1) a careful optimization of the modeling procedure, 2) the combination of multiple structural templates and 3) the use of conserved structural features as modeling restraints.


Affiliation: CNRS, UMR5048, Université Montpellier 1 et 2, Centre de Biochimie Structurale, 34090 Montpellier, France. Jerome.Gracy@cbs.cnrs.fr

ABSTRACT

Background: Knottins are small, diverse and stable proteins with important drug design potential. They can be classified into 30 families which cover a wide range of sequences (1621 sequenced), three-dimensional structures (155 solved) and functions (> 10). Inter-knottin similarity lies mainly between 15% and 40% sequence identity and between 1.5 and 4.5 Å backbone deviation, although all knottins share a tightly knotted disulfide core. This important variability is likely to arise from the highly diverse loops which connect the successive knotted cysteines. The prediction of structural models for all knottin sequences would open new directions for the analysis of interaction sites and provide a better understanding of the structural and functional organization of proteins sharing this scaffold.
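The sequence identity figures above can be made concrete with a small sketch. It assumes two sequences that are already aligned, with gaps written as `-`, and it assumes one common convention: identity is the percentage of identical residues among columns where neither sequence has a gap. The abstract does not state which convention was used, and the function name `percent_identity` and the toy sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over columns where both sequences have a residue.

    Assumption: the two strings are rows of the same pairwise alignment,
    so they must have equal length.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")

    compared = 0  # columns with a residue in both sequences
    matches = 0   # columns where those residues are identical
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue
        compared += 1
        if res_a == res_b:
            matches += 1

    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy aligned fragments (illustrative only, not real knottin sequences).
identity = percent_identity("GCPR-C", "GCAKLC")
print(f"{identity:.1f}% identity")
```

Other denominators (alignment length, length of the shorter sequence) give different percentages, so any comparison with the 15% to 40% range should use the same convention throughout.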

Results: We have designed an automated modeling procedure for predicting the three-dimensional structure of knottins. The different steps of the homology modeling pipeline were carefully optimized against a test set of knottins with known structures: template selection and alignment, extraction of structural constraints and model building, model evaluation and refinement. After optimization, the accuracy of predicted models was shown to lie between 1.50 and 1.96 Å from native structures at 50% and 10% maximum sequence identity levels, respectively. These average model deviations represent an improvement varying between 0.74 and 1.17 Å over a basic homology modeling derived from a unique template. A database of 1621 structural models for all known knottin sequences was generated and is freely accessible from our web server at http://knottin.cbs.cnrs.fr. Models can also be interactively constructed from any knottin sequence using the structure prediction module Knoter1D3D available from our protein analysis toolkit PAT at http://pat.cbs.cnrs.fr.
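The deviations quoted above are distances in Å between a model and the native structure. As a rough illustration, and assuming (this is not stated in the abstract) that the two structures are already superposed and that their atoms are matched one-to-one, a root-mean-square deviation can be computed as sketched below. The function name `rmsd` and the toy coordinates are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two matched coordinate lists.

    Assumptions: each list holds (x, y, z) tuples in Å, atom i of one list
    corresponds to atom i of the other, and the structures have already
    been superposed (no fitting is done here).
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty lists of equal length")

    squared_sum = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared_sum += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared_sum / len(coords_a))


# Toy example: every atom of the model is shifted by 1 Å along z.
native = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
model = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0), (2.0, 0.0, 1.0)]
print(f"RMSD = {rmsd(model, native):.2f} Å")
```

A real evaluation would first apply an optimal superposition and fix the atom selection (for example backbone atoms only); both choices change the reported value.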

Conclusions: This work explores several directions for the systematic homology modeling of a diverse family of protein sequences. In particular, we have shown that the accuracy of models constructed at a low level of sequence identity can be improved by 1) a careful optimization of the modeling procedure, 2) the combination of multiple structural templates and 3) the use of conserved structural features as modeling restraints.

Figure 1: Top, cartoon representations: 3D structures of the squash inhibitor EETI-II (left, PDB ID 2it7A) and of the α-amylase inhibitor AAI (right, PDB ID 1clvI). The two-disulfide macrocycle is shown in green and the penetrating disulfide is shown in orange. The two structures are shown in similar but not identical orientations. Bottom, sequences: selected knottins from 7 families. Families, Swiss-Prot IDs (PDB IDs for the above structures), from top to bottom: Agouti-related, ASIP_HUMAN; α-amylase inhibitor, IAAI_AMAHP (1clvI); Conotoxin1, CXO7C_CONMA; Cyclotide, CYO1_VIOOD; Serine protease inhibitor1, ITR2_ECBEL (2it7A); Spider, TOG4B_AGEAP; Virus1, Q89632_CVS. Knotted cysteines are boxed and numbered at the bottom. Roman numerals indicate the order of knotted cysteines while Arabic numerals indicate the standard numbering used in the KNOTTIN database. Cysteine connectivities are shown as thick lines on top of sequences. Sequences were aligned using the Knoter1D tool [1].

The knottin scaffold [1-3] is spread over about 30 distinct disulfide-rich miniprotein families that all share the same special disulfide knot. This knot (Figure 1) is obtained when one disulfide bridge crosses the macrocycle formed by two other disulfides and the interconnecting backbone (disulfide III-VI goes through disulfides I-IV and II-V) [1]. Knottins display a broad spectrum of biological activities, and natural members are on the pharmaceutical market or are currently undergoing clinical trials. Knottins also display remarkable chemical and proteolytic stability and, thanks to their small size, are amenable to chemical synthesis. They therefore provide an interesting structural scaffold for engineering new therapeutics and, in a sense, bridge the gap between biological macromolecules and small drug molecules [4,5]. Any such developments, however, would ideally require a proper understanding of knottin sequence-structure-function relationships, or at least the availability of large sequence and structure data sets. To this end, we set out to extend the KNOTTIN database [1] with quality 3D models of all knottin sequences.
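The knot connectivity described above (I-IV, II-V, with III-VI passing through the macrocycle) can be written down as a short sketch. It assumes a sequence containing exactly six cysteines, all of them knotted; sequences with additional cysteines would first need the knotted ones identified, which this sketch does not attempt. The function name `knotted_disulfides` and the toy sequence are hypothetical.

```python
def knotted_disulfides(sequence):
    """Return the three knot disulfides as 1-based residue position pairs.

    Assumption: the sequence has exactly six cysteines, taken in order as
    knotted cysteines I to VI. The pairing follows the text: I-IV and II-V
    form the macrocycle, and III-VI is the penetrating disulfide.
    """
    cys = [i + 1 for i, residue in enumerate(sequence) if residue == "C"]
    if len(cys) != 6:
        raise ValueError(f"expected 6 cysteines, found {len(cys)}")

    return {
        "I-IV": (cys[0], cys[3]),    # macrocycle
        "II-V": (cys[1], cys[4]),    # macrocycle
        "III-VI": (cys[2], cys[5]),  # penetrating disulfide
    }


# Toy sequence with six cysteines (illustrative only, not a real knottin).
toy_sequence = "ACAAACAACAAACACAAC"
bridges = knotted_disulfides(toy_sequence)
for name, (first, second) in bridges.items():
    print(f"{name}: Cys{first} - Cys{second}")
```

Pairs of this kind are the sort of conserved structural feature that could be passed to a modeling program as distance restraints, although the exact restraints used by the authors are not given in this excerpt.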

