Decomposing the space of protein quaternary structures with the interface fragment pair library.

Xie ZR, Chen J, Zhao Y, Wu Y - BMC Bioinformatics (2015)

Bottom Line: After structure-based clustering, we found that more than 90% of these interface fragment pairs can be represented by a limited number of highly abundant motifs. Our study therefore presents supporting evidence that the space of protein quaternary structures can be represented by combinations of a small set of secondary-structure-based packing patterns at binding interfaces. Finally, after future improvements such as adding sequence profiles, we expect this new library to be useful for predicting the structures of unknown protein-protein interactions.


Affiliation: Department of Systems and Computational Biology, Albert Einstein College of Medicine of Yeshiva University, 1300 Morris Park Avenue, Bronx, NY, 10461, USA. Zhong-Ru.Xie@einstein.yu.edu.

ABSTRACT

Background: The physical interactions between proteins constitute the basis of protein quaternary structures. They dominate many biological processes in living cells. Deciphering the structural features of interacting proteins is essential for understanding their cellular functions. Similar to the space of protein tertiary structures, in which discrete patterns are clearly observed at the fold or sub-fold motif level, the space of protein quaternary structures has been found to be highly degenerate due to the packing of compact secondary structure elements at interfaces. Therefore, it is necessary to further decompose the protein quaternary structural space into a more local representation.

Results: Here we constructed an interface fragment pair library from the current structure database of protein complexes. After structure-based clustering, we found that more than 90% of these interface fragment pairs can be represented by a limited number of highly abundant motifs. These motifs were further used to guide complex assembly. A large-scale benchmark test shows that native-like binding is highly likely to be found in the structural ensemble of modeled protein complexes built with the library.

Conclusions: Our study therefore presents supporting evidence that the space of protein quaternary structures can be represented by combinations of a small set of secondary-structure-based packing patterns at binding interfaces. Finally, after future improvements such as adding sequence profiles, we expect this new library to be useful for predicting the structures of unknown protein-protein interactions.


Figure 7: We tested the library on a large-scale benchmark comprising a set of 176 non-redundant protein–protein complexes. For each entry in the benchmark, a large number of structural models were generated. From the resulting ensemble of complex models, we identified the model with the lowest RMSD from the structure of the native complex. The distribution of the lowest-RMSD models for all 176 entries is plotted as a histogram. The figure suggests that native-like binding is highly likely to be found in the structural ensemble of modeled protein complexes built with the library.
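
The per-entry selection and the histogram described in the caption can be sketched as follows. This is a minimal illustration, not the authors' code: the function names, the 1.0 Angstrom bin width, and the toy RMSD values are all assumptions made for the example, and only the 6.0 Angstrom cutoff comes from the text.

```python
import numpy as np

def lowest_rmsd_per_entry(rmsd_by_entry):
    """Return the smallest model RMSD for each benchmark entry."""
    return {entry: min(values) for entry, values in rmsd_by_entry.items()}

def summarize(best, bin_width=1.0, cutoff=6.0):
    """Histogram the per-entry best RMSDs and report the fraction below cutoff.

    bin_width is an assumed value; cutoff defaults to the 6.0 Angstrom
    threshold quoted in the text.
    """
    values = np.array(list(best.values()), dtype=float)
    # Bin edges from 0 up to the first multiple of bin_width >= the maximum.
    top = np.ceil(values.max() / bin_width) * bin_width
    edges = np.arange(0.0, top + bin_width, bin_width)
    counts, edges = np.histogram(values, bins=edges)
    fraction = float(np.mean(values < cutoff))
    return counts, edges, fraction

# Toy data (hypothetical): RMSDs in Angstrom of all models built for each entry.
toy = {
    "entry_a": [12.3, 4.1, 7.8],
    "entry_b": [9.0, 6.5],
    "entry_c": [3.2, 15.0],
}
best = lowest_rmsd_per_entry(toy)
counts, edges, fraction = summarize(best)
```

With real data, `rmsd_by_entry` would hold one list per benchmark complex, and `fraction` would correspond to the share of entries with a model under the cutoff.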

Mentions: To evaluate how completely the interface fragment pair library represents the space of protein quaternary structures, we applied the library to a large-scale benchmark set. We used the protein-protein docking benchmark constructed by ZLAB [32]. The most recent version of the benchmark (4.0) includes a set of 176 non-redundant protein–protein complexes. For each entry in the benchmark, we first separated the subunits from the complex. The subunits were then reassembled by aligning the corresponding fragments in their structures with each of the 459 interface fragment pairs in the library, following the algorithm introduced in the Methods. From the resulting ensemble of complex models, we identified the model with the lowest RMSD from the structure of the native complex. The distribution of these lowest-RMSD models for all 176 benchmark entries is plotted as a histogram in Figure 7. The distribution peaks at 4.0 Å. For more than 90% of the 176 entries, we can find structural models with an RMSD of less than 6.0 Å from the native complex, indicating that native binding can be reproduced with a high success rate. Our benchmark results thus suggest that the space of protein quaternary structures can be reduced to a limited number of binding modes spanned by the interface fragment pair library. Note that the purpose of this test is not a systematic comparison of docking algorithms, but to enumerate all binding modes of a complex through a fragment-based library. We therefore used the bound structures of subunits during complex assembly, rather than the unbound structures normally used in docking tests.
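
The assembly algorithm itself is described in the paper's Methods and is not reproduced here. As a hedged illustration of the general idea of placing a subunit by aligning one of its fragments onto a library fragment, the sketch below uses the standard Kabsch least-squares superposition. The choice of Kabsch, the function names, and the synthetic coordinates are assumptions for this example and may differ from what the authors did.

```python
import numpy as np

def kabsch(mobile, target):
    """Rotation R and translation t that best map mobile onto target.

    Both inputs are (N, 3) arrays of matched points. The transform is
    applied to row vectors as: mobile @ R.T + t.
    """
    mobile_center = mobile.mean(axis=0)
    target_center = target.mean(axis=0)
    P = mobile - mobile_center
    Q = target - target_center
    U, _, Vt = np.linalg.svd(P.T @ Q)
    # Guard against an improper rotation (reflection).
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = target_center - R @ mobile_center
    return R, t

def rmsd(a, b):
    """Root-mean-square deviation between two matched (N, 3) coordinate sets."""
    return float(np.sqrt(np.mean(np.sum((a - b) ** 2, axis=1))))

def place_subunit(subunit, fragment_idx, library_fragment):
    """Move a whole subunit so its fragment superimposes on a library fragment."""
    R, t = kabsch(subunit[fragment_idx], library_fragment)
    return subunit @ R.T + t

# Synthetic check (hypothetical data): rotate and shift a random "subunit",
# then recover the placement from a 5-point fragment alone.
rng = np.random.default_rng(0)
subunit = rng.normal(size=(20, 3))
angle = np.deg2rad(30.0)
Rz = np.array([[np.cos(angle), -np.sin(angle), 0.0],
               [np.sin(angle),  np.cos(angle), 0.0],
               [0.0,            0.0,           1.0]])
native = subunit @ Rz.T + np.array([1.0, 2.0, 3.0])
fragment_idx = np.arange(5)
placed = place_subunit(subunit, fragment_idx, native[fragment_idx])
placement_rmsd = rmsd(placed, native)
```

In this synthetic case the fragment is an exact rigid copy, so the placed subunit coincides with the reference. With real fragments the match is approximate, and the RMSD of the resulting complex against the native structure is what the benchmark reports.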

