Decomposing the space of protein quaternary structures with the interface fragment pair library.

Xie ZR, Chen J, Zhao Y, Wu Y - BMC Bioinformatics (2015)

Bottom Line: After structure-based clustering, we found that more than 90% of these interface fragment pairs can be represented by a limited number of highly abundant motifs. Our study therefore presents supporting evidence that the space of protein quaternary structures can be represented by the combination of a small set of secondary-structure-based packing patterns at binding interfaces. Finally, after future improvements such as adding sequence profiles, we expect this new library will be useful for predicting the structures of unknown protein-protein interactions.


Affiliation: Department of Systems and Computational Biology, Albert Einstein College of Medicine of Yeshiva University, 1300 Morris Park Avenue, Bronx, NY, 10461, USA. Zhong-Ru.Xie@einstein.yu.edu.

ABSTRACT

Background: The physical interactions between proteins constitute the basis of protein quaternary structures. They dominate many biological processes in living cells. Deciphering the structural features of interacting proteins is essential to understanding their cellular functions. Similar to the space of protein tertiary structures, in which discrete patterns are clearly observed at the fold or sub-fold motif level, the space of protein quaternary structures has been found to be highly degenerate due to the packing of compact secondary structure elements at interfaces. Therefore, it is necessary to further decompose the protein quaternary structural space into a more local representation.

Results: Here we constructed an interface fragment pair library from the current structure database of protein complexes. After structure-based clustering, we found that more than 90% of these interface fragment pairs can be represented by a limited number of highly abundant motifs. These motifs were further used to guide complex assembly. A large-scale benchmark test shows that native-like binding is highly likely to be found in the structural ensemble of modeled protein complexes built through the library.

Conclusions: Our study therefore presents supporting evidence that the space of protein quaternary structures can be represented by the combination of a small set of secondary-structure-based packing patterns at binding interfaces. Finally, after future improvements such as adding sequence profiles, we expect this new library will be useful for predicting the structures of unknown protein-protein interactions.

Fig4: Each fragment was assigned to one of three categories, depending on the secondary structure type of the residue at the center of the fragment. Consequently, each interface fragment pair was classified into one of six motifs. The percentage of each of these six motifs among all 459 fragment pairs in the library is plotted in (a). We further defined a preference score for each motif. A higher score for a specific motif indicates that it is more favored to form. The preference scores for all six motifs are plotted in (b). The figure suggests that fragment pair motifs are not equally distributed, but show strong preferences.

Mentions: We further investigated the distribution of protein secondary structures in the 459 representative models of interface fragment pairs. Each fragment in a pair was first assigned to one of three categories: helix (H), strand (S) or loop (L). The category of a fragment depends on the secondary structure type of the residue at the center of that fragment, as determined by the standard DSSP algorithm [34]. After categories were assigned to both fragments in a pair, the interface fragment pair was classified into one of the following six motifs: HH (a pair of two H fragments); SS (a pair of two S fragments); LL (a pair of two L fragments); HL (a pair of H and L fragments); HS (a pair of H and S fragments); SL (a pair of S and L fragments). The percentage of each of these six motifs among the 459 fragment pairs in the library is plotted in Figure 4a. The figure shows that the six motifs are not equally distributed in the library. For instance, the HH motif is more abundant than the other motifs. To study the secondary structure preference of interface fragment pairs, the observed frequency of each motif needs to be normalized by the probability of each secondary structure type at binding interfaces. Therefore, we calculated the distribution of H, S and L fragments in the chosen domain interfaces of the 3did database. The probability that an H fragment appears at a domain interface is 0.432, the probability for S fragments is 0.286, and the probability for L fragments is 0.282. We further defined a preference score for each motif. For instance, the preference score for the HS motif is calculated as ln(P(HS)/(P(H)P(S))), in which P(HS) is the probability of finding the HS motif at binding interfaces, and P(H) and P(S) are the probabilities of finding H and S fragments, respectively, at binding interfaces. A higher score for a specific motif indicates that it is more favored to form. The preference scores for all six motifs are plotted in Figure 4b. Figure 4b shows that although HH is the most abundant motif in the library, its preference score is not the highest, because H fragments have the highest probability in the database. In contrast, loops are more preferred at binding interfaces. Moreover, L fragments prefer forming heterogeneous contacts with S or H fragments. Finally, the interaction between H and S fragments is the least favored pattern.
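The motif classification and preference score described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation. The fragment probabilities (0.432, 0.286, 0.282) and the formula ln(P(XY)/(P(X)P(Y))) come from the text. Everything else is an assumption: the function names are hypothetical, each fragment is represented as a string of per-residue DSSP codes, the central residue is taken at index `len // 2`, and the mapping from DSSP's eight states to H/S/L is a common reduction that the text does not specify. Note that the DSSP code `S` denotes a bend and is therefore mapped to loop here, which differs from the S (strand) category used in the text.

```python
import math
from collections import Counter

# Background probabilities of fragment categories at domain interfaces (from the text).
P_FRAGMENT = {"H": 0.432, "S": 0.286, "L": 0.282}

# ASSUMPTION: a common 8-to-3 state reduction of DSSP codes; the text does not
# say which reduction was used. Any code not listed here is treated as loop (L).
DSSP_TO_CATEGORY = {"H": "H", "G": "H", "I": "H", "E": "S", "B": "S"}

# The six order-independent motifs named in the text.
MOTIFS = ("HH", "SS", "LL", "HL", "HS", "SL")


def fragment_category(dssp_codes):
    """Return H, S or L from the DSSP code of the fragment's central residue."""
    center = dssp_codes[len(dssp_codes) // 2]
    return DSSP_TO_CATEGORY.get(center, "L")


def motif_label(cat_a, cat_b):
    """Return the order-independent motif name, e.g. ('S', 'H') gives 'HS'."""
    pair = cat_a + cat_b
    return pair if pair in MOTIFS else cat_b + cat_a


def preference_score(p_motif, cat_a, cat_b):
    """ln(P(motif) / (P(a) * P(b))), following the formula given in the text."""
    return math.log(p_motif / (P_FRAGMENT[cat_a] * P_FRAGMENT[cat_b]))


def motif_preferences(pairs):
    """Score every motif observed in a list of (fragment, fragment) pairs."""
    counts = Counter(
        motif_label(fragment_category(a), fragment_category(b)) for a, b in pairs
    )
    total = sum(counts.values())
    return {
        m: preference_score(counts[m] / total, m[0], m[1])
        for m in MOTIFS
        if counts[m] > 0  # ln(0) is undefined, so unobserved motifs are skipped
    }


if __name__ == "__main__":
    # HYPOTHETICAL toy data: DSSP strings invented for illustration only.
    toy_pairs = [
        ("HHHHH", "HHHHH"),
        ("HHHHH", "EEEEE"),
        ("EEEEE", "HHHHH"),
        ("--T--", "EEEEE"),
    ]
    for motif, score in motif_preferences(toy_pairs).items():
        print(motif, round(score, 3))
```

A score of zero means the motif occurs exactly as often as the product of the two fragment probabilities would predict, and positive scores indicate enrichment. The sketch follows the formula as written in the text and does not add a factor of two for mixed motifs such as HS.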

