Inferring physical protein contacts from large-scale purification data of protein complexes.

Schelhorn SE, Mestre J, Albrecht M, Zotenko E - Mol. Cell Proteomics (2011)

Bottom Line: Our results show that raw purification data can indeed be exploited to determine high-confidence physical protein contacts within protein complexes. In contrast to previous findings, we observe that physical contacts inferred from purification experiments of protein complexes can be qualitatively comparable to binary protein interactions measured by experimental high-throughput assays such as yeast two-hybrid. This suggests that computationally derived physical contacts might complement binary protein interaction assays and guide large-scale interactome mapping projects by prioritizing putative physical contacts for further experimental screens.

Affiliation: Max Planck Institute for Informatics, Saarbrücken, Germany. sven@mpi-inf.mpg.de

ABSTRACT
Recent large-scale data sets of protein complex purifications have provided unprecedented insights into the organization of cellular protein complexes. Several computational methods have been developed to detect co-complexed proteins in these data sets. Their common aim is the identification of biologically relevant protein complexes. However, much less is known about the network of direct physical protein contacts within the detected protein complexes. Therefore, our work investigates whether direct physical contacts can be computationally derived by combining raw data of large-scale protein complex purifications. We assess four established scoring schemes and introduce a new scoring approach that is specifically devised to infer direct physical protein contacts from protein complex purifications. The physical contacts identified by the five methods are comprehensively benchmarked against different reference sets that provide evidence for true physical contacts. Our results show that raw purification data can indeed be exploited to determine high-confidence physical protein contacts within protein complexes. In particular, our new method outperforms competing approaches at discovering physical contacts involving proteins that have been screened multiple times in purification experiments. It also excels in the analysis of recent protein purification screens of molecular chaperones and protein kinases. In contrast to previous findings, we observe that physical contacts inferred from purification experiments of protein complexes can be qualitatively comparable to binary protein interactions measured by experimental high-throughput assays such as yeast two-hybrid. This suggests that computationally derived physical contacts might complement binary protein interaction assays and guide large-scale interactome mapping projects by prioritizing putative physical contacts for further experimental screens.

Figure 2: Assessment of inferred physical protein contacts by five scoring methods against binary experimental reference sets that provide direct evidence for physical contacts. Inferred physical contacts are ranked by scores of the corresponding scoring method. For each reference set, performance of all methods is measured by plotting the number of top-ranking inferred physical contacts of a method against the number of these contacts that are confirmed by the reference set.
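
The ranking-based assessment described in the caption can be illustrated with a minimal sketch. It is not the authors' code: the function name `confirmed_curve`, the toy protein labels, and the assumptions that a higher score means higher confidence and that contacts are undirected pairs are all introduced here for illustration.

```python
from itertools import accumulate


def confirmed_curve(scored_contacts, reference):
    """Cumulative number of reference-confirmed contacts at each rank.

    scored_contacts: iterable of ((protein_a, protein_b), score) tuples
                     (hypothetical input layout; higher score = more confident).
    reference:       iterable of (protein_a, protein_b) pairs taken as true contacts.
    """
    # Treat contacts as undirected, so (A, B) and (B, A) are the same pair.
    ref = {frozenset(pair) for pair in reference}
    # Rank inferred contacts from highest to lowest score.
    ranked = sorted(scored_contacts, key=lambda item: item[1], reverse=True)
    hits = [1 if frozenset(pair) in ref else 0 for pair, _ in ranked]
    # Entry k-1 holds the number of confirmed contacts among the top k.
    return list(accumulate(hits))


# Toy data, invented for this example only.
scored = [(("A", "B"), 0.9), (("B", "C"), 0.7), (("A", "C"), 0.4), (("C", "D"), 0.2)]
reference = [("B", "A"), ("C", "D")]
curve = confirmed_curve(scored, reference)
```

Plotting rank (1, 2, 3, ...) against `curve` gives one line per scoring method and reference set; a curve that flattens indicates that lower-ranked contacts are rarely confirmed.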

Mentions: Fig. 2 shows how well the scoring methods perform in identifying true physical contacts from the reference sets. Note that although all methods are able to infer several physical contacts beyond the depicted 10,000 ranks, physical contacts at these high cutoffs have only very low confidence and are thus omitted. Notably, whereas SA and ISA methods have comparable performance across assessments, both of them outperform other approaches on all three reference sets. Moreover, the performance of all approaches starts to level off at about 3,000 to 4,000 ranks. We hypothesize that this number constitutes a reasonable limit on the number of physical contacts that can be reliably inferred given the available experimental data. This number of reliably inferable contacts also corresponds roughly to the number of direct binary interactions measured by high-throughput experimental techniques: the Y2H data set and the PCA data contain 2,930 and 2,616 interactions, respectively. As can be derived from Fig. 2A, Y2H and PCA data sets are less enriched in manually curated BGS interactions than an equivalent number of top-scoring interactions extracted from purification data. This suggests that physical contacts inferred by purification scoring schemes are at least qualitatively comparable and often superior to Y2H and PCA experimental data sets.
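
The equal-size comparison in the last two sentences can be sketched as follows. This is an illustration under stated assumptions, not the procedure used in the paper: the helper names (`normalize`, `confirmed_fraction`, `compare_at_equal_size`) and the toy data are hypothetical, contacts are treated as undirected pairs, and a higher score is assumed to mean higher confidence.

```python
def normalize(pairs):
    """Represent undirected contacts as a set of frozensets."""
    return {frozenset(pair) for pair in pairs}


def confirmed_fraction(contacts, curated):
    """Fraction of the given contacts that also appear in the curated set."""
    contacts = normalize(contacts)
    if not contacts:
        return 0.0
    return len(contacts & normalize(curated)) / len(contacts)


def compare_at_equal_size(scored_contacts, assay_contacts, curated):
    """Compare an assay data set with equally many top-scoring inferred contacts.

    Returns (fraction confirmed among top-N inferred contacts,
             fraction confirmed among the assay contacts),
    where N is the number of distinct contacts in the assay data set.
    """
    assay = normalize(assay_contacts)
    ranked = sorted(scored_contacts, key=lambda item: item[1], reverse=True)
    top = [pair for pair, _ in ranked[:len(assay)]]
    return confirmed_fraction(top, curated), confirmed_fraction(assay, curated)


# Toy data, invented for this example only.
scored = [(("A", "B"), 0.9), (("B", "C"), 0.8), (("C", "D"), 0.3)]
assay = [("A", "D"), ("B", "C")]
curated = [("A", "B"), ("B", "C")]
inferred_frac, assay_frac = compare_at_equal_size(scored, assay, curated)
```

With real data, `assay` would hold the Y2H or PCA interactions and `curated` the manually curated reference set; a higher first value than second corresponds to the enrichment pattern the text reports for Fig. 2A.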

