Identification of protein complexes from co-immunoprecipitation data.

Geva G, Sharan R - Bioinformatics (2010)

Bottom Line: The framework aims at identifying sets of preys that significantly co-associate with the same set of baits. In application to an array of datasets from yeast, our method identifies thousands of protein complexes. Comparing these complexes to manually curated ones, we show that our method attains very high specificity and sensitivity levels (∼80%), outperforming current approaches for protein complex inference.


Affiliation: School of Computer Science, Tel Aviv University, Tel Aviv, Israel.

ABSTRACT

Motivation: Advanced technologies are producing large-scale protein-protein interaction data at an ever-increasing pace. A fundamental challenge in analyzing these data is the inference of protein machineries. Previous methods for detecting protein complexes have mainly been based on analyzing binary protein-protein interaction data, ignoring the more involved co-complex relations obtained from co-immunoprecipitation experiments.

Results: Here, we devise a novel framework for protein complex detection from co-immunoprecipitation data. The framework aims at identifying sets of preys that significantly co-associate with the same set of baits. In application to an array of datasets from yeast, our method identifies thousands of protein complexes. Comparing these complexes to manually curated ones, we show that our method attains very high specificity and sensitivity levels (∼ 80%), outperforming current approaches for protein complex inference.

Availability: Supplementary information and the program are available at http://www.cs.tau.ac.il/~roded/CODEC/main.html.


Figure 1: An example data set. (a) An input bait–prey graph. Baits are colored in blue and preys are colored in red. (b) Two possible protein complexes and their corresponding subgraphs.

Mentions: In addition, we impose a consistency requirement: some proteins occur in the data both as baits and as preys. For such proteins, we require that if a certain prey (bait) vertex is included in the subgraph, so must be the corresponding bait (prey). These definitions are exemplified in Figure 1. The example dataset contains 10 proteins marked as P1-P10 (Fig. 1a). Four purifications are made. The proteins used as baits are P3, P4, P5 and P7. There are two sets of preys that are supported by more than one bait: {P2, P3, P4, P5} and {P5, P6, P7, P8}. It can be hypothesized that these sets correspond to two protein complexes, shown in Figure 1b. In both cases, the consistency requirement is satisfied. The missing edge between P5 and P2 is a likely false negative, since both P3 and P4 interact with P2. There may be additional complexes in this toy example, but there is only weak evidence for their existence since they are detected as preys by a single bait protein.
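
The consistency requirement and the notion of bait support can be illustrated with a minimal Python sketch. It is not the authors' implementation and does not perform any significance scoring. The edge list in `purifications` is hypothetical: the text does not enumerate the edges of Figure 1a, so the edges below are an assumption chosen only to agree with the description (four baits, no P5-P2 edge, P2 pulled down by both P3 and P4). The function names `is_consistent` and `bait_support` are likewise illustrative.

```python
# Hypothetical bait -> preys map; the real edges of Figure 1a are not given
# in the text, so this edge list is an assumption for illustration only.
purifications = {
    "P3": {"P1", "P2", "P4", "P5"},
    "P4": {"P2", "P3", "P5"},
    "P5": {"P3", "P4", "P6", "P7", "P8"},  # no P2 edge: the likely false negative
    "P7": {"P5", "P6", "P8", "P9", "P10"},
}

all_baits = set(purifications)
all_preys = set().union(*purifications.values())


def is_consistent(sub_baits, sub_preys, all_baits, all_preys):
    """Check the consistency requirement for a candidate subgraph.

    A protein that occurs in the data both as a bait and as a prey must be
    included in the subgraph in both roles or in neither.
    """
    dual = all_baits & all_preys
    return (sub_baits & dual) == (sub_preys & dual)


def bait_support(prey, purifications):
    """Return the set of baits whose purification pulled down `prey`."""
    return {bait for bait, preys in purifications.items() if prey in preys}


# The two candidate complexes named in the text, with their supporting baits.
complex_1 = ({"P3", "P4", "P5"}, {"P2", "P3", "P4", "P5"})
complex_2 = ({"P5", "P7"}, {"P5", "P6", "P7", "P8"})

for baits, preys in (complex_1, complex_2):
    print(sorted(preys), is_consistent(baits, preys, all_baits, all_preys))

# P2 is supported by two baits; P9 (under the assumed edges) by only one.
print(sorted(bait_support("P2", purifications)))
print(sorted(bait_support("P9", purifications)))
```

Under these assumed edges, both candidate complexes pass the check, while a subgraph that includes P5 as a prey but omits it as a bait does not. Preys such as P9 that are pulled down by a single bait correspond to the weakly supported cases mentioned above.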

