Construction of co-complex score matrix for protein complex prediction from AP-MS data.

Xie Z, Kwoh CK, Li XL, Wu M - Bioinformatics (2011)

Bottom Line: It has just one parameter to set and its value has little effect on the results. It can be applied to different species as long as the AP-MS data are available. Despite its simplicity, it is competitive or superior in performance over many aspects when compared with the state-of-the-art predictions performed by supervised or unsupervised approaches.


Affiliation: School of Computer Science, Fudan University, Shanghai 200433, China.

ABSTRACT

Motivation: Protein complexes are of great importance for unraveling the secrets of cellular organization and function. The AP-MS technique provides an effective high-throughput screen that directly measures the co-complex relationship among multiple proteins, but its performance suffers from both false positives and false negatives. To computationally predict complexes from AP-MS data, most existing approaches either require additional knowledge from known complexes (supervised learning) or have numerous parameters to tune.

Method: In this article, we propose a novel unsupervised approach that does not rely on knowledge of existing complexes. Our method probabilistically calculates the affinity between two proteins, expressed as a co-complexed score (C2S). Specifically, C2S measures the log-likelihood ratio of two proteins being co-complexed to being drawn randomly, and we then predict protein complexes by applying a hierarchical clustering algorithm to the C2S score matrix.
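
The two-step idea in the Method paragraph (score every protein pair with a log ratio, then cluster the score matrix hierarchically) can be illustrated with a minimal sketch. The abstract does not give the actual C2S formula, so the score below is a simplified stand-in: the log of observed co-occurrence over the co-occurrence expected if the two proteins appeared independently. The function names, the `pseudocount` and `cutoff` parameters, and the toy purification data are all hypothetical.

```python
import math
from itertools import combinations


def log_ratio_scores(purifications, pseudocount=0.5):
    """Score each protein pair by log(observed / expected co-occurrence).

    ASSUMPTION: a simplified stand-in for C2S, not the paper's formula.
    """
    n = len(purifications)
    proteins = sorted(set().union(*purifications))
    freq = {p: sum(p in run for run in purifications) for p in proteins}
    scores = {}
    for a, b in combinations(proteins, 2):
        observed = sum(a in run and b in run for run in purifications)
        expected = freq[a] * freq[b] / n  # co-occurrences expected by chance
        scores[(a, b)] = math.log((observed + pseudocount) / (expected + pseudocount))
    return proteins, scores


def average_linkage(proteins, scores, cutoff=0.0):
    """Merge the two most similar clusters until no pair scores above cutoff."""
    clusters = [frozenset([p]) for p in proteins]

    def similarity(c1, c2):
        # Average pairwise score between members of the two clusters.
        values = [scores[tuple(sorted((a, b)))] for a in c1 for b in c2]
        return sum(values) / len(values)

    while len(clusters) > 1:
        best, i, j = max(
            (similarity(c1, c2), i, j)
            for (i, c1), (j, c2) in combinations(enumerate(clusters), 2)
        )
        if best <= cutoff:
            break
        merged = clusters[i] | clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    # Singletons are not reported as complexes.
    return [set(c) for c in clusters if len(c) > 1]


# Toy data: each set is the group of proteins seen in one purification.
purifications = [
    {"A", "B", "C"},
    {"A", "B", "C"},
    {"A", "B"},
    {"D", "E"},
    {"D", "E"},
    {"C", "D"},
]
proteins, scores = log_ratio_scores(purifications)
complexes = average_linkage(proteins, scores)
print(sorted(sorted(c) for c in complexes))
```

Pairs that co-purify more often than chance get a positive score and pairs that never co-purify get a negative one, so a cutoff of zero is a natural stopping point for merging in this toy setting. Real AP-MS data would also need to distinguish bait from prey, which this sketch ignores.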

Results: Compared with existing approaches, our approach is computationally efficient and easy to implement. It has just one parameter to set, and its value has little effect on the results. It can be applied to different species as long as AP-MS data are available. Despite its simplicity, it is competitive with or superior to state-of-the-art supervised and unsupervised approaches in many respects.


Figure 2: The comparison of recall with different values of ω.

Mentions: Figure 2 shows that C2S and C2S-HighConf achieve recall similar to that of the supervised methods (Hart and Pu) and clearly outperform the unsupervised method (BT-893), whose recall drops faster as ω increases.
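
To make the recall-versus-ω comparison concrete, here is a small sketch of how recall could be computed at a given threshold. This excerpt does not define ω, so the sketch assumes it is the minimum overlap score a predicted complex needs in order to count as matching a reference complex. The overlap measure used (squared intersection size over the product of the two set sizes) is a common choice but is also an assumption here, as are the function names and toy complexes.

```python
def overlap_score(a, b):
    """|A & B|^2 / (|A| * |B|): 1.0 for identical sets, 0.0 for disjoint ones."""
    shared = len(a & b)
    return shared * shared / (len(a) * len(b))


def recall_at(omega, predicted, reference):
    """Fraction of reference complexes matched by a prediction scoring >= omega."""
    matched = sum(
        any(overlap_score(ref, pred) >= omega for pred in predicted)
        for ref in reference
    )
    return matched / len(reference)


# Hypothetical reference and predicted complexes.
reference = [{"A", "B", "C"}, {"D", "E"}, {"F", "G", "H", "I"}]
predicted = [{"A", "B", "C"}, {"D", "E", "X"}, {"F", "G"}]

for omega in (0.25, 0.5, 0.6, 0.9):
    print(omega, round(recall_at(omega, predicted, reference), 3))
```

Under this definition recall can only stay flat or fall as ω grows, because a stricter threshold never adds matches. A method whose predictions overlap the reference complexes only partially will therefore lose recall faster at high ω.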

