Integrating diverse biological and computational sources for reliable protein-protein interactions.

Wu M, Li X, Chua HN, Kwoh CK, Ng SK - BMC Bioinformatics (2010)

Bottom Line: We performed comprehensive experiments on two benchmark yeast PPI datasets. The experimental results showed that our proposed method can effectively eliminate false positives in detected PPIs and identify false negatives by predicting novel yet reliable PPIs. Our proposed method also performed significantly better than using any individual evidence source alone, illustrating the importance of integrating various biological and computational sources of data and evidence.


Affiliation: School of Computer Engineering, Nanyang Technological University, Singapore. wumi0002@ntu.edu.sg

ABSTRACT

Background: Protein-protein interactions (PPIs) play important roles in various cellular processes. However, the low quality of current PPI data detected from high-throughput screening techniques has diminished the potential usefulness of the data. We need to develop a method to address the high data noise and incompleteness of PPI data, namely, to filter out inaccurate protein interactions (false positives) and predict putative protein interactions (false negatives).

Results: In this paper, we proposed a novel two-step method to integrate diverse biological and computational sources of supporting evidence for reliable PPIs. The first step, interaction binning or InterBIN, groups PPIs together to more accurately estimate the likelihood (Bin-Confidence score) that the protein pairs interact for each biological or computational evidence source. The second step, interaction classification or InterCLASS, integrates the collected Bin-Confidence scores to build classifiers and identify reliable interactions.
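
The binning idea in the first step can be illustrated with a small sketch. The abstract does not specify how bins are formed or how the Bin-Confidence score is computed, so the choices here are assumptions for illustration only: pairs are sorted by a single evidence score, split into roughly equal-sized bins, and each bin's confidence is taken as the fraction of known interacting pairs among its labeled members. The function name `bin_confidence` and its parameters are hypothetical.

```python
from collections import defaultdict


def bin_confidence(scores, labels, n_bins=4):
    """Assign each protein pair the confidence of the bin it falls into.

    scores: dict mapping a protein pair to its raw score for ONE evidence source.
    labels: dict mapping some pairs to 1 (known interaction) or 0 (known non-interaction).
    Assumption: equal-frequency bins; confidence = positive fraction among labeled pairs.
    """
    ranked = sorted(scores, key=lambda pair: scores[pair])
    size = -(-len(ranked) // n_bins)  # ceiling division: pairs per bin
    bin_of = {pair: i // size for i, pair in enumerate(ranked)}

    positives = defaultdict(int)
    labeled = defaultdict(int)
    for pair, y in labels.items():
        if pair in bin_of:
            labeled[bin_of[pair]] += 1
            positives[bin_of[pair]] += y

    # Bins with no labeled pairs get None instead of a guessed value.
    return {
        pair: (positives[b] / labeled[b] if labeled[b] else None)
        for pair, b in bin_of.items()
    }


scores = {("A", "B"): 0.1, ("A", "C"): 0.2, ("B", "C"): 0.8, ("B", "D"): 0.9}
labels = {("A", "B"): 0, ("A", "C"): 0, ("B", "C"): 1, ("B", "D"): 0}
conf = bin_confidence(scores, labels, n_bins=2)
```

Repeating this per evidence source would give each pair one confidence value per source, and those values could then serve as the feature vector for the classification step.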

Conclusions: We performed comprehensive experiments on two benchmark yeast PPI datasets. The experimental results showed that our proposed method can effectively eliminate false positives in detected PPIs and identify false negatives by predicting novel yet reliable PPIs. Our proposed method also performed significantly better than using any individual evidence source alone, illustrating the importance of integrating various biological and computational sources of data and evidence.


Figure 5: The average functional similarity of top-ranked false negative candidates generated from DIP data. Protein pairs in the DIP data with at least 2 common neighbors were selected as false negative candidates, resulting in 33482 such candidates.
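
The candidate selection described in the caption can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `false_negative_candidates` is hypothetical, and it assumes that pairs already present in the interaction data are excluded, since a false negative is by definition an interaction missing from the data.

```python
from collections import defaultdict
from itertools import combinations


def false_negative_candidates(interactions, min_common=2):
    """Return non-interacting pairs sharing at least `min_common` neighbors.

    interactions: iterable of (protein, protein) pairs from the PPI data.
    Result: dict mapping a sorted pair to its number of common neighbors.
    """
    neighbors = defaultdict(set)
    for a, b in interactions:
        if a != b:  # ignore self-interactions
            neighbors[a].add(b)
            neighbors[b].add(a)
    neighbors = dict(neighbors)

    # Every two neighbors of the same protein share that protein as a common neighbor.
    common = defaultdict(int)
    for shared in neighbors.values():
        for u, v in combinations(sorted(shared), 2):
            if v not in neighbors[u]:  # assumption: skip pairs already known to interact
                common[(u, v)] += 1

    return {pair: n for pair, n in common.items() if n >= min_common}


ppi = [("A", "C"), ("B", "C"), ("A", "D"), ("B", "D"), ("A", "E")]
candidates = false_negative_candidates(ppi)
```

In this toy network, A and B share the neighbors C and D, and C and D share the neighbors A and B, so both pairs qualify as candidates.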

Mentions: We applied our integrative method, which is learned from existing reliable interactions and non-interactions, to predict false negatives and false positives. The quality of the predicted false negatives and false positives is evaluated and validated by the functional similarity scores between the protein partners. Figures 3, 4 and 5 show the average similarity scores for top-ranked interactions (or predicted interactions). A point (x, y) in these figures means that the top-x interactions (or predicted interactions) have an average similarity of y.
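
The meaning of a point (x, y) on these curves can be made concrete with a short sketch. It assumes the functional similarity scores are already listed in ranked order (best-ranked pair first); the function name `top_x_average` is hypothetical.

```python
from itertools import accumulate


def top_x_average(ranked_similarities):
    """Return the points (x, y): y is the mean similarity of the top-x ranked pairs."""
    running_totals = accumulate(ranked_similarities)
    return [(x, total / x) for x, total in enumerate(running_totals, start=1)]


curve = top_x_average([1.0, 0.5, 0.0])
```

A curve that stays high as x grows indicates that the ranking places functionally similar pairs near the top.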

