A domain-based approach to predict protein-protein interactions.

Singhal M, Resat H - BMC Bioinformatics (2007)

Bottom Line: Knowing which proteins exist in a certain organism or cell type and how these proteins interact with each other is necessary for understanding biological processes at the whole-cell level. Obtained domain interaction scores are then used to predict whether a pair of proteins interacts. We envision DomainGA as a first step of a multi-tier approach to constructing organism-specific PPIs.


Affiliation: Computational Biology and Bioinformatics Group, Pacific Northwest National Laboratory, Richland, WA 99352, USA. mudita.singhal@pnl.gov

ABSTRACT

Background: Knowing which proteins exist in a certain organism or cell type and how these proteins interact with each other is necessary for understanding biological processes at the whole-cell level. The determination of protein-protein interaction (PPI) networks has been the subject of extensive research. Despite the development of reasonably successful methods, serious technical difficulties still exist. In this paper we present DomainGA, a quantitative computational approach that uses information about domain-domain interactions to predict the interactions between proteins.

Results: DomainGA is a multi-parameter optimization method in which the available PPI information is used to derive a quantitative scoring scheme for the domain-domain pairs. The resulting domain interaction scores are then used to predict whether a pair of proteins interacts. Using the yeast PPI data and a series of tests, we show that the DomainGA method is robust and insensitive to the selection of the parameter sets, score ranges, and detection rules. Our DomainGA method achieves very high explanation ratios for the positive and negative PPIs in yeast. Based on our cross-verification tests on human PPIs, comparison of the optimized scores with the structurally observed domain interactions obtained from the iPFAM database, and sensitivity and specificity analysis, we conclude that our DomainGA method shows great promise to be applicable across multiple organisms.

Conclusion: We envision DomainGA as a first step of a multi-tier approach to constructing organism-specific PPIs. As it is based on fundamental structural information, the DomainGA approach can be used to create potential PPIs, and the accuracy of the constructed interaction template can be further improved using complementary methods. Explanation ratios obtained in the reported test case studies show that the false prediction rates of the template networks constructed using the DomainGA scores are reasonably low, and the erroneous predictions can be filtered further using supplementary approaches such as those based on literature search or other prediction methods.

Figure 4: Comparison of the parameter scores optimized using the 344 parameter closed set with the maximum-score (x-axis) and total-score (y-axis) detection rules. Reported scores are the averages of the GA runs after the infrequently occurring parameter values are discarded during analysis. The histogram reports the score distribution of the parameters that can be optimized in the simulations. Each (x,y) entry in this histogram plot reports the number of parameters that have mean values of x and y when the maximum- and total-score detection rules, respectively, were used in the optimization. The maximum value of the color scale is lowered from 67 to 20 to enhance the contrast between the histogram points. The yeast MIPS dataset compiled by Jansen et al. was used.
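
The counting behind a two-dimensional histogram like the one in Figure 4 can be sketched as follows. This is only an illustration under stated assumptions: the function name, the rounding of mean scores to integer bins, and the sample values are hypothetical and are not taken from the paper. Only the idea of capping the color scale (67 lowered to 20 in the caption) comes from the text.

```python
from collections import Counter


def score_histogram(max_rule_means, total_rule_means, color_cap=20):
    """Count parameters per (x, y) cell of mean optimized scores.

    x is a parameter's mean score under the maximum-score rule and
    y its mean score under the total-score rule. Rounding to integer
    bins is an assumption made for this sketch.
    """
    if len(max_rule_means) != len(total_rule_means):
        raise ValueError("both rules must score the same parameters")
    counts = Counter(
        (round(x), round(y))
        for x, y in zip(max_rule_means, total_rule_means)
    )
    # Cap the counts for display only, mirroring the lowered color scale.
    display = {cell: min(n, color_cap) for cell, n in counts.items()}
    return counts, display


# Hypothetical mean scores for three parameters (not data from the paper).
counts, display = score_histogram([1.2, 0.8, 7.1], [0.9, 1.1, 1.0], color_cap=1)
```

Cells on the diagonal (x equal to y) correspond to parameters that received similar scores under both rules; an off-diagonal cell such as (7, 1) would hold parameters whose optimized values conflict.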

Mentions: Optimizations using both of the detection rules were carried out using the closed 344 parameter set (Table 1). The parameter score range was [0–9] and a cutoff of 5 was used to classify the PPIs into positive or negative interaction categories. Parameter values obtained using the total- and the maximum-score detection rules are compared in Figure 4. As the reported two-dimensional histogram shows, the scores of the domain pairs in these two optimization studies lie close to the diagonal, suggesting that the DomainGA results are rather insensitive to the detection rule. Only a few parameters have conflicting optimized values between the two detection rule cases. These appear as a spike at the (max ~7, total ~1) point in the histogram, indicating a discrepancy between the parameter sets. We note that small differences at the low or high parameter scores are unimportant because, in the current classification scheme, values are simply grouped into three classes: non-interacting (< 5), fuzzy (~5), and interacting (> 5). Therefore, small variations in the (0:3) or (7:9) ranges are irrelevant to the derived conclusions.
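
The classification described above can be illustrated with a minimal sketch. The excerpt does not define the two detection rules, so the forms used here are assumptions: the maximum-score rule is taken to be the largest domain-pair score for a protein pair, and the total-score rule the sum of those scores. The strict comparison against the cutoff, the width of the "fuzzy" band, and all function names are likewise hypothetical. Only the cutoff of 5 and the three class labels come from the text.

```python
CUTOFF = 5  # cutoff stated in the text for the [0-9] score range


def protein_pair_score(domain_pair_scores, rule):
    """Combine domain-pair scores for one protein pair.

    Assumed rule forms (not confirmed by the excerpt):
    "maximum" takes the largest score, "total" sums all scores.
    """
    if not domain_pair_scores:
        raise ValueError("at least one domain-pair score is required")
    if rule == "maximum":
        return max(domain_pair_scores)
    if rule == "total":
        return sum(domain_pair_scores)
    raise ValueError(f"unknown detection rule: {rule!r}")


def predicts_interaction(domain_pair_scores, rule, cutoff=CUTOFF):
    """Return True if the combined score exceeds the cutoff (assumed strict)."""
    return protein_pair_score(domain_pair_scores, rule) > cutoff


def score_class(score, cutoff=CUTOFF, fuzzy_width=0.5):
    """Group one parameter score into the three classes named in the text.

    fuzzy_width is a hypothetical tolerance standing in for "~5".
    """
    if abs(score - cutoff) <= fuzzy_width:
        return "fuzzy"
    return "interacting" if score > cutoff else "non-interacting"
```

Under these assumptions, a protein pair whose domain pairs score 2 and 4 would be predicted as non-interacting by the maximum-score rule and as interacting by the total-score rule, which shows how the two rules could in principle favor different optimized parameter values. A score of 1 or 3 falls in the same non-interacting class either way, which is why variations in the (0:3) or (7:9) ranges do not change the conclusions.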

