Improving the prediction of yeast protein function using weighted protein-protein interactions.

Ahmed KS, Saloma NH, Kadah YM - Theor Biol Med Model (2011)

Bottom Line: The present study provides a weighting strategy for PPI to improve the prediction of protein functions. A new technique to weight interactions in the yeast proteome is presented. Experimental results on yeast proteins demonstrated that weighting interactions, integrated with the neighbour counting method, improved the sensitivity and specificity of prediction for two functional categories: cellular role and cell location.


Affiliation: Department of Bio-electronics, MTI, El-Haddaba Elwosta, Cairo, Egypt.

ABSTRACT

Background: Bioinformatics can be used to predict protein function, leading to an understanding of cellular activities, and equally-weighted protein-protein interactions (PPI) are normally used to predict such protein functions. The present study provides a weighting strategy for PPI to improve the prediction of protein functions. The weights are dependent on the local and global network topologies and the number of experimental verification methods. The proposed methods were applied to the yeast proteome and integrated with the neighbour counting method to predict the functions of unknown proteins.

Results: A new technique for weighting interactions in the yeast proteome is presented. The weights are related to the local and global network topology and to the number of experimental methods that verified each interaction. The results showed improved sensitivity and specificity of prediction for cellular role and cellular location. The new weights were compared with a method that gives every interaction the same weight and were shown to be superior.

Conclusions: A new method for weighting the interactions in protein-protein interaction networks is presented. Experimental results on yeast proteins demonstrated that weighting interactions, integrated with the neighbour counting method, improved the sensitivity and specificity of prediction for two functional categories: cellular role and cell location.

© Copyright Policy - open-access

Figure 2: Cell location function sensitivity and specificity. Sensitivity and specificity of the six weighting schemes (un-weighted and five weights) for the cell location function category, for up to five interactions (k = 5).

Mentions: The proposed approach was applied to infer the functions of un-annotated yeast proteins using weighted interactions rather than equally weighted ones. In YPD, proteins are assigned functions under three categories: "Biochemical function", "Subcellular location" and "Cellular role". The numbers of annotated and un-annotated proteins in each of the three categories are presented in Table 1. The accuracy of the predictions was measured by the leave-one-out method: each annotated protein with at least one annotated interaction partner was treated as un-annotated, its functions were predicted using the weighted neighbour counting method, and the predictions were compared with the protein's actual annotations. Repeating the leave-one-out experiment for all such proteins allowed the specificity (SP) and sensitivity (SN) to be computed [22]. The corresponding numbers of overlapped proteins for "Biochemical function", "Subcellular location" and "Cellular role" were 1145, 1129 and 1407, respectively. Figures 1, 2 and 3 plot sensitivity against specificity for biochemical function, cell location and cellular role, respectively. With the neighbour counting method, a fixed number of the highest-frequency functions can be compared. In the present study, although a single data set is used, k (the number of interactions) was varied from 2 to 5. Figures 1a-d show the specificity and sensitivity for biochemical function when k equals 2, 3, 4 and 5. For biochemical function (Figure 1), the sensitivity of the proposed algorithm is higher when specificity is low. At higher specificity, however, the un-weighted technique (W0) has good sensitivity. The established un-weighted technique is therefore sufficient for predicting biochemical function.
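
The leave-one-out procedure described above can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation. The function names, the `top_n` cut-off and the toy network are hypothetical. The source defers the definitions of SN and SP to reference [22]; the ratios used here (overlap divided by the number of known functions for SN, overlap divided by the number of predicted functions for SP) are an assumption.

```python
from collections import defaultdict


def predict_functions(protein, neighbours, annotations, top_n=3):
    """Weighted neighbour counting: score each function by the summed
    weights of the annotated partners that carry it, keep the top_n."""
    scores = defaultdict(float)
    for partner, weight in neighbours.get(protein, {}).items():
        if partner == protein:
            continue  # ignore self-interactions so the held-out label is unused
        for function in annotations.get(partner, ()):
            scores[function] += weight
    # Sort by descending score, then by name so that ties are deterministic.
    ranked = sorted(scores, key=lambda f: (-scores[f], f))
    return set(ranked[:top_n])


def leave_one_out(neighbours, annotations, top_n=3):
    """Return (SN, SP) over every annotated protein that has at least one
    annotated interaction partner (assumed definitions, see lead-in)."""
    known = predicted = overlap = 0
    for protein, true_functions in annotations.items():
        partners = neighbours.get(protein, {})
        if not any(p != protein and p in annotations for p in partners):
            continue
        guess = predict_functions(protein, neighbours, annotations, top_n)
        known += len(true_functions)
        predicted += len(guess)
        overlap += len(guess & set(true_functions))
    sn = overlap / known if known else 0.0
    sp = overlap / predicted if predicted else 0.0
    return sn, sp


# Hypothetical toy network: neighbours[a][b] is the weight of interaction a-b.
toy_neighbours = {
    "A": {"B": 1.0, "C": 0.5},
    "B": {"A": 1.0},
    "C": {"A": 0.5},
}
toy_annotations = {"A": {"nucleus"}, "B": {"nucleus"}, "C": {"cytoplasm"}}
sn, sp = leave_one_out(toy_neighbours, toy_annotations, top_n=1)
```

On this toy network both SN and SP come out as 2/3: A and B are predicted correctly, while C inherits "nucleus" from its only partner. Setting every weight to 1.0 gives the un-weighted case (W0), so the same function can be used to compare weighting schemes.
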
As shown in Figures 2 and 3, the sensitivity and specificity for all five proposed weights (W1-W5) were higher than for W0 at every value of k. In the cell location category, W2 (the weight relating to IG1) was the best weight when the number of interactions for each protein was two, while W3 (the weight for IG2), W1 (the weight for the number of experimental methods) and W5 (PCA of the three basic weights W1, W2 and W3) were the best weights when the number of interactions for each protein was 3, 4 and 5, respectively. In the cellular role category, W2 was the best weight when the number of interactions was two, and W3 was the best weight when the number of interactions was 3, 4 or 5. Some of the weight curves overlap, with only small variation between the corresponding weights.
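
The source says only that W5 is a PCA of the three basic weights. One plausible reading, sketched below in plain Python, scores each interaction along the first principal axis of its (W1, W2, W3) values. The function name, the power-iteration solver and the rescaling to [0, 1] are all assumptions made for illustration, and the example weights are invented.

```python
def combine_weights_pca(rows, iterations=500):
    """Score each interaction's (W1, W2, W3) row along the first principal
    axis of the weight table, then rescale the scores to [0, 1]."""
    n, d = len(rows), len(rows[0])
    means = [sum(row[j] for row in rows) / n for j in range(d)]
    centred = [[row[j] - means[j] for j in range(d)] for row in rows]
    # Sample covariance matrix of the weight columns.
    cov = [[sum(c[a] * c[b] for c in centred) / max(n - 1, 1)
            for b in range(d)] for a in range(d)]
    # Power iteration from the all-ones vector converges to the leading
    # eigenvector, oriented so that it agrees in sign with the start vector
    # (it fails only if the start is orthogonal to that eigenvector).
    axis = [1.0] * d
    for _ in range(iterations):
        nxt = [sum(cov[a][b] * axis[b] for b in range(d)) for a in range(d)]
        norm = sum(v * v for v in nxt) ** 0.5
        if norm == 0.0:
            break
        axis = [v / norm for v in nxt]
    scores = [sum(c[j] * axis[j] for j in range(d)) for c in centred]
    low, high = min(scores), max(scores)
    if high == low:
        return [0.5] * n  # no variation between interactions
    return [(s - low) / (high - low) for s in scores]


# Invented example: one row per interaction, columns are W1, W2, W3.
example_weights = [
    [0.1, 0.2, 0.1],
    [0.5, 0.6, 0.4],
    [0.9, 0.8, 0.9],
]
w5 = combine_weights_pca(example_weights)
```

Here the interaction with the lowest basic weights receives a combined weight of 0 and the one with the highest receives 1. A linear-algebra library would normally replace the hand-written power iteration; it is spelled out here only to keep the sketch dependency-free.
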

