Clustering proteins from interaction networks for the prediction of cellular functions.

Brun C, Herrmann C, Guénoche A - BMC Bioinformatics (2004)

Bottom Line: Applied to the yeast interaction network, the classes obtained appear to be biologically significant. Finally, we propose a new annotation for 37 previously uncharacterized yeast proteins. We believe that our results represent a significant improvement for the inference of cellular functions, one that can be applied to other organisms as well as to other types of interaction graphs, such as genetic interactions.


Affiliation: Laboratoire de Génétique et Physiologie du Développement, IBDM, CNRS/INSERM/Université de la Méditerranée. herrmann@ibdm.univ-mrs.fr

ABSTRACT

Background: Developing reliable and efficient strategies for inferring the function of as yet uncharacterized proteins from interaction networks is of crucial interest in the current context of high-throughput data generation. In this paper, we develop a new algorithm that clusters the vertices of a protein-protein interaction network using a density function and yields disjoint classes.

Results: Applied to the yeast interaction network, the classes obtained appear to be biologically significant. The partitions are then used to make functional predictions for uncharacterized yeast proteins, using an annotation procedure that takes into account the binary interactions between proteins inside the classes. We show that this procedure improves performance relative to previous approaches. Finally, we propose a new annotation for 37 previously uncharacterized yeast proteins.

Conclusion: We believe that our results represent a significant improvement for the inference of cellular functions, one that can be applied to other organisms as well as to other types of interaction graphs, such as genetic interactions.



Figure 3: Comparison of our procedure (full squares) and the MRC strategy (full diamonds) in terms of the rate of true functions recovered (TFR, plot (a)) and the rate of correct predictions (RCP, plot (b)). The straight lines show the linear fits for our procedure (solid line) and the MRC procedure (dashed line). The horizontal axis indicates the number of proteins for which a prediction has been made. All rates on the vertical axis are computed with respect to the total number of annotated proteins.

Mentions: For our method and the MRC approach, these indicators depend on the threshold d. A reasonable interval for d is [30, 70] (in percent): below 30%, a function is not particularly representative, whereas above 70% the threshold is too stringent and yields too few predictions. For a given value of d, our procedure predicts a function for fewer proteins than the simple MRC, because of the additional step in the method. Hence, in order to compare the two approaches, we plot both criteria against the number of proteins for which a prediction is made (out of the total number of proteins, namely 876). The results are shown in Fig. 3.
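
The effect of the threshold d on the number of predictions can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes a plain class-level majority rule, in which a function is proposed for a protein when at least d percent of the other annotated members of its class carry that function, and it counts predictions in a leave-one-out fashion over annotated proteins. The function names, the toy classes, and the toy annotations are all hypothetical; the additional interaction-based step of the procedure described above is not modelled.

```python
from collections import Counter

def functions_above_threshold(members, annotations, d):
    """Functions carried by at least d percent of the annotated members (assumed rule)."""
    annotated = [p for p in members if annotations.get(p)]
    if not annotated:
        return set()
    # Each protein contributes at most once per function because annotations are sets.
    counts = Counter(f for p in annotated for f in annotations[p])
    return {f for f, n in counts.items() if 100.0 * n / len(annotated) >= d}

def count_predictions(clusters, annotations, d):
    """Leave-one-out count of annotated proteins that receive at least one prediction."""
    n_predicted = 0
    for cluster in clusters:
        for protein in cluster:
            if not annotations.get(protein):
                continue  # only annotated proteins are used for this check
            others = [p for p in cluster if p != protein]
            if functions_above_threshold(others, annotations, d):
                n_predicted += 1
    return n_predicted

# Hypothetical toy data: two disjoint classes and made-up annotations.
clusters = [["A", "B", "C", "D"], ["E", "F", "G"]]
annotations = {
    "A": {"transport"},
    "B": {"transport"},
    "C": {"transport", "splicing"},
    "D": {"splicing"},
    "E": {"repair"},
    "F": {"repair"},
    "G": {"cell cycle"},
}

for d in (30, 60, 70):
    print(d, count_predictions(clusters, annotations, d))
```

On this toy data the number of proteins with a prediction falls from 7 at d = 30 to 5 at d = 60 and 2 at d = 70, which is the qualitative behaviour the paragraph describes: a stricter threshold yields fewer predictions, so the two methods are best compared at equal numbers of predicted proteins rather than at equal d.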

