Integration of molecular network data reconstructs Gene Ontology.
Bottom Line: Furthermore, we use our method to infer new relationships between GO terms solely from the topologies of these networks and validate 44% of our predictions in the literature. In addition, our integration method reproduces 48% of cellular component, 41% of molecular function and 41% of biological process GO terms, outperforming the previous method in the former two domains of GO. Finally, we predict new GO annotations of yeast genes and validate our predictions through GIs profiling.
Affiliation: Department of Computing, Imperial College London SW7 2AZ, UK.
Mentions: We find that with the removal of each of the four data sources (a network along with its corresponding GDV similarity matrix), the value of RSS increases while the value of Evar decreases, implying that each data source contributes to the quality of the model. The relative increase of RSS and the relative decrease of Evar (with respect to the initial model containing all the data), computed by removing a particular network along with its corresponding GDV similarity matrix, are shown in the top panel of Figure 2. We find that the largest model degradation occurs with the removal of the GI network and its corresponding GDV similarity matrix. A similar result was reported by Žitnik et al. (2013): they found GIs to be the most informative data source in the prediction of disease–disease associations. Exclusion of the gene Co-Ex network and its corresponding GDV similarity matrix results in the smallest changes in RSS and Evar, indicating that Co-Ex data contribute the least to the quality of the model.
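The leave-one-source-out comparison described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function names (`relative_change`, `rank_sources`), the dictionary layout, the labels `other_1` and `other_2` for the two data sources not named in this passage, and all numeric values are assumptions made up for the example. It only shows how a relative increase of RSS and a relative decrease of Evar, both measured against the model containing all the data, could be computed and used to rank the data sources.

```python
def relative_change(full_value, ablated_value):
    """Signed relative change of a metric with respect to the full model."""
    if full_value == 0:
        raise ValueError("full-model value must be non-zero")
    return (ablated_value - full_value) / abs(full_value)


def rank_sources(full, ablated):
    """Rank data sources by the relative RSS increase seen when each is left out.

    full    -- {"rss": float, "evar": float} for the model with all data
    ablated -- {source_name: {"rss": float, "evar": float}} for each model
               trained without that source
    """
    report = {}
    for source, metrics in ablated.items():
        report[source] = {
            # Positive when RSS grows after removing the source.
            "rss_increase": relative_change(full["rss"], metrics["rss"]),
            # Positive when Evar shrinks after removing the source.
            "evar_decrease": -relative_change(full["evar"], metrics["evar"]),
        }
    # Largest RSS increase first, i.e. the most informative source first.
    return sorted(report.items(), key=lambda kv: kv[1]["rss_increase"], reverse=True)


# Placeholder values for illustration only; these are NOT results from the paper.
full = {"rss": 100.0, "evar": 0.80}
ablated = {
    "GI": {"rss": 130.0, "evar": 0.60},
    "other_1": {"rss": 115.0, "evar": 0.70},  # hypothetical label
    "other_2": {"rss": 110.0, "evar": 0.72},  # hypothetical label
    "Co-Ex": {"rss": 102.0, "evar": 0.78},
}

ranking = rank_sources(full, ablated)
for source, changes in ranking:
    print(f"{source}: RSS +{changes['rss_increase']:.1%}, "
          f"Evar -{changes['evar_decrease']:.1%}")
```

With placeholder inputs shaped like the pattern reported in the text, the source whose removal degrades the model most appears first in `ranking` and the least informative source appears last.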