Modular biological function is most effectively captured by combining molecular interaction data types.

Ames RM, Macpherson JI, Pinney JW, Lovell SC, Robertson DL - PLoS ONE (2013)

Bottom Line: Furthermore, the different annotation types of GO are not predominantly associated with one of the interaction data types. Collectively our results demonstrate that successful capture of functional relationships by network data depends on both the specific biological function being characterised and the type of network data being used. Combining interaction subnetworks across data types is therefore essential for fully understanding the complex and emergent nature of biological function.


Affiliation: Computational and Evolutionary Biology, Faculty of Life Sciences, The University of Manchester, Manchester, United Kingdom. ryan.ames@manchester.ac.uk

ABSTRACT
Large-scale molecular interaction data sets have the potential to provide a comprehensive, system-wide understanding of biological function. Although individual molecules can be promiscuous in terms of their contribution to function, molecular functions emerge from the specific interactions of molecules giving rise to modular organisation. As functions often derive from a range of mechanisms, we demonstrate that they are best studied using networks derived from different sources. Implementing a graph partitioning algorithm we identify subnetworks in yeast protein-protein interaction (PPI), genetic interaction and gene co-regulation networks. Among these subnetworks we identify cohesive subgraphs that we expect to represent functional modules in the different data types. We demonstrate significant overlap between the subgraphs generated from the different data types and show these overlaps can represent related functions as represented by the Gene Ontology (GO). Next, we investigate the correspondence between our subgraphs and the Gene Ontology. This revealed varying degrees of coverage of the biological process, molecular function and cellular component ontologies, dependent on the data type. For example, subgraphs from the PPI show enrichment for 84%, 58% and 93% of annotated GO terms, respectively. Integrating the interaction data into a combined network increases the coverage of GO. Furthermore, the different annotation types of GO are not predominantly associated with one of the interaction data types. Collectively our results demonstrate that successful capture of functional relationships by network data depends on both the specific biological function being characterised and the type of network data being used. We identify functions that require integrated information to be accurately represented, demonstrating the limitations of individual data types. Combining interaction subnetworks across data types is therefore essential for fully understanding the complex and emergent nature of biological function.


Figure 2 (pone-0062670-g002): Network of best hits between subgraphs of PPI, genetic and co-regulation networks. Nodes represent individual subgraphs with blue, red or yellow nodes corresponding to subgraphs from the PPI, genetic or co-regulation networks, respectively. Edges represent links between subgraphs with a statistically significant intersection of  genes with an MCC . Only the best intersection between each network comparison, defined by MCC score, is shown. Letters A to D indicate high-degree neighbourhoods that consist of a node with degree  and all neighbours of that node.

Mentions: To obtain a high-level insight into the congruence relationships between subgraphs from different networks, we visualised best hits (and best reciprocal hits) using a network, where nodes represent subgraphs and edges represent the hits (Figure 2). From a total of 4669 subgraphs that are involved in a best hit with one or more subgraphs, 3689 subgraphs are involved in a best hit with just one other subgraph, while a minority of subgraphs have many more best hits; the node degree fits a power-law distribution. A repeated topological pattern of this best-hits network (Figure 2) is for the subgraph of one network to be connected to a large number of subgraphs from one other network. Interestingly, there are 115 subgraphs that have a degree (top ). These subgraphs, which we refer to as high-degree subgraphs, are sets of genes that are repeatedly identified by partitioning networks into different-sized partitions. Therefore, high-degree subgraphs and their hits appear to be robust sets of highly connected genes that transcend multiple networks. We hypothesised that high-degree subgraphs might have particular functional significance. Indeed, high-degree subgraphs and the subgraphs that are their best hits (together termed high-degree neighbourhoods) are: (i) significantly more likely to be enriched for one or more GO terms and (ii) capture GO functions with significantly better accuracy than subgraphs that are not congruent, in all networks (, two-tailed Mann-Whitney U test, in all cases), collectively indicating that the congruent subgraphs are more likely to be real functional modules. Furthermore, this result highlights the value of integrating information between networks in order to validate network subgraphs.
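
The best-hits construction described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`mcc`, `best_hits`, `node_degrees`), the toy gene sets, the universe size and the `score > 0` cut-off are all assumptions made for illustration, and the statistical significance test on the intersection is omitted. The sketch scores the overlap of two gene sets with the Matthews correlation coefficient (MCC), keeps the highest-scoring partner for each subgraph, and counts node degree in the resulting best-hits network.

```python
from collections import Counter
from math import sqrt


def mcc(a, b, universe_size):
    """Matthews correlation coefficient between gene sets a and b.

    Membership in a is treated as the prediction and membership in b as
    the truth, over a universe of universe_size genes.
    """
    tp = len(a & b)
    fp = len(a - b)
    fn = len(b - a)
    tn = universe_size - tp - fp - fn
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


def best_hits(source, target, universe_size):
    """Map each source subgraph to its highest-MCC target subgraph.

    The score > 0 filter is a placeholder cut-off (an assumption), not
    the threshold used in the paper.
    """
    hits = {}
    for s_name, s_genes in source.items():
        score, t_name = max(
            (mcc(s_genes, t_genes, universe_size), t_name)
            for t_name, t_genes in target.items()
        )
        if score > 0:
            hits[s_name] = (t_name, score)
    return hits


def node_degrees(edges):
    """Count how many best-hit edges touch each subgraph."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return degree


# Hypothetical toy subgraphs from two networks over a 10-gene universe.
ppi = {"ppi_1": {"g1", "g2", "g3", "g4"}, "ppi_2": {"g5", "g6"}}
genetic = {
    "gen_1": {"g1", "g2", "g3"},
    "gen_2": {"g2", "g3", "g4"},
    "gen_3": {"g5", "g6"},
}

# Collect best hits in both directions as undirected (ppi, genetic) edges.
edges = set()
for gen_name, (ppi_name, _) in best_hits(genetic, ppi, 10).items():
    edges.add((ppi_name, gen_name))
for ppi_name, (gen_name, _) in best_hits(ppi, genetic, 10).items():
    edges.add((ppi_name, gen_name))

degree = node_degrees(edges)
```

In this toy example `ppi_1` is the best hit of two genetic subgraphs, so it plays the role of a high-degree subgraph connected to several subgraphs from one other network, which is the repeated pattern the paragraph describes.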

