Concurrent conditional clustering of multiple networks: COCONETS.

Kleessen S, Klie S, Nikoloski Z - PLoS ONE (2014)

Bottom Line: We present theoretical results for special classes of networks to demonstrate the implications of conditionality captured by the COCONETS formulation. We demonstrate that investigating the differences between the clustering based on all networks and that obtained from a subset of networks can be used to quantify the specificity of biological responses. Therefore, the comparison of multiple networks based on concurrent conditional clustering offers a novel avenue for detection and investigation of preserved network substructures.


Affiliation: Systems Biology and Mathematical Modeling Group, Max Planck Institute of Molecular Plant Physiology, Potsdam-Golm, Germany.

ABSTRACT
The accumulation of high-throughput data from different experiments has facilitated the extraction of condition-specific networks over the same set of biological entities. Comparing and contrasting such multiple biological networks is at the center of differential network biology, which aims to determine general and condition-specific responses captured in the network structure (i.e., the included associations between the network components). We provide a novel way to compare multiple networks based on determining a network clustering (i.e., a partition into communities) that is optimal across the set of networks with respect to a given cluster quality measure. To this end, we formulate the optimization-based problem of concurrent conditional clustering of multiple networks, termed COCONETS, based on the modularity. The solution to this problem is a clustering which depends on all considered networks and pinpoints their preserved substructures. We present theoretical results for special classes of networks to demonstrate the implications of conditionality captured by the COCONETS formulation. As the problem can be shown to be intractable, we extend an existing efficient greedy heuristic and apply it to determine concurrent conditional clusters on coexpression networks extracted from publicly available time-resolved transcriptomics data of Escherichia coli under five stresses, as well as on metabolite correlation networks from a metabolomics data set from Arabidopsis thaliana exposed to eight environmental conditions. We demonstrate that investigating the differences between the clustering based on all networks and that obtained from a subset of networks can be used to quantify the specificity of biological responses. While a comparison of the Escherichia coli coexpression networks based on seminal properties does not pinpoint biologically relevant differences, the common network substructures extracted by COCONETS are supported by existing experimental evidence. Therefore, the comparison of multiple networks based on concurrent conditional clustering offers a novel avenue for detection and investigation of preserved network substructures.


Figure 4 (pone-0103637-g004): Clustering tree based on the adjusted Rand index values for the investigated CoCo clusterings. The clusterings from single networks are based on the greedy heuristic for approximating the MODULARITY problem. All other clusterings are based on the greedy heuristic for COCONETS. The tree is derived by agglomerative clustering with a distance matrix derived from the adjusted Rand index values for all pairwise comparisons of the obtained CoCo clusterings. The stress conditions are denoted as follows: 4° C and darkness (4 D), 21° C and darkness (21 D), 32° C and darkness (32 D), 4° C and light (4 L), 21° C and low-light (21 LL), 21° C and high light (21 HL), 32° C and light (32 L), and 21° C and light (21 L, control); their pairwise combinations are marked with ‘/’, and the clustering over all five stresses, by ‘all’. The number of clusters in each CoCo clustering is included next to the abbreviations for the stresses.
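The caption's comparison step can be illustrated with a minimal sketch of the adjusted Rand index between two clusterings of the same node set. The function names and the toy labels are hypothetical, and turning the index into a distance as 1 - ARI is an assumption; the caption only says the distance matrix is derived from the adjusted Rand index values.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand index of two clusterings given as label sequences.

    Position i in both sequences refers to the same node.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("both clusterings must cover the same nodes")
    n = len(labels_a)
    if n < 2:
        return 1.0  # no node pairs to disagree on

    # Contingency counts: nodes per (cluster in A, cluster in B) cell.
    cells = Counter(zip(labels_a, labels_b))
    sum_cells = sum(comb(c, 2) for c in cells.values())
    sum_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    sum_b = sum(comb(c, 2) for c in Counter(labels_b).values())

    expected = sum_a * sum_b / comb(n, 2)
    maximum = (sum_a + sum_b) / 2
    if maximum == expected:
        return 1.0  # degenerate case, e.g. both clusterings are trivial
    return (sum_cells - expected) / (maximum - expected)


def ari_distance_matrix(clusterings):
    """Pairwise distances between named clusterings (assumed: 1 - ARI)."""
    names = list(clusterings)
    return {
        a: {b: 1.0 - adjusted_rand_index(clusterings[a], clusterings[b])
            for b in names}
        for a in names
    }


# Hypothetical toy clusterings of four nodes, not data from the paper.
toy = {"x": [0, 0, 1, 1], "y": [1, 1, 0, 0], "z": [0, 1, 0, 1]}
distances = ari_distance_matrix(toy)
```

A matrix like `distances` could then be passed to any agglomerative clustering routine to obtain a tree; the linkage criterion is not stated in the caption.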


Mentions: By determining the Jaccard similarity between the edge sets of the condition-specific networks (Table 2), we observe that the network at 4-D differs the most from the other networks (average Jaccard similarity 0.18), followed by that at 4-L (Jaccard similarity of 0.19); moreover, the networks from 21-D and 32-D are closest to each other (Jaccard similarity 0.40). To analyze potential similarities in network communities, we applied the greedy heuristic for COCONETS to each condition-specific network separately as well as to all eight networks at once. Furthermore, we investigated combinations of networks from conditions with a temperature of 4° C (4-L and 4-D) and 32° C (32-L and 32-D), as well as from the darkness treatments (4-D, 21-D, and 32-D). In total, 12 clusterings were obtained and compared by their pairwise adjusted Rand index (Table S3, Figure 4). The most similar clusterings for individual environmental conditions are observed for 21-D and 32-D (0.58), indicating a similar response to the different treatments, as already shown in the analysis of Caldana et al.[34]. They also noted that the third darkness condition, with the temperature kept at 4° C (4-D), showed only a small overlap with the 21-D and 32-D responses. This is further supported by the similarities among the condition-specific clusterings as well as between the individual darkness conditions and the combined darkness clustering (21-D/4-D/32-D) (Table S3, Figure 4). Across all 12 clusterings, the highest similarity (0.73) is observed between 32-D and the combination of 32° C conditions (32-L/32-D), indicating that the clustering of 32-D largely represents the general effect of high temperature. The CoCo clustering from the eight condition-specific networks analyzed at once has the highest similarity to 32-L (0.31), but generally a low similarity to the clusterings of individual networks. The low similarity may largely be due to the wide range of conditions investigated simultaneously.
Therefore, the CoCo clustering of metabolite-correlation networks provides another example of how the proposed approach can yield further insights, not only from condition-specific clusterings but also from clustering different conditions at once, which highlights the similarities in response.
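The edge-set comparison described at the start of this passage can be sketched as follows. This is a minimal illustration that assumes undirected networks stored as lists of node pairs; the function names and the toy edges are hypothetical and do not come from the study's data.

```python
def edge_set(edges):
    """Normalize undirected edges so that (u, v) and (v, u) coincide."""
    return {frozenset(edge) for edge in edges}


def jaccard_similarity(edges_a, edges_b):
    """Jaccard similarity |A ∩ B| / |A ∪ B| of two undirected edge sets."""
    a, b = edge_set(edges_a), edge_set(edges_b)
    union = a | b
    if not union:
        return 1.0  # two empty networks are treated as identical
    return len(a & b) / len(union)


# Hypothetical toy networks over the same metabolites.
network_1 = [("m1", "m2"), ("m2", "m3")]
network_2 = [("m2", "m1"), ("m3", "m4")]
similarity = jaccard_similarity(network_1, network_2)  # one shared edge of three
```

Averaging this value over all other networks gives a per-condition summary of the kind quoted above for 4-D.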

