Concurrent conditional clustering of multiple networks: COCONETS.

Kleessen S, Klie S, Nikoloski Z - PLoS ONE (2014)

Bottom Line: We present theoretical results for special classes of networks to demonstrate the implications of conditionality captured by the COCONETS formulation. We demonstrate that investigating the differences between the clustering based on all networks and that obtained from a subset of networks can be used to quantify the specificity of biological responses. Therefore, the comparison of multiple networks based on concurrent conditional clustering offers a novel avenue for detection and investigation of preserved network substructures.


Affiliation: Systems Biology and Mathematical Modeling Group, Max Planck Institute of Molecular Plant Physiology, Potsdam-Golm, Germany.

ABSTRACT
The accumulation of high-throughput data from different experiments has facilitated the extraction of condition-specific networks over the same set of biological entities. Comparing and contrasting such multiple biological networks is at the center of differential network biology, which aims at determining general and condition-specific responses captured in the network structure (i.e., included associations between the network components). We provide a novel way to compare multiple networks based on determining a network clustering (i.e., a partition into communities) which is optimal across the set of networks with respect to a given cluster quality measure. To this end, we formulate the optimization-based problem of concurrent conditional clustering of multiple networks, termed COCONETS, based on modularity. The solution to this problem is a clustering which depends on all considered networks and pinpoints their preserved substructures. We present theoretical results for special classes of networks to demonstrate the implications of conditionality captured by the COCONETS formulation. As the problem can be shown to be intractable, we extended an existing efficient greedy heuristic and applied it to determine concurrent conditional clusters on coexpression networks extracted from publicly available time-resolved transcriptomics data of Escherichia coli under five stresses, as well as on metabolite correlation networks from a metabolomics data set from Arabidopsis thaliana exposed to eight environmental conditions. We demonstrate that investigating the differences between the clustering based on all networks and that obtained from a subset of networks can be used to quantify the specificity of biological responses. While a comparison of the Escherichia coli coexpression networks based on seminal properties does not pinpoint biologically relevant differences, the common network substructures extracted by COCONETS are supported by existing experimental evidence. Therefore, the comparison of multiple networks based on concurrent conditional clustering offers a novel avenue for detection and investigation of preserved network substructures.
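To make the modularity-based scoring of one clustering against several networks concrete, here is a minimal Python sketch. The function `modularity` follows the standard Newman-Girvan definition for an undirected, unweighted graph. The function `summed_modularity` is an assumption made only for illustration: the abstract does not state how COCONETS combines the per-network values, so summing them is a placeholder, not the authors' formulation. All names and the toy graph are hypothetical.

```python
from collections import defaultdict


def modularity(edges, membership):
    """Newman-Girvan modularity of a partition.

    edges: list of (u, v) pairs of an undirected, unweighted graph
           (self-loops are not handled in this sketch).
    membership: dict mapping each node to its cluster label.
    """
    m = len(edges)
    if m == 0:
        return 0.0
    intra = defaultdict(int)   # number of edges inside each cluster
    degree = defaultdict(int)  # summed node degree of each cluster
    for u, v in edges:
        cu, cv = membership[u], membership[v]
        degree[cu] += 1
        degree[cv] += 1
        if cu == cv:
            intra[cu] += 1
    # Q = sum over clusters of (fraction of intra-cluster edges
    #     minus the squared fraction of edge ends in the cluster)
    return sum(intra[c] / m - (degree[c] / (2 * m)) ** 2 for c in degree)


def summed_modularity(networks, membership):
    """ASSUMPTION: score one shared clustering on several networks by
    summing the per-network modularities (illustrative aggregate only)."""
    return sum(modularity(edges, membership) for edges in networks)


if __name__ == "__main__":
    # Toy network: two triangles joined by a single edge.
    toy = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
    split = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
    print(modularity(toy, split))
    print(summed_modularity([toy, toy], split))
```

The key point the sketch captures is that a single `membership` mapping is evaluated on every network, so the resulting score depends on all networks at once rather than on any one of them.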



pone-0103637-g003: Clustering tree based on the adjusted Rand index values for the investigated CoCo clusterings. The clusterings from single networks are based on the greedy heuristic for approximating the MODULARITY problem. All other clusterings are based on the greedy heuristic for COCONETS. The tree is derived by agglomerative clustering with a distance matrix derived from the adjusted Rand index values for all pairwise comparisons of the obtained CoCo clusterings. The stress conditions are denoted as follows: cold (c), heat (h), lactose diauxie (ld), oxidative (o), and stationary phase (s); their pairwise combinations are marked with ‘/’, and the clustering over all five stresses, by ‘all’. The number of clusters in each CoCo clustering is included next to the abbreviations for the stresses.
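The tree construction described in the caption can be sketched as follows. Two choices here are assumptions, since the caption does not specify them: the distance is taken as 1 minus the adjusted Rand index, and clusters are merged by average linkage. The labels `A` to `D` and the similarity values are made-up toy inputs, not values from the paper.

```python
from itertools import combinations


def average_linkage_tree(names, similarity):
    """Build a tree by agglomerative clustering (average linkage).

    names: list of clustering labels.
    similarity: dict mapping frozenset({a, b}) to a similarity value
                (e.g. an adjusted Rand index).
    Returns (tree, merges): tree is a nested tuple of labels, merges
    lists each (subtree, merge_height) in the order of merging.
    """
    if not names:
        raise ValueError("need at least one label")

    def dist(a, b):
        # ASSUMPTION: distance = 1 - similarity
        return 1.0 - similarity[frozenset((a, b))]

    # Each active cluster is (subtree, tuple_of_leaf_labels).
    active = [(name, (name,)) for name in names]
    merges = []
    while len(active) > 1:
        best = None
        for i, j in combinations(range(len(active)), 2):
            li, lj = active[i][1], active[j][1]
            avg = sum(dist(a, b) for a in li for b in lj) / (len(li) * len(lj))
            if best is None or avg < best[0]:
                best = (avg, i, j)
        height, i, j = best
        merged = ((active[i][0], active[j][0]), active[i][1] + active[j][1])
        merges.append((merged[0], height))
        active = [c for k, c in enumerate(active) if k not in (i, j)]
        active.append(merged)
    return active[0][0], merges


if __name__ == "__main__":
    labels = ["A", "B", "C", "D"]  # hypothetical clusterings
    toy_sim = {frozenset(p): 0.1 for p in combinations(labels, 2)}
    toy_sim[frozenset(("A", "B"))] = 0.8  # made-up values
    toy_sim[frozenset(("C", "D"))] = 0.7
    tree, merges = average_linkage_tree(labels, toy_sim)
    print(tree)
    for subtree, height in merges:
        print(subtree, round(height, 3))
```

Other linkage rules (single, complete) or other transformations of the index into a distance would give a different tree shape, so the sketch should be read as one plausible instance of the procedure rather than a reproduction of the figure.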

Mentions: Analogously, another criterion to validate the usage of the heuristic is the following: CoCo clusterings based on networks from a pair of conditions are expected to be more similar to the clusterings induced by the individual conditions participating in the pair than to those induced by any other condition alone. As shown in Figure 3, this is indeed the case for all pairs of conditions. In addition, the CoCo clustering based on the pair of networks from oxidative stress and stationary growth was the most similar to the CoCo clustering with all five networks (0.50), followed by heat stress and lactose diauxie (0.44), cold and oxidative stress (0.41), as well as lactose diauxie and stationary condition (0.39). Therefore, these three pairs of conditions have the largest influence on the CoCo clustering with all networks. Interestingly, while the similarity of the individual clustering from the stationary phase to the CoCo clustering with all five networks is the smallest (0.06), conditioned on the data from oxidative stress, it obtains the highest contribution (0.50) (see Table S2 for values).
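The similarity values quoted above are adjusted Rand index values between pairs of clusterings. As a minimal sketch of how such a value is computed, the following Python function implements the standard adjusted Rand index from the pair-counting contingency table. The function name and the toy label vectors are hypothetical and are not taken from the paper's data.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand index between two clusterings of the same items.

    labels_a, labels_b: sequences of cluster labels, one per item,
    in the same item order.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("both clusterings must label the same items")
    n = len(labels_a)
    total_pairs = comb(n, 2)
    if total_pairs == 0:
        return 1.0  # fewer than two items: trivially identical
    # Pairs of items placed together in both, in A, and in B.
    together_both = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    together_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    together_b = sum(comb(c, 2) for c in Counter(labels_b).values())
    expected = together_a * together_b / total_pairs
    maximum = (together_a + together_b) / 2
    if maximum == expected:
        return 1.0  # degenerate case: both clusterings are trivial
    return (together_both - expected) / (maximum - expected)


if __name__ == "__main__":
    # Toy label vectors for four items (made up for illustration).
    print(adjusted_rand_index([0, 0, 1, 1], [1, 1, 0, 0]))
    print(adjusted_rand_index([0, 0, 1, 1], [0, 1, 0, 1]))
```

The index equals 1 when two clusterings agree up to a relabeling of clusters and is close to 0 when their agreement is what would be expected by chance, which is why a value such as 0.06 indicates very little shared structure.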

