Limits...
Network-based segmentation of biological multivariate time series.

Omranian N, Klie S, Mueller-Roeber B, Nikoloski Z - PLoS ONE (2013)

Bottom Line: As a result, MTS data capture the dynamics of biochemical processes and components whose couplings may involve different scales and exhibit temporal changes.We demonstrate that the problem of partitioning MTS data into [Formula: see text] segments to maximize a distance function, operating on polynomially computable network properties, often used in analysis of biological network, can be efficiently solved.To enable biological interpretation, we also propose a breakpoint-penalty (BP-penalty) formulation for determining MTS segmentation which combines a distance function with the number/length of segments.

View Article: PubMed Central - PubMed

Affiliation: Institute of Biochemistry and Biology, University of Potsdam, Potsdam-Golm, Germany.

ABSTRACT
Molecular phenotyping technologies (e.g., transcriptomics, proteomics, and metabolomics) offer the possibility to simultaneously obtain multivariate time series (MTS) data from different levels of information processing and metabolic conversions in biological systems. As a result, MTS data capture the dynamics of biochemical processes and components whose couplings may involve different scales and exhibit temporal changes. Therefore, it is important to develop methods for determining the time segments in MTS data, which may correspond to critical biochemical events reflected in the coupling of the system's components. Here we provide a novel network-based formalization of the MTS segmentation problem based on temporal dependencies and the covariance structure of the data. We demonstrate that the problem of partitioning MTS data into [Formula: see text] segments to maximize a distance function, operating on polynomially computable network properties, often used in analysis of biological network, can be efficiently solved. To enable biological interpretation, we also propose a breakpoint-penalty (BP-penalty) formulation for determining MTS segmentation which combines a distance function with the number/length of segments. Our empirical analyses of synthetic benchmark data as well as time-resolved transcriptomics data from the metabolic and cell cycles of Saccharomyces cerevisiae demonstrate that the proposed method accurately infers the phases in the temporal compartmentalization of biological processes. In addition, through comparison on the same data sets, we show that the results from the proposed formalization of the MTS segmentation problem match biological knowledge and provide more rigorous statistical support in comparison to the contending state-of-the-art methods.

Show MeSH
Algorithm 1 - Optimal number of segments.It presents the algorithm for computing the optimal segmentation based on the longest path in a directed acyclic graph.
© Copyright Policy
Related In: Results  -  Collection


getmorefigures.php?uid=PMC3646968&req=5

pone-0062974-g003: Algorithm 1 - Optimal number of segments.It presents the algorithm for computing the optimal segmentation based on the longest path in a directed acyclic graph.

Mentions: Proof. Let denote the sequence of nodes of in topological order. A topological ordering of a directed acyclic graph is a linear ordering of its nodes in which each node appears before all other nodes to which it has outgoing edges. It can be obtained in polynomial time in the order of [34]. Let denote the array with elements initialized to zero, and be a predecessor array. Furthermore, let the weight of a path in be the sum of edge-weights on the path and the length of the path be the number of edges. The largest weight of a path from the source to the target equals the sum of the largest weight of the path from to a predecessor of and the weight of the edge from the predecessor to . Determining the path of maximum weight in can be solved by the dynamic programming approach given in Algorithm 1 (Fig. 3). The number of segmentation positions can then be determined as the minimum number of edges on the path of maximum weight in . The claim follows from the fact that the algorithm considers all nodes and edges.


Network-based segmentation of biological multivariate time series.

Omranian N, Klie S, Mueller-Roeber B, Nikoloski Z - PLoS ONE (2013)

Algorithm 1 - Optimal number of segments.It presents the algorithm for computing the optimal segmentation based on the longest path in a directed acyclic graph.
© Copyright Policy
Related In: Results  -  Collection

Show All Figures
getmorefigures.php?uid=PMC3646968&req=5

pone-0062974-g003: Algorithm 1 - Optimal number of segments.It presents the algorithm for computing the optimal segmentation based on the longest path in a directed acyclic graph.
Mentions: Proof. Let denote the sequence of nodes of in topological order. A topological ordering of a directed acyclic graph is a linear ordering of its nodes in which each node appears before all other nodes to which it has outgoing edges. It can be obtained in polynomial time in the order of [34]. Let denote the array with elements initialized to zero, and be a predecessor array. Furthermore, let the weight of a path in be the sum of edge-weights on the path and the length of the path be the number of edges. The largest weight of a path from the source to the target equals the sum of the largest weight of the path from to a predecessor of and the weight of the edge from the predecessor to . Determining the path of maximum weight in can be solved by the dynamic programming approach given in Algorithm 1 (Fig. 3). The number of segmentation positions can then be determined as the minimum number of edges on the path of maximum weight in . The claim follows from the fact that the algorithm considers all nodes and edges.

Bottom Line: As a result, MTS data capture the dynamics of biochemical processes and components whose couplings may involve different scales and exhibit temporal changes.We demonstrate that the problem of partitioning MTS data into [Formula: see text] segments to maximize a distance function, operating on polynomially computable network properties, often used in analysis of biological network, can be efficiently solved.To enable biological interpretation, we also propose a breakpoint-penalty (BP-penalty) formulation for determining MTS segmentation which combines a distance function with the number/length of segments.

View Article: PubMed Central - PubMed

Affiliation: Institute of Biochemistry and Biology, University of Potsdam, Potsdam-Golm, Germany.

ABSTRACT
Molecular phenotyping technologies (e.g., transcriptomics, proteomics, and metabolomics) offer the possibility to simultaneously obtain multivariate time series (MTS) data from different levels of information processing and metabolic conversions in biological systems. As a result, MTS data capture the dynamics of biochemical processes and components whose couplings may involve different scales and exhibit temporal changes. Therefore, it is important to develop methods for determining the time segments in MTS data, which may correspond to critical biochemical events reflected in the coupling of the system's components. Here we provide a novel network-based formalization of the MTS segmentation problem based on temporal dependencies and the covariance structure of the data. We demonstrate that the problem of partitioning MTS data into [Formula: see text] segments to maximize a distance function, operating on polynomially computable network properties, often used in analysis of biological network, can be efficiently solved. To enable biological interpretation, we also propose a breakpoint-penalty (BP-penalty) formulation for determining MTS segmentation which combines a distance function with the number/length of segments. Our empirical analyses of synthetic benchmark data as well as time-resolved transcriptomics data from the metabolic and cell cycles of Saccharomyces cerevisiae demonstrate that the proposed method accurately infers the phases in the temporal compartmentalization of biological processes. In addition, through comparison on the same data sets, we show that the results from the proposed formalization of the MTS segmentation problem match biological knowledge and provide more rigorous statistical support in comparison to the contending state-of-the-art methods.

Show MeSH