Network-based segmentation of biological multivariate time series.

Omranian N, Klie S, Mueller-Roeber B, Nikoloski Z - PLoS ONE (2013)

Bottom Line: As a result, MTS data capture the dynamics of biochemical processes and components whose couplings may involve different scales and exhibit temporal changes. We demonstrate that the problem of partitioning MTS data into [Formula: see text] segments to maximize a distance function, operating on polynomially computable network properties often used in the analysis of biological networks, can be efficiently solved. To enable biological interpretation, we also propose a breakpoint-penalty (BP-penalty) formulation for determining MTS segmentation, which combines a distance function with the number/length of segments.


Affiliation: Institute of Biochemistry and Biology, University of Potsdam, Potsdam-Golm, Germany.

ABSTRACT
Molecular phenotyping technologies (e.g., transcriptomics, proteomics, and metabolomics) offer the possibility to simultaneously obtain multivariate time series (MTS) data from different levels of information processing and metabolic conversions in biological systems. As a result, MTS data capture the dynamics of biochemical processes and components whose couplings may involve different scales and exhibit temporal changes. Therefore, it is important to develop methods for determining the time segments in MTS data, which may correspond to critical biochemical events reflected in the coupling of the system's components. Here we provide a novel network-based formalization of the MTS segmentation problem based on temporal dependencies and the covariance structure of the data. We demonstrate that the problem of partitioning MTS data into [Formula: see text] segments to maximize a distance function, operating on polynomially computable network properties often used in the analysis of biological networks, can be efficiently solved. To enable biological interpretation, we also propose a breakpoint-penalty (BP-penalty) formulation for determining MTS segmentation, which combines a distance function with the number/length of segments. Our empirical analyses of synthetic benchmark data, as well as time-resolved transcriptomics data from the metabolic and cell cycles of Saccharomyces cerevisiae, demonstrate that the proposed method accurately infers the phases in the temporal compartmentalization of biological processes. In addition, through comparison on the same data sets, we show that the results from the proposed formalization of the MTS segmentation problem match biological knowledge and provide more rigorous statistical support than the contending state-of-the-art methods.
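To give a feel for why a fixed number of segments can be found efficiently, the following is a minimal sketch of a dynamic program over pairs of consecutive segments. It is not the authors' implementation. Every name in it (segment_network, network_distance, best_segmentation) is hypothetical, the networks are assumed to be thresholded Pearson correlation networks, and the distance is assumed to be the edge count of the symmetric difference divided by the edge count of the union, standing in for the paper's Eqs. (1) to (3), which are not reproduced on this page.

```python
from itertools import combinations

import numpy as np


def segment_network(data, start, end, threshold=0.8):
    """Edge set of an (assumed) correlation network for columns start..end-1.

    data has one row per time series and one column per time point.
    """
    corr = np.corrcoef(data[:, start:end])
    n = data.shape[0]
    return {(i, j) for i, j in combinations(range(n), 2)
            if abs(corr[i, j]) >= threshold}


def network_distance(edges_a, edges_b):
    """Assumed distance: |symmetric difference| / |union| of the edge sets."""
    union = edges_a | edges_b
    if not union:
        return 0.0
    return len(edges_a ^ edges_b) / len(union)


def best_segmentation(data, k, min_len=3, threshold=0.8):
    """Split the time axis into k contiguous segments (half-open [s, e))
    maximizing the summed distance between consecutive segment networks."""
    num_points = data.shape[1]
    cache = {}

    def net(s, e):
        if (s, e) not in cache:
            cache[(s, e)] = segment_network(data, s, e, threshold)
        return cache[(s, e)]

    # best[m][(s, e)] = (score, start of previous segment) for a partition
    # of [0, e) into m segments whose last segment is [s, e).
    best = [dict() for _ in range(k + 1)]
    for e in range(min_len, num_points + 1):
        best[1][(0, e)] = (0.0, None)
    for m in range(2, k + 1):
        for (p, s), (score, _) in best[m - 1].items():
            for e in range(s + min_len, num_points + 1):
                cand = score + network_distance(net(p, s), net(s, e))
                if cand > best[m].get((s, e), (-1.0, None))[0]:
                    best[m][(s, e)] = (cand, p)

    finals = {seg: val for seg, val in best[k].items() if seg[1] == num_points}
    if not finals:
        raise ValueError("no segmentation satisfies k and min_len")
    seg = max(finals, key=lambda key: finals[key][0])
    total = finals[seg][0]

    # Walk back through the stored predecessors to recover the segments.
    segments, m = [], k
    while seg is not None:
        segments.append(seg)
        prev_start = best[m][seg][1]
        seg = (prev_start, seg[0]) if prev_start is not None else None
        m -= 1
    return total, segments[::-1]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    demo = rng.normal(size=(5, 12))  # 5 random series, 12 time points
    print(best_segmentation(demo, k=3))
```

Under these assumptions the table has on the order of k times the square of the number of time points entries, and each entry is filled by scanning the possible preceding segments, so the running time is polynomial as long as the network property itself is polynomially computable. The minimum segment length of 3 is an arbitrary choice made here so that the correlations are defined.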

pone-0062974-g001: Illustration of the MULTSEG problem. (Upper panel) 14 time series over 25 time points. (Middle panel) Networks reconstructed from the shown series. The networks correspond to the 4 optimal time series segments, depicted with light grey rectangles in the upper panel. The color coding of the nodes corresponds to the colors of the time series. (Lower panel, the last two rows) Symmetric difference and union networks from the consecutive segments, resulting in the optimal value of 2.40 for the objective, with relative density as a distance measure.

Mentions: As an illustration, we consider the 14 time series over 25 time points shown in the upper panel of Fig. 1. There are 2600 pairs of segments to consider, for which the distance can be determined in terms of network properties according to Eqs. (1) and (2). If the distance is obtained with the relative density as a global network property, according to Eq. (3), the solution to MULTSEG consists of the 4 segments shown in Fig. 1, resulting in a maximum objective value of 2.40. In this paradigmatic example, the networks are shown below each time series segment, colored grey in Fig. 1. The symmetric difference and union of the networks for all pairs of consecutive segments used in obtaining this value are visualized in the last two rows of Fig. 1.
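The figure of 2600 pairs can be reproduced under one plausible reading, which is an assumption rather than something stated on this page: a pair consists of two contiguous segments, each at least one time point long, where the second starts immediately after the first ends. The hypothetical helper below enumerates such pairs for 25 time points and checks the result against the closed form, the number of ways to choose three distinct boundaries out of 26.

```python
import math


def count_consecutive_segment_pairs(num_points):
    """Count pairs of adjacent segments [start, split) and [split, end)
    with 0 <= start < split < end <= num_points."""
    count = 0
    for start in range(num_points):
        for split in range(start + 1, num_points):
            for end in range(split + 1, num_points + 1):
                count += 1
    return count


if __name__ == "__main__":
    print(count_consecutive_segment_pairs(25))  # 2600
    print(math.comb(26, 3))                     # 2600
```

Each pair is fixed by its three boundaries, so the count grows cubically with the number of time points, which keeps an exhaustive evaluation of the pairwise distances feasible for series of this length.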

