DTW-MIC Coexpression Networks from Time-Course Data.
Affiliation: Fondazione Bruno Kessler, Trento, Italy.
ABSTRACT
When modeling coexpression networks from high-throughput time course data, the Pearson Correlation Coefficient (PCC) is one of the most effective and popular similarity functions. However, its reliability is limited since it cannot capture non-linear interactions and time shifts. Here we propose to overcome these two issues by employing a novel similarity function, Dynamic Time Warping Maximal Information Coefficient (DTW-MIC), which combines a measure that accounts for functional interactions of signals (MIC) with a measure that identifies time lag (DTW). Using the Hamming-Ipsen-Mikhailov (HIM) metric to quantify network differences, we demonstrate the effectiveness of the DTW-MIC approach on four synthetic datasets and one transcriptomic dataset, also in comparison to TimeDelay ARACNE and Transfer Entropy.
Example. To illustrate the difference between PCC and MIC in detecting non-linear relationships between two variables, we introduce a simple synthetic example. Consider the following five time series with 100 time points $\{t_i = i : 1 \le i \le 100\}$:

$$
\begin{aligned}
A(i) &= 0.01\,i \\
B(i) &= \log_{100} i \\
C(i) &= 0.01\,i + \varepsilon(0.002\,i), \qquad \varepsilon(z) \in U(-z, z) \\
D(i) &= 0.5\cos(\log i) + 0.65 \\
E(i) &= \begin{cases} 0 & \text{for } 50 \le i \le 70 \\ D(i) - 0.15 & \text{otherwise,} \end{cases}
\end{aligned}
$$

where $U(a, b)$ is the uniform distribution with extremes $a < b$. While A(i) is just 1/100th of the identity map, B(i) is a logarithmic map, C(i) is obtained from A(i) by adding a 20% level of uniform noise, D(i) is a more complex non-linear map merging a trigonometric and a logarithmic relation and, finally, E(i) is obtained from D(i) by applying a vertical offset and then flattening to zero all the values in the time interval [50, 70].

In Fig 1 the plot of the five time series A–E is displayed together with the PCC and MIC values for all pairs of sequences. MIC is able to capture the functional relationship linking all pairs of time series, even in the presence of a moderate level of noise: all MIC values are larger than 0.72, and in six cases out of ten MIC attains the upper bound 1. On the other hand, PCC is close to one only for the pairs (A, B), (A, C), (B, C) and (D, E), while the remaining six pairs all display a correlation score smaller than 0.33, showing that PCC can be ineffective as a similarity measure for complex longitudinal data. As a relevant example, note that B(i) has a strong functional dependence on D(i) and E(i) although the shapes of the corresponding curves are very different: this non-linear behaviour is well captured by MIC, with similarity value 1 for both (B, D) and (B, E), while the corresponding PCC values are negative.
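The five series defined above can be reproduced with a few lines of Python. What follows is a minimal sketch, not the authors' code. It assumes the natural logarithm inside D(i) (the base is not stated in the text) and a fixed random seed for the noise in C(i); the helper names `build_series` and `pearson` are hypothetical. Only PCC is computed: MIC needs a dedicated implementation (for instance the minepy package) and is left out here.

```python
import math
import random
from itertools import combinations


def build_series(seed=0):
    """Return the five synthetic series A-E at time points i = 1..100."""
    rng = random.Random(seed)  # fixed seed is an assumption, for repeatability
    t = range(1, 101)
    a = [0.01 * i for i in t]
    b = [math.log(i, 100) for i in t]  # logarithm in base 100
    # 20% uniform noise: eps(z) drawn from U(-z, z) with z = 0.002 * i
    c = [0.01 * i + rng.uniform(-0.002 * i, 0.002 * i) for i in t]
    # Assumption: natural logarithm inside the cosine (base not given)
    d = [0.5 * math.cos(math.log(i)) + 0.65 for i in t]
    # Vertical offset of D, flattened to zero on the interval [50, 70]
    e = [0.0 if 50 <= i <= 70 else d_i - 0.15 for i, d_i in zip(t, d)]
    return {"A": a, "B": b, "C": c, "D": d, "E": e}


def pearson(x, y):
    """Pearson Correlation Coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    norm_x = math.sqrt(sum((xi - mean_x) ** 2 for xi in x))
    norm_y = math.sqrt(sum((yi - mean_y) ** 2 for yi in y))
    return cov / (norm_x * norm_y)


if __name__ == "__main__":
    series = build_series()
    # Print PCC for all ten pairs of series
    for p, q in combinations(sorted(series), 2):
        print(f"PCC({p}, {q}) = {pearson(series[p], series[q]):+.3f}")
```

The exact PCC values depend on the seed and on the logarithm base assumed for D(i), so they need not match Fig 1 digit for digit.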