Graph reconstruction using covariance-based methods


ABSTRACT

Methods based on correlation and partial correlation are widely used today to reconstruct a statistical interaction graph from high-throughput omics data. These dedicated methods work well even when the number of variables exceeds the number of samples. In this study, we investigate how the graphs extracted from covariance and concentration matrix estimates are related, using Neumann series and transitive closure and working through small concrete examples. Considering the ideal case where the true graph is available, we also compare correlation and partial correlation methods on large realistic graphs. In particular, we perform the comparisons both with parameters selected optimally on the basis of the true underlying graph and with data-driven approaches in which the parameters are estimated directly from the data.

Electronic supplementary material: The online version of this article (doi:10.1186/s13637-016-0052-y) contains supplementary material, which is available to authorized users.
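
The link between covariance and concentration matrices mentioned in the abstract can be illustrated with a small numerical sketch. It assumes a concentration matrix of the form Theta = I - A, where A carries the edge weights of a chain graph and has spectral radius below 1, so that the covariance matrix equals the Neumann series I + A + A^2 + ... . The function names, the weight rho = 0.4, and the five-node chain are illustrative choices, not taken from the article.

```python
import numpy as np

def chain_precision(p, rho=0.4):
    """Return (Theta, A) with Theta = I - A for a chain graph on p nodes.

    rho is a hypothetical edge weight, chosen small enough that the
    spectral radius of A stays below 1 and Theta is positive definite.
    """
    A = np.zeros((p, p))
    idx = np.arange(p - 1)
    A[idx, idx + 1] = rho
    A[idx + 1, idx] = rho
    return np.eye(p) - A, A

def neumann_partial_sum(A, n_terms):
    """Partial Neumann sum I + A + ... + A^(n_terms - 1)."""
    p = A.shape[0]
    total = np.zeros((p, p))
    term = np.eye(p)
    for _ in range(n_terms):
        total += term
        term = term @ A
    return total

theta, A = chain_precision(5)
sigma_exact = np.linalg.inv(theta)          # covariance = inverse concentration
sigma_series = neumann_partial_sum(A, 200)  # truncated Neumann series

# Nonzero off-diagonal entries of Theta are the edges of the chain;
# the covariance matrix is dense because every pair of nodes is
# connected by some path (the transitive closure of the chain).
print(np.count_nonzero(theta), np.count_nonzero(np.abs(sigma_exact) > 1e-12))
```

Each additional power of A adds the node pairs reachable by one more step along the graph, which is why the support of the covariance matrix grows toward the transitive closure while the concentration matrix stays sparse.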



Fig. 6: Predictions by the nodewise regression Lasso (MB Lasso), the graphical Lasso (Glasso), the covariance Lasso, the thresholded sample covariance matrix (Thresholded SCov), and random guessing, using synthetic data generated from four graph types (chain, cluster, scale-free, and hub graphs). Predicted edges (resampled 100 times) and true edges (dark green circle) are shown on axes of correctly predicted vs total predicted edges (left). The Euclidean distances from the true edges to the predicted edges are summarized as a cumulative distribution, which gives the probability of the Euclidean distances (middle). The performance of the methods is also assessed using traditional ROC curves (resampled 20 times).

Mentions: First, we performed the comparison in an ideal case where the underlying graph is known and one can optimize predictions based on the given graph (Fig. 6). This way, one can judge the performance of the methods under optimal conditions. Since the adaptive Lasso is an adaptive version of the nodewise regression method, it is not considered for comparison in this setting.
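
As a rough illustration of tuning a method against a known graph, the sketch below thresholds a sample correlation matrix and picks the threshold whose prediction lies closest to the true-edge point. It is not the authors' procedure: the chain-graph simulation, the function names, the threshold grid, and the use of the Euclidean distance in the (total predicted, correctly predicted) plane as the selection criterion are all assumptions, the last being one plausible reading of the axes described in the Fig. 6 caption.

```python
import numpy as np

def simulate_chain_data(p=10, n=200, rho=0.4, seed=0):
    """Draw Gaussian samples whose concentration matrix is a chain graph.

    All parameter values are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    theta = np.eye(p)
    idx = np.arange(p - 1)
    theta[idx, idx + 1] = -rho
    theta[idx + 1, idx] = -rho
    sigma = np.linalg.inv(theta)
    X = rng.multivariate_normal(np.zeros(p), sigma, size=n)
    true_edges = {(i, i + 1) for i in range(p - 1)}
    return X, true_edges

def thresholded_edges(X, tau):
    """Edges (i, j), i < j, whose absolute sample correlation exceeds tau."""
    corr = np.corrcoef(X, rowvar=False)
    p = corr.shape[0]
    return {(i, j) for i in range(p) for j in range(i + 1, p)
            if abs(corr[i, j]) > tau}

def oracle_threshold(X, true_edges, grid):
    """Pick the threshold closest to the point (|E|, |E|) of true edges.

    Returns (tau, distance, total_predicted, correctly_predicted).
    """
    m = len(true_edges)
    best = None
    for tau in grid:
        pred = thresholded_edges(X, tau)
        total, correct = len(pred), len(pred & true_edges)
        dist = float(np.hypot(total - m, correct - m))
        if best is None or dist < best[1]:
            best = (float(tau), dist, total, correct)
    return best

X, true_edges = simulate_chain_data()
tau, dist, total, correct = oracle_threshold(X, true_edges, np.linspace(0.05, 0.95, 19))
print(f"tau={tau:.2f} total={total} correct={correct} distance={dist:.2f}")
```

Because the threshold is selected using the true edge set, the result is an upper bound on what thresholding can achieve in this toy setting; a data-driven choice of the threshold would not have access to `true_edges`.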

