Multiscale Embedded Gene Co-expression Network Analysis.

Song WM, Zhang B - PLoS Comput. Biol. (2015)

Bottom Line: Gene co-expression network analysis has been shown to be effective in identifying functional co-expressed gene modules associated with complex human diseases. However, existing techniques to construct co-expression networks either require critical prior information, such as a predefined number of clusters or numerical thresholds for defining co-expression/interaction, or do not naturally reproduce the hallmarks of complex systems, such as the scale-free degree distribution or small-worldness. MEGENA showed improved performance over well-established clustering methods and co-expression network construction approaches.


Affiliation: Department of Genetics and Genomic Sciences, Icahn Institute of Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, New York, New York, United States of America.

ABSTRACT
Gene co-expression network analysis has been shown to be effective in identifying functional co-expressed gene modules associated with complex human diseases. However, existing techniques to construct co-expression networks either require critical prior information, such as a predefined number of clusters or numerical thresholds for defining co-expression/interaction, or do not naturally reproduce the hallmarks of complex systems, such as the scale-free degree distribution or small-worldness. Previously, a graph filtering technique called the Planar Maximally Filtered Graph (PMFG) has been applied to many real-world data sets, such as financial stock prices and gene expression, to extract meaningful and relevant interactions. However, PMFG is not suitable for large-scale genomic data due to several drawbacks: the high computational complexity of O(|V|^3), the presence of false positives due to the maximal planarity constraint, and the inadequacy of the clustering framework. Here, we developed a new co-expression network analysis framework called Multiscale Embedded Gene Co-expression Network Analysis (MEGENA) by: i) introducing quality control of co-expression similarities, ii) parallelizing embedded network construction, and iii) developing a novel clustering technique to identify multi-scale clustering structures in Planar Filtered Networks (PFNs). We applied MEGENA to a series of simulated data sets and to gene expression data in breast carcinoma and lung adenocarcinoma from The Cancer Genome Atlas (TCGA). MEGENA showed improved performance over well-established clustering methods and co-expression network construction approaches. MEGENA revealed not only meaningful multi-scale organizations of co-expressed gene clusters but also novel targets in breast carcinoma and lung adenocarcinoma.
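The remark about false positives under the maximal planarity constraint can be made concrete with a small edge-count sketch. It relies only on the standard graph-theory fact that a simple planar graph on n >= 3 vertices has at most 3n - 6 edges, so a maximally planar graph such as a PMFG always retains exactly that many edges, whether or not each one reflects a significant co-expression. The function names below are illustrative and are not taken from the MEGENA software.

```python
def complete_edge_count(n: int) -> int:
    """Number of gene pairs (candidate edges) among n genes."""
    return n * (n - 1) // 2


def planar_edge_limit(n: int) -> int:
    """Maximum number of edges in a simple planar graph on n vertices.

    For n >= 3 this is 3n - 6; a maximally planar graph attains it.
    For n < 3 every pair can be connected.
    """
    if n < 3:
        return complete_edge_count(n)
    return 3 * n - 6


if __name__ == "__main__":
    # Illustrative network sizes only; not taken from the paper.
    for n in (100, 1000, 20000):
        pairs = complete_edge_count(n)
        kept = planar_edge_limit(n)
        print(f"n={n}: {pairs} gene pairs, at most {kept} planar edges "
              f"({kept / pairs:.4%} of pairs)")
```

Because a maximal planar graph always fills this edge budget, weakly supported pairs may be kept simply to complete the triangulation; screening the similarities first, as in step i) above, is one way to avoid that.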


Fig 3 (pcbi.1004574.g003): Validation of PFNs in comparison to various network inference methods. A. Comparisons of the AUC of ROC for weighted shortest path distances of networks inferred from simulated data generated from various gold standard networks (labeled on the top), in comparison to ARACNE and RF. Different combinations with Pearson’s correlation coefficient (Pearson), mutual information (MI) and Euclidean distance (Euclid) were tested. B-C. Comparison of BRCA TF knockdown signatures on BRCA PFN (red) and FDRN (green) neighborhoods of the target TFs, inferred from MI. The strips on the top of each plot show the expression fold-change cut-offs (1.3 and 1.5, respectively) used to derive these signatures. B shows FDR-corrected FET p-values against the number of significantly enriched signatures. C shows the enrichment fold-change cut-off against the number of significantly enriched signatures. D-E. Comparisons of BRCA TF knockdown signatures on networks inferred from PCC. D and E correspond to FDR-corrected FET p-values and enrichment fold changes, as in B and C.
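Panels B to E rest on two quantities: an FDR-corrected FET p-value and an enrichment fold change for the overlap between a TF knockdown signature and a network neighborhood. The sketch below shows one common way to compute them, assuming FET denotes a one-sided Fisher's exact test for over-representation and that FDR correction follows the Benjamini-Hochberg procedure; the record itself does not state either choice, and all function and parameter names are hypothetical.

```python
from math import comb


def fet_enrichment_pvalue(overlap: int, neighborhood: int,
                          signature: int, universe: int) -> float:
    """One-sided Fisher's exact test (hypergeometric upper tail).

    Probability of seeing at least `overlap` signature genes in a
    neighborhood of size `neighborhood`, drawn from `universe` genes of
    which `signature` belong to the signature.
    """
    upper = min(neighborhood, signature)
    total = comb(universe, neighborhood)
    tail = sum(
        comb(signature, i) * comb(universe - signature, neighborhood - i)
        for i in range(overlap, upper + 1)
    )
    return tail / total


def enrichment_fold_change(overlap: int, neighborhood: int,
                           signature: int, universe: int) -> float:
    """Observed overlap divided by the overlap expected by chance."""
    expected = neighborhood * signature / universe
    return overlap / expected


def benjamini_hochberg(pvalues: list[float]) -> list[float]:
    """Benjamini-Hochberg adjusted p-values, returned in input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


if __name__ == "__main__":
    # Made-up counts for three signatures; not data from the paper.
    tests = [(12, 50, 200, 15000), (3, 50, 300, 15000), (1, 50, 100, 15000)]
    raw = [fet_enrichment_pvalue(*t) for t in tests]
    for t, p, q in zip(tests, raw, benjamini_hochberg(raw)):
        print(t, f"p={p:.3g}", f"FDR={q:.3g}",
              f"fold={enrichment_fold_change(*t):.2f}")
```

Counting the signatures that pass a given FDR cut-off, or a given fold-change cut-off, would then yield curves of the kind plotted in panels B and C.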

Mentions: As shown in Fig 3A, the PFNs from PCC and MI consistently outperform the RF and ARACNE networks across various FDR thresholds. Table 1 shows the best average AUC-ROC scores, indicating that PFNs from PCC and MI consistently show the best performance across various FDR thresholds, except for the InSilicoSize100-Yeast2 data set, where PFNs are only slightly outperformed by RF-based networks at an FDR threshold of 1. At FDR thresholds of 0.2 or less, which are the ones used in practice in almost all cases, PFNs from PCC and MI show the best overall performance.
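A minimal sketch of how an AUC-ROC score could be derived from weighted shortest path distances is given below. It assumes that each gene pair is scored by its shortest path distance in the inferred network, that pairs linked in the gold standard network count as positives, and that edge weights behave as distances (for example, one minus the similarity). These are assumptions made for illustration; the exact evaluation protocol is not spelled out in this record, and the function names are hypothetical.

```python
import heapq
from itertools import combinations


def shortest_path_lengths(adj: dict, source) -> dict:
    """Dijkstra's algorithm on a weighted adjacency dict {u: {v: weight}}."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in adj.get(u, {}).items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def auc_from_distances(adj: dict, nodes: list, true_edges: list) -> float:
    """AUC-ROC where a shorter inferred distance predicts a true edge.

    Computed as the probability that a gold-standard pair is closer than
    a non-gold-standard pair, with ties counted as one half.
    """
    truth = {frozenset(e) for e in true_edges}
    all_dist = {u: shortest_path_lengths(adj, u) for u in nodes}
    pos, neg = [], []
    for u, v in combinations(nodes, 2):
        d = all_dist[u].get(v, float("inf"))  # unreachable pairs rank last
        (pos if frozenset((u, v)) in truth else neg).append(d)
    if not pos or not neg:
        raise ValueError("need at least one true and one false pair")
    wins = sum((p < n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy inferred network: a path a-b-c-d with unit distances.
    inferred = {
        "a": {"b": 1.0},
        "b": {"a": 1.0, "c": 1.0},
        "c": {"b": 1.0, "d": 1.0},
        "d": {"c": 1.0},
    }
    gold = [("a", "b"), ("b", "c"), ("c", "d")]
    print(auc_from_distances(inferred, ["a", "b", "c", "d"], gold))
```

The pairwise comparison is quadratic in the number of gene pairs, which is adequate for small benchmarks such as 100-node simulated networks; a rank-based formula would be preferable for larger ones.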