Multiscale Embedded Gene Co-expression Network Analysis.

Song WM, Zhang B - PLoS Comput. Biol. (2015)

Bottom Line: Gene co-expression network analysis has been shown to be effective in identifying functional co-expressed gene modules associated with complex human diseases. However, existing techniques to construct co-expression networks either require critical prior information, such as a predefined number of clusters or numerical thresholds for defining co-expression/interaction, or do not naturally reproduce the hallmarks of complex systems, such as a scale-free degree distribution or small-worldness. MEGENA showed improved performance over well-established clustering methods and co-expression network construction approaches.


Affiliation: Department of Genetics and Genomic Sciences, Icahn Institute of Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, New York, New York, United States of America.

ABSTRACT
Gene co-expression network analysis has been shown to be effective in identifying functional co-expressed gene modules associated with complex human diseases. However, existing techniques to construct co-expression networks either require critical prior information, such as a predefined number of clusters or numerical thresholds for defining co-expression/interaction, or do not naturally reproduce the hallmarks of complex systems, such as a scale-free degree distribution or small-worldness. Previously, a graph filtering technique called Planar Maximally Filtered Graph (PMFG) has been applied to many real-world data sets, such as financial stock prices and gene expression, to extract meaningful and relevant interactions. However, PMFG is not suitable for large-scale genomic data due to several drawbacks: its high computational complexity of O(|V|^3), the presence of false positives due to the maximal planarity constraint, and the inadequacy of the clustering framework. Here, we developed a new co-expression network analysis framework called Multiscale Embedded Gene Co-expression Network Analysis (MEGENA) by: i) introducing quality control of co-expression similarities, ii) parallelizing embedded network construction, and iii) developing a novel clustering technique to identify multi-scale clustering structures in Planar Filtered Networks (PFNs). We applied MEGENA to a series of simulated data sets and to gene expression data for breast carcinoma and lung adenocarcinoma from The Cancer Genome Atlas (TCGA). MEGENA showed improved performance over well-established clustering methods and co-expression network construction approaches. MEGENA revealed not only meaningful multi-scale organizations of co-expressed gene clusters but also novel targets in breast carcinoma and lung adenocarcinoma.
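
The scale of the filtering problem, and one reason a maximal planarity constraint can admit false positives, can be made concrete with a small count. The sketch below relies on a standard graph-theory fact that the abstract does not state: a simple planar graph on |V| >= 3 nodes has at most 3|V| - 6 edges, and a maximal planar graph has exactly that many. The function names and the 20,000-gene example are illustrative assumptions, not values taken from the paper.

```python
def candidate_pairs(n_genes: int) -> int:
    """Number of distinct gene pairs that could be screened."""
    return n_genes * (n_genes - 1) // 2


def max_planar_edges(n_genes: int) -> int:
    """Upper bound on the edge count of a simple planar graph.

    For fewer than 3 nodes every possible pair fits; for 3 or more
    nodes the bound is 3 * n - 6 (Euler's formula for planar graphs).
    """
    if n_genes < 3:
        return candidate_pairs(n_genes)
    return 3 * n_genes - 6


# Hypothetical data set size, chosen only for illustration.
n = 20000
pairs = candidate_pairs(n)      # 199990000 candidate pairs
kept = max_planar_edges(n)      # at most 59994 edges survive planar filtering
fraction_kept = kept / pairs    # roughly 0.03% of all pairs
```

Because a maximal planar graph always reaches this bound, a maximal filter keeps adding links until 3|V| - 6 are embedded, whether or not the weakest of them are statistically meaningful. A planar filter that is not forced to be maximal can stop earlier, which is consistent with the quality control step described above.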


Fig 1 (pcbi.1004574.g001): Flow chart of MEGENA. A) Fast planar filtered network construction. Significant interactions are first identified and then embedded on a topological surface via a parallelized screening procedure described in the text. On the right, a toy example illustrates the construction of a PFN from a network thresholded by FDR (top left), followed by the gradual construction of the PFN, with the number of included links and screened pairs shown above each stage. B) Multi-scale clustering: beginning with the connected components of the initial PFN as the parent clusters, clustering is performed for each parent cluster and the compactness of the sub-clusters is evaluated. These steps are described in the dotted box. The clustering is performed iteratively until no parent clusters remain that are meaningful to split. C) Downstream analyses: Multiscale Hub Analysis (MHA) is performed to detect significant hubs of individual clusters and across α, characterizing different scales of organization in the PFN. Clusters are then ranked by their associations with clinical traits, including enrichment of differentially expressed gene (DEG) signatures and correlations with survival end-points.
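
Panel B starts from the connected components of the initial PFN, which serve as the first parent clusters. A minimal sketch of that starting point is given below, assuming the PFN is available as a list of node names and a list of undirected edges; the function name and the toy network are hypothetical, and the subsequent splitting and compactness evaluation are not reproduced here.

```python
from collections import deque


def connected_components(nodes, edges):
    """Return the connected components of an undirected graph.

    `nodes` is an iterable of hashable node names and `edges` is an
    iterable of (u, v) pairs whose endpoints all appear in `nodes`.
    Each component is returned as a sorted list of node names.
    """
    adjacency = {node: set() for node in nodes}
    for u, v in edges:
        adjacency[u].add(v)
        adjacency[v].add(u)

    seen = set()
    components = []
    for start in adjacency:
        if start in seen:
            continue
        # Breadth-first search from an unvisited node collects one component.
        seen.add(start)
        queue = deque([start])
        component = []
        while queue:
            node = queue.popleft()
            component.append(node)
            for neighbour in adjacency[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    queue.append(neighbour)
        components.append(sorted(component))
    return components


# Toy network: two separate groups of genes (names are made up).
toy_nodes = ["A", "B", "C", "D", "E"]
toy_edges = [("A", "B"), ("B", "C"), ("D", "E")]
parent_clusters = connected_components(toy_nodes, toy_edges)
# parent_clusters == [["A", "B", "C"], ["D", "E"]]
```

Each list in `parent_clusters` would then be handed to the splitting step, and the loop would continue only for parents whose sub-clusters pass the compactness evaluation.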

Mentions: MEGENA consists of four major steps: 1) Fast Planar Filtered Network Construction (FPFNC), which introduces parallelization, early termination and prior quality control; 2) Multiscale Clustering Analysis (MCA), which introduces a compactness measure of modular structures characterized by a resolution parameter; 3) Multiscale Hub Analysis (MHA), which identifies highly connected hubs of each cluster at each scale; and 4) Cluster-Trait Association Analysis (CTA), which explores the relevance of clusters to clinical outcomes. Fig 1 shows the overall analysis flow of MEGENA. Below we give a brief description of FPFNC, MCA and MHA. The details of these steps are presented in Methods.
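
The idea behind step 3, finding highly connected nodes inside a cluster, can be sketched with a simple within-cluster degree ranking. This is a deliberate simplification and an assumption on our part: the passage says MHA identifies highly connected hubs but does not give the procedure, so the code below only ranks genes by how many links they have to other members of the same cluster and applies no significance test. The function names, gene names and toy edges are hypothetical, and the edge list is assumed to contain each link once.

```python
def within_cluster_degrees(cluster, edges):
    """Count, for each gene in `cluster`, its links to other cluster members."""
    members = set(cluster)
    degree = {gene: 0 for gene in cluster}
    for u, v in edges:
        # Only links with both endpoints inside the cluster are counted.
        if u in members and v in members and u != v:
            degree[u] += 1
            degree[v] += 1
    return degree


def top_hubs(cluster, edges, k=1):
    """Return the k genes with the highest within-cluster degree.

    Ties are broken alphabetically so the result is deterministic.
    """
    degree = within_cluster_degrees(cluster, edges)
    ranked = sorted(degree, key=lambda gene: (-degree[gene], gene))
    return ranked[:k]


# Toy cluster with one link ("g4", "x") leaving the cluster.
toy_cluster = ["g1", "g2", "g3", "g4"]
toy_edges = [("g1", "g2"), ("g1", "g3"), ("g1", "g4"), ("g2", "g3"), ("g4", "x")]
degrees = within_cluster_degrees(toy_cluster, toy_edges)
# degrees == {"g1": 3, "g2": 2, "g3": 2, "g4": 1}
hubs = top_hubs(toy_cluster, toy_edges, k=2)
# hubs == ["g1", "g2"]
```

Running the same ranking on the clusters obtained at each value of the resolution parameter would give one hub list per scale, which is the sense in which the analysis is multiscale.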

