DAWN: a framework to identify autism genes and subnetworks using gene expression and genetics.

Liu L, Lei J, Sanders SJ, Willsey AJ, Kou Y, Cicek AE, Klei L, Lu C, He X, Li M, Muhle RA, Ma'ayan A, Noonan JP, Sestan N, McFadden KA, State MW, Buxbaum JD, Devlin B, Roeder K - Mol Autism (2014)

Bottom Line: Validation experiments making use of published targeted resequencing results demonstrate DAWN's efficacy in reliably predicting ASD genes. DAWN also successfully predicts known ASD genes not included in the genetic data used to create the model. Validation studies demonstrate that DAWN is effective in predicting ASD genes and subnetworks by leveraging genetic and gene expression data.


Affiliation: Department of Statistics, Carnegie Mellon University, Pittsburgh, PA, USA. roeder@stat.cmu.edu.

ABSTRACT

Background: De novo loss-of-function (dnLoF) mutations are found twofold more often in autism spectrum disorder (ASD) probands than in their unaffected siblings. Multiple independent dnLoF mutations in the same gene implicate the gene in risk and hence provide a systematic, albeit arduous, path forward for ASD genetics. It is likely that using additional non-genetic data will enhance the ability to identify ASD genes.

Methods: To accelerate the search for ASD genes, we developed a novel algorithm, DAWN, to model two kinds of data: rare variations from exome sequencing and gene co-expression in the mid-fetal prefrontal and motor-somatosensory neocortex, a critical nexus for risk. The algorithm casts the ensemble data as a hidden Markov random field in which the graph structure is determined by gene co-expression, and it combines these interrelationships with node-specific observations, namely gene identity, expression, genetic data and the estimated effect on risk.
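
The hidden Markov random field idea can be illustrated with a minimal sketch. It assumes an Ising-type prior over binary risk labels, Gaussian node scores, and an iterated conditional modes update. The function name, the parameters b, c, mu and sd, and the toy scores are hypothetical and fixed by hand here; none of them is taken from the DAWN implementation.

```python
def icm_hidden_mrf(z, edges, b=-2.0, c=2.0, mu=2.5, sd=1.0, sweeps=20):
    """Label each gene 1 (risk) or 0 (non-risk) on a co-expression graph.

    z      : per-gene evidence scores (assumed N(0, 1) for non-risk genes
             and N(mu, sd^2) for risk genes; an illustrative assumption)
    edges  : pairs (i, j) of co-expressed genes
    b, c   : hypothetical Ising parameters (baseline and neighbour coupling)
    """
    n = len(z)
    neighbors = [[] for _ in range(n)]
    for i, j in edges:
        neighbors[i].append(j)
        neighbors[j].append(i)

    def log_lik_ratio(x):
        # log N(x; mu, sd^2) - log N(x; 0, 1), with constants cancelled
        import math
        return (-math.log(sd)
                - (x - mu) ** 2 / (2.0 * sd * sd)
                + x * x / 2.0)

    # Start from a simple threshold on the node scores.
    labels = [1 if x > 2.0 else 0 for x in z]

    for _ in range(sweeps):
        changed = False
        for i in range(n):
            # Number of neighbours currently labelled as risk genes.
            support = sum(labels[j] for j in neighbors[i])
            # Conditional log-odds of label 1 given neighbours and score.
            log_odds = b + c * support + log_lik_ratio(z[i])
            new_label = 1 if log_odds > 0 else 0
            if new_label != labels[i]:
                labels[i] = new_label
                changed = True
        if not changed:
            break
    return labels


# Toy example: genes 1 and 3 have the same modest score, but only gene 1
# is co-expressed with the strongly supported gene 0.
toy_labels = icm_hidden_mrf([3.0, 1.5, 0.1, 1.5], [(0, 1)])
```

In this toy setting, a gene with modest genetic evidence is labelled as a risk gene only when it is connected to a strongly supported neighbour, which is the kind of borrowing of strength across the co-expression graph that the Methods describe.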

Results: Using currently available genetic data and a specific developmental time period for gene co-expression, DAWN identified 127 genes that plausibly affect risk, and a set of likely ASD subnetworks. Validation experiments making use of published targeted resequencing results demonstrate its efficacy in reliably predicting ASD genes. DAWN also successfully predicts known ASD genes that were not included in the genetic data used to create the model.

Conclusions: Validation studies demonstrate that DAWN is effective in predicting ASD genes and subnetworks by leveraging genetic and gene expression data. The findings reported here implicate neurite extension and neuronal arborization as risks for ASD. Using DAWN on emerging ASD sequence data and gene expression data from other brain regions and tissues would likely identify novel ASD genes. DAWN can also be used for other complex disorders to identify genes and subnetworks in those disorders.



Figure 4: Clustering by enrichment and protein-protein interaction (PPI). The rASD genes are seeded into the PPI network presented in [6], represented by red nodes, with size proportional to the number of connections. The blue nodes are immediate intermediate proteins [36]. The network was clustered using organic clustering methods implemented in yEd [44]. rASD, risk autism spectrum disorder.

Mentions: Next we reasoned that if the rASD list were meaningful, it should be enriched for biologically meaningful, ASD-relevant processes. We focused on PPI networks, which are independent of the co-expression networks we analyzed but in which interacting genes are expected to have correlated expression. In addition to forming a highly significant network of interacting genes (Additional file 12: Figure S6), the rASD genes in the PPI network fall into several natural clusters (Figure 4). Clusters C1, C2 and C4, accounting for a large proportion of the genes, share related functional categories. Specifically, these three clusters are involved in transcriptional regulation (see the GO BP and GO MF categories in Additional file 13: Figure S7). Cluster C2 is additionally enriched for chromatin remodeling terms in GO BP, while cluster C4 is enriched for RNA polymerase II-related categories in GO MF. Additionally, cluster C7 relates to regulation of translation, as seen in both GO BP and GO MF. Together these results show that dysregulation of gene expression and coordinated co-expression is a key risk factor for ASD, and they further suggest that dysregulation has an effect early in development. Dysregulation of coordinated gene expression is consistent with a wide range of ASD studies [43].
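
To make the notion of a cluster being "enriched" for a GO category concrete, here is a minimal sketch of a one-sided hypergeometric enrichment test. The text above does not say which enrichment statistic or tool was used, so the choice of test, the function name and the toy counts are all assumptions made for illustration only.

```python
from math import comb


def hypergeom_enrichment_p(population, annotated, cluster_size, overlap):
    """P(X >= overlap) when drawing cluster_size genes without replacement
    from a population in which `annotated` genes carry the GO term.

    All arguments are hypothetical counts supplied by the caller.
    """
    upper = min(annotated, cluster_size)
    total = comb(population, cluster_size)
    tail = sum(
        comb(annotated, k) * comb(population - annotated, cluster_size - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Toy counts (not from the paper): 10 background genes, 3 annotated with a
# term, and a cluster of 3 genes that are all annotated.
p_value = hypergeom_enrichment_p(population=10, annotated=3,
                                 cluster_size=3, overlap=3)
```

A small p-value indicates that the cluster contains more genes with the annotation than would be expected from a random draw of the same size. In practice such p-values would also need correction for the number of GO terms tested.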

