Identification of causal genes for complex traits.
Bottom Line: As opposed to association studies that benefit from linkage disequilibrium (LD), the main challenge in identifying causal variants at associated loci lies in distinguishing among the many closely correlated variants due to LD. Through extensive simulations, we demonstrate that our method not only speeds up computation but also has, on average, a 10% higher recall rate than existing approaches. Software is freely available for download at genetics.cs.ucla.edu/caviar.
Affiliation: Department of Computer Science, Inter-Departmental Program in Bioinformatics, Department of Human Genetics and Department of Pathology and Laboratory Medicine, University of California, Los Angeles, CA 90095, USA.
Mentions: To illustrate an application of our method to real data, we use an HDL dataset which was collected for three different mouse strains: the outbred dataset (Zhang et al., 2012), the F2 dataset (van Nas et al., 2009), and the HMDP dataset (Bennett et al., 2010). We ran CAVIAR-Gene on a region ∼80 megabases in length containing 595 genes (chr1: 120,000,000–197,195,432). This region harbors Apoa2, a gene previously established to influence HDL levels (Flint and Eskin, 2012; van Nas et al., 2009). We applied CAVIAR-Gene to the HMDP dataset considering all the genes in this region, which yielded a 95% ρ causal set of 130 genes. Next, we conducted a more refined experiment, using domain-specific knowledge of the phenotype to create a list of 53 potential candidate genes. CAVIAR-Gene selected a 23-gene subset of this list as the ρ causal gene set. Running CAVIAR-Gene on the outbred dataset for all 595 genes resulted in a 95% gene set of only 13 genes. Because the outbred mice have a smaller degree of population structure than the HMDP, the gene set resolution is expected to be greater in this dataset. Most importantly, across all the datasets, CAVIAR-Gene includes Apoa2 in the gene set. Figure 4 illustrates the genes selected by CAVIAR-Gene for each dataset. The five genes common to all the datasets are Nr1i3, Tomm40l, Apoa2, Fcer1g, and Ndufs2. All of these genes are known to be highly associated with HDL. This suggests that CAVIAR-Gene not only recovers the actual causal gene but also reduces the number of genes that need to undergo functional validation.
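The idea of a ρ gene set can be illustrated with a minimal sketch. It assumes each gene already carries a score that can be read as a posterior probability of being causal, and it greedily keeps the highest-scoring genes until their normalized mass reaches ρ. This is not the CAVIAR-Gene implementation: the function name `rho_gene_set`, the gene labels, and the scores are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List


def rho_gene_set(posteriors: Dict[str, float], rho: float = 0.95) -> List[str]:
    """Greedily select genes, highest score first, until the selected
    share of the total score mass is at least rho.

    Illustrative only: `posteriors` holds hypothetical per-gene scores,
    not output produced by CAVIAR-Gene.
    """
    if not 0.0 < rho <= 1.0:
        raise ValueError("rho must be in (0, 1]")
    total = sum(posteriors.values())
    if total <= 0.0:
        raise ValueError("scores must sum to a positive value")

    # Rank genes by descending score.
    ranked = sorted(posteriors.items(), key=lambda kv: kv[1], reverse=True)

    selected: List[str] = []
    mass = 0.0
    for gene, score in ranked:
        if mass >= rho:
            break
        selected.append(gene)
        mass += score / total  # normalize so the scores sum to 1
    return selected


# Made-up scores for four placeholder genes.
example = {"gene_a": 0.60, "gene_b": 0.25, "gene_c": 0.12, "gene_d": 0.03}
print(rho_gene_set(example, rho=0.95))
```

A smaller gene set at the same ρ, as reported for the outbred data relative to the HMDP, corresponds to score mass concentrated on fewer genes.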