Genome-wide detection of intervals of genetic heterogeneity associated with complex traits.
Bottom Line: Here, we present an approach that overcomes both problems: it allows one to automatically find all contiguous sequences of single nucleotide polymorphisms in the genome that are jointly associated with the phenotype. We demonstrate on Arabidopsis thaliana genome-wide association study data that our approach can discover regions that exhibit genetic heterogeneity and would be missed by single-locus mapping. Our novel approach can contribute to the genome-wide discovery of intervals that are involved in the genetic heterogeneity underlying complex phenotypes.
Affiliation: Machine Learning and Computational Biology Lab, Department of Biosystems Science and Engineering, ETH Zürich, Basel, Switzerland, The Institute of Scientific and Industrial Research, Osaka University, Osaka, Japan, JST, PRESTO, Japan and Department of Molecular Biology, Max Planck Institute for Developmental Biology, Tübingen, Germany.
Mentions: Because our method cannot explicitly correct for confounding due to population structure, we investigated how many of our significant intervals contain, or lie within 10 kb up- or downstream of, a 'confounded' SNP: a SNP found to be significantly associated by UFE (a UFE 'hit') but not by an LMM, which is able to correct for population structure. We used a 10 kb window because linkage disequilibrium (LD) decays on average within 10 kb in Arabidopsis thaliana (Kim et al., 2007). We found that only 6.9% (15 intervals) of all significant intervals (217) were close to such a confounded SNP (Fig. 5). Even for the phenotype with the strongest population structure (YEL), only one of the intervals contained such a confounded SNP (Supplementary Fig. S2). Finally, we excluded all intervals that contained or were in close proximity to any significant hit found with a UFE or an LMM. A set of 152 intervals, that is, 70% of all detected intervals, remained (Fig. 5). These can be regarded as truly novel intervals that cannot be detected with a univariate method.
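The proximity filter described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the data layout (intervals as `(chromosome, start, end)` tuples, hits as per-chromosome sets of base-pair positions) and the toy coordinates are all assumptions made for illustration. Only the 10 kb window and the definition of a confounded SNP (a UFE hit that is not an LMM hit) come from the text.

```python
from bisect import bisect_left

WINDOW_BP = 10_000  # 10 kb window, as used in the text


def near_any_snp(start, end, sorted_positions, window=WINDOW_BP):
    """True if some SNP position lies in [start - window, end + window]."""
    i = bisect_left(sorted_positions, start - window)
    return i < len(sorted_positions) and sorted_positions[i] <= end + window


def filter_intervals(intervals, hits_by_chrom, window=WINDOW_BP):
    """Keep only intervals with no hit inside them or within `window` bp."""
    sorted_hits = {c: sorted(p) for c, p in hits_by_chrom.items()}
    return [
        (chrom, start, end)
        for chrom, start, end in intervals
        if not near_any_snp(start, end, sorted_hits.get(chrom, []), window)
    ]


# Hypothetical toy data, not taken from the paper.
intervals = [
    ("chr1", 50_000, 52_000),
    ("chr1", 100_000, 101_000),
    ("chr2", 5_000, 6_000),
]
ufe_hits = {"chr1": {45_000, 300_000}}
lmm_hits = {"chr1": {300_000}}

# Confounded SNPs: significant under UFE but not under the LMM.
confounded = {c: ufe_hits[c] - lmm_hits.get(c, set()) for c in ufe_hits}

# Intervals that contain or lie close to a confounded SNP.
near_confounded = [
    iv for iv in intervals
    if near_any_snp(iv[1], iv[2], sorted(confounded.get(iv[0], [])))
]

# Final step: drop intervals close to ANY hit (UFE or LMM).
all_hits = {
    c: ufe_hits.get(c, set()) | lmm_hits.get(c, set())
    for c in set(ufe_hits) | set(lmm_hits)
}
novel = filter_intervals(intervals, all_hits)

print(f"{len(near_confounded)} of {len(intervals)} intervals near a confounded SNP")
print(f"{len(novel)} of {len(intervals)} intervals remain after filtering")
```

Sorting the hit positions once per chromosome and using binary search keeps each lookup logarithmic in the number of hits. For the handful of intervals and hits in this toy example, a plain linear scan would serve equally well.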