Genome-wide detection of intervals of genetic heterogeneity associated with complex traits.

Llinares-López F, Grimm DG, Bodenham DA, Gieraths U, Sugiyama M, Rowan B, Borgwardt K - Bioinformatics (2015)

Bottom Line: Here, we present an approach that overcomes both problems: it allows one to automatically find all contiguous sequences of single nucleotide polymorphisms in the genome that are jointly associated with the phenotype. We demonstrate on Arabidopsis thaliana genome-wide association study data that our approach can discover regions that exhibit genetic heterogeneity and would be missed by single-locus mapping. Our novel approach can contribute to the genome-wide discovery of intervals that are involved in the genetic heterogeneity underlying complex phenotypes.


Affiliation: Machine Learning and Computational Biology Lab, Department of Biosystems Science and Engineering, ETH Zürich, Basel, Switzerland; The Institute of Scientific and Industrial Research, Osaka University, Osaka, Japan; JST, PRESTO, Japan; and Department of Molecular Biology, Max Planck Institute for Developmental Biology, Tübingen, Germany.


Fig. 5. Proportion of novel intervals among all intervals found by FAIS-WY, across all phenotypes. The green part shows the proportion of novel intervals found by FAIS-WY. The red part (UFE ± 10 kb \ LMM ± 10 kb) comprises intervals that contain a UFE hit, or lie within ±10 kb of one, where the hit could not be found with an LMM. The blue part (LMM ± 10 kb \ UFE ± 10 kb) comprises intervals that contain an LMM hit, or lie within ±10 kb of one, where the hit could not be found with a UFE. The purple part (LMM ± 10 kb ∩ UFE ± 10 kb) comprises intervals that contain, or lie within ±10 kb of, both a hit found with a UFE and a hit found with an LMM.
© Copyright Policy - creative-commons

Mentions: Because our method cannot explicitly correct for confounding due to population structure, we investigated how many of our significant intervals contain, or lie in close proximity (within 10 kb up- or downstream) to, a ‘confounded’ SNP: a SNP found to be significantly associated by UFE (a UFE ‘hit’) but not by an LMM, which is able to correct for population structure. We used a 10 kb window because linkage disequilibrium (LD) decays on average within 10 kb in Arabidopsis thaliana (Kim et al., 2007). We found that only 6.9% (15 intervals) of all significant intervals (217) were close to such a confounded SNP (Fig. 5). Even for the phenotype with the strongest population structure (YEL), only one of the intervals contained such a confounded SNP (Supplementary Fig. S2). Finally, we excluded all intervals that contained or were in close proximity to any significant hit found with a UFE or an LMM. A set of 152 intervals, that is, 70% of all detected intervals, remained (Fig. 5). These can be regarded as truly novel intervals that cannot be detected with a univariate method.
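The proximity filter described above can be illustrated with a minimal sketch. It is not the authors' code: the `Interval` type, the function names, the representation of hits as `(chromosome, position)` pairs, and the toy coordinates are all assumptions made for illustration. Only the 10 kb window and the two definitions (a confounded SNP is a UFE hit that is not an LMM hit; a novel interval is near no UFE or LMM hit) come from the text.

```python
from dataclasses import dataclass

WINDOW_BP = 10_000  # 10 kb up- or downstream, as stated in the text


@dataclass(frozen=True)
class Interval:
    """Hypothetical representation of a significant interval (bp, inclusive)."""
    chrom: str
    start: int
    end: int


def near_any(interval, hits, window=WINDOW_BP):
    """True if any (chrom, pos) hit lies inside the interval or within `window` bp of it."""
    return any(
        chrom == interval.chrom
        and interval.start - window <= pos <= interval.end + window
        for chrom, pos in hits
    )


def is_confounded(interval, ufe_hits, lmm_hits, window=WINDOW_BP):
    """Interval contains or is near a SNP that is a UFE hit but not an LMM hit."""
    return near_any(interval, ufe_hits - lmm_hits, window)


def is_novel(interval, ufe_hits, lmm_hits, window=WINDOW_BP):
    """Interval contains or is near no significant hit from either UFE or LMM."""
    return not near_any(interval, ufe_hits | lmm_hits, window)


# Toy data (invented coordinates, for illustration only).
ufe_hits = {("chr1", 112_000), ("chr2", 100_500)}
lmm_hits = {("chr2", 100_500), ("chr1", 520_000)}
intervals = {
    "A": Interval("chr1", 100_000, 105_000),  # 7 kb from a UFE-only hit
    "B": Interval("chr1", 500_000, 501_000),  # nearest hit is 19 kb away
    "C": Interval("chr2", 100_000, 105_000),  # contains a hit found by both tests
}

labels = {
    name: {
        "confounded": is_confounded(iv, ufe_hits, lmm_hits),
        "novel": is_novel(iv, ufe_hits, lmm_hits),
    }
    for name, iv in intervals.items()
}
```

In this toy example interval A is flagged as confounded, B as novel, and C as neither. The reported proportions follow directly from the counts in the text: 15/217 ≈ 6.9% and 152/217 ≈ 70%.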

