Challenges in conducting genome-wide association studies in highly admixed multi-ethnic populations: the Generation R Study.

Medina-Gomez C, Felix JF, Estrada K, Peters MJ, Herrera L, Kruithof CJ, Duijts L, Hofman A, van Duijn CM, Uitterlinden AG, Jaddoe VW, Rivadeneira F - Eur. J. Epidemiol. (2015)

Bottom Line: Genome-wide association studies (GWAS) have been successful in identifying loci associated with a wide range of complex human traits and diseases. However, the inclusion of other ethnic groups as well as admixed populations in GWAS studies is rapidly rising following the pressing need to extrapolate findings to non-European populations and to increase statistical power. Furthermore, we highlight a number of practical considerations and alternatives pertinent to the quality control and analysis of admixed GWAS data.


Affiliation: The Generation R Study Group, Erasmus University Medical Center, Rotterdam, The Netherlands.

ABSTRACT
Genome-wide association studies (GWAS) have been successful in identifying loci associated with a wide range of complex human traits and diseases. Up to now, the majority of GWAS have focused on European populations. However, the inclusion of other ethnic groups as well as admixed populations in GWAS studies is rapidly rising following the pressing need to extrapolate findings to non-European populations and to increase statistical power. In this paper, we describe the methodological steps surrounding genetic data generation, quality control, study design and analytical procedures needed to run GWAS in the multiethnic and highly admixed Generation R Study, a large prospective birth cohort in Rotterdam, the Netherlands. Furthermore, we highlight a number of practical considerations and alternatives pertinent to the quality control and analysis of admixed GWAS data.

Fig. 5: Genome-wide association of red-hair pigmentation in the Generation R cohort. (a) Q–Q plot showing the inflation of the test statistics when correction for data structure is not applied (black dots) and the slightly lower power when genomic components correction is applied (red dots) in comparison with the EMMAX model (green dots). (b) Manhattan plot of the red-hair pigmentation GWAS in the Generation R Study using adjustment for genomic components. (c) Manhattan plot of the red-hair pigmentation GWAS in the Generation R Study using a linear mixed model as implemented in EMMAX. (Color figure online)

Mentions: Statistical approaches based on EMMAX and genomic components were tested for two different traits. There is no evidence of major residual population stratification in the GWAS results for red hair color within the Generation R Study (196 children with red hair, 3.4 %; Fig. 5 and Online Resource 8), as gauged by the QQ-plots (no early deviation from the expected test statistic or p value distribution) and by genomic inflation factors (GIF) close to unity for both EMMAX (GIF = 0.994) and genomic components correction (GIF = 0.999). In contrast, when no adjustment for population stratification was applied, a very early (artefactual) deviation was seen in the QQ-plot, erroneously indicating that the vast majority of markers across the genome were associated with red hair pigmentation (Fig. 5). After correction for population stratification, only markers on chromosome 16q24.3 mapping in the vicinity of MC1R reached genome-wide significance (GWS); variants in this gene largely explain the presence of red hair pigmentation [28]. GWAS based on the imputed data gave similar results but showed an even higher number of SNPs underlying the MC1R signal. Furthermore, the leading SNP in these analyses was the missense variant rs1805007 (P < 1 × 10⁻²⁰), previously reported as associated with this trait [29], which was not present in the genotyped data (Online Resource 9). QQ-plots from the skull BMD GWAS show adequate correction for population stratification (Online Resource 10). Power for EMMAX and genomic components is similar in the two tested traits, as gauged by the number of GWS signals and their significance level (Online Resources 8 and 11). Moreover, the effect size of skull BMD associated SNPs is practically identical across the two approaches.
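
The two diagnostics used above, the genomic inflation factor and the QQ-plot, can be illustrated with a minimal sketch. It assumes the conventional definition of the GIF as the median observed 1-df chi-square statistic divided by its expected median under the null (about 0.455), and it is not the authors' pipeline: the function names `p_to_chi2`, `genomic_inflation_factor` and `qq_points` are hypothetical, and only the Python standard library is used.

```python
from math import log10
from statistics import NormalDist, median

_STD_NORMAL = NormalDist()
# Median of a chi-square distribution with 1 degree of freedom (about 0.4549).
CHI2_1DF_MEDIAN = _STD_NORMAL.inv_cdf(0.75) ** 2


def p_to_chi2(p):
    """Convert a two-sided p value in (0, 1] to a 1-df chi-square statistic."""
    # Use the lower tail so that very small p values do not round to 1.0.
    z = _STD_NORMAL.inv_cdf(p / 2.0)
    return z * z


def genomic_inflation_factor(p_values):
    """Median observed chi-square divided by its expected median under the null."""
    chi2 = [p_to_chi2(p) for p in p_values]
    return median(chi2) / CHI2_1DF_MEDIAN


def qq_points(p_values):
    """Return (expected, observed) -log10(p) pairs, smallest p value first."""
    n = len(p_values)
    observed = sorted(p_values)
    # Expected quantiles of a uniform distribution at plotting positions (i + 0.5) / n.
    return [(-log10((i + 0.5) / n), -log10(p)) for i, p in enumerate(observed)]


if __name__ == "__main__":
    # Hypothetical p values: perfectly uniform, i.e. no inflation.
    null_p = [(i + 0.5) / 1001 for i in range(1001)]
    print(round(genomic_inflation_factor(null_p), 3))
```

A value close to 1, as reported for both EMMAX and genomic components correction, indicates that the bulk of the test statistics follows the null distribution. An uncorrected analysis of stratified data would instead yield a value well above 1, and its QQ points would leave the diagonal early.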

