Discerning the ancestry of European Americans in genetic association studies.

Price AL, Butler J, Patterson N, Capelli C, Pascali VL, Scarnicci F, Ruiz-Linares A, Groop L, Saetta AA, Korkolopoulou P, Seligsohn U, Waliszewska A, Schirmer C, Ardlie K, Ramos A, Nemesh J, Arbeitman L, Goldstein DB, Reich D, Hirschhorn JN - PLoS Genet. (2007)

Bottom Line: Discerning the ancestry of European Americans genotyped in association studies is important in order to prevent false-positive or false-negative associations due to population stratification and to identify genetic variants whose contribution to disease risk differs across European ancestries. Here, we investigate empirical patterns of population structure in European Americans, analyzing 4,198 samples from four genome-wide association studies to show that components roughly corresponding to northwest European, southeast European, and Ashkenazi Jewish ancestry are the main sources of European American population structure. Building on this insight, we constructed a panel of 300 validated markers that are highly informative for distinguishing these ancestries.


Affiliation: Department of Genetics, Harvard Medical School, Boston, Massachusetts, United States of America. aprice@broad.mit.edu

ABSTRACT
European Americans are often treated as a homogeneous group, but in fact form a structured population due to historical immigration of diverse source populations. Discerning the ancestry of European Americans genotyped in association studies is important in order to prevent false-positive or false-negative associations due to population stratification and to identify genetic variants whose contribution to disease risk differs across European ancestries. Here, we investigate empirical patterns of population structure in European Americans, analyzing 4,198 samples from four genome-wide association studies to show that components roughly corresponding to northwest European, southeast European, and Ashkenazi Jewish ancestry are the main sources of European American population structure. Building on this insight, we constructed a panel of 300 validated markers that are highly informative for distinguishing these ancestries. We demonstrate that this panel of markers can be used to correct for stratification in association studies that do not generate dense genotype data.

Figure 1: The Top Two Axes of Variation of MS, BD, PD, and IBD Datasets. (A) MS dataset, (B) BD dataset, (C) PD dataset, (D) IBD dataset, (E) IBD dataset with samples labeled according to self-reported ancestry (see Methods): northwest European (IBD-NWreport), southeast European (IBD-SEreport) or Ashkenazi Jewish (IBD-AJreport), with individuals having unknown or mixed European ancestry and not self-reporting as Ashkenazi Jewish (IBD-noreport) not displayed.

Mentions: To investigate whether we could identify consistent patterns of European American population structure, we analyzed four European American datasets involving a total of 4,198 samples. These samples were genotyped on the Affymetrix GeneChip 500K or Illumina HumanHap300 marker sets in the context of genome-wide association studies for multiple sclerosis (MS), bipolar disorder (BD), Parkinson's disease (PD) and inflammatory bowel disease (IBD) (see Methods). For each dataset, we used the EIGENSOFT package to identify principal components describing the most variation in the data [11]. The top two principal components for each dataset are displayed in Figure 1. Strikingly, the results are very similar for each dataset, and are similar to our previous results on a smaller dataset involving the Affymetrix GeneChip 100K marker set [9], suggesting that the main sources of population structure are roughly consistent across European American sample sets.
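The kind of analysis described above can be illustrated with a minimal sketch. It is not the EIGENSOFT implementation: it assumes a complete samples-by-markers matrix of 0/1/2 allele counts, scales each marker by its estimated allele frequency (a common convention in genotype PCA), and takes the top axes from a singular value decomposition. The function name `top_axes_of_variation`, the simulated two-group data, and all parameter values are hypothetical and chosen only for illustration.

```python
import numpy as np


def top_axes_of_variation(genotypes, n_axes=2):
    """Return per-sample scores on the top principal axes.

    genotypes: (samples x markers) array of 0/1/2 allele counts.
    Assumption: no missing genotypes; real data needs missing-data handling.
    """
    g = np.asarray(genotypes, dtype=float)

    # Estimated allele frequency per marker; drop markers with no variation.
    freq = g.mean(axis=0) / 2.0
    keep = (freq > 0.0) & (freq < 1.0)
    g, freq = g[:, keep], freq[keep]

    # Center each marker and scale by its binomial standard deviation.
    z = (g - 2.0 * freq) / np.sqrt(freq * (1.0 - freq))

    # Left singular vectors scaled by singular values give the sample scores.
    u, s, _ = np.linalg.svd(z, full_matrices=False)
    scores = u[:, :n_axes] * s[:n_axes]
    explained = s[:n_axes] ** 2 / np.sum(s ** 2)
    return scores, explained


# Simulated example (hypothetical data): two groups whose allele
# frequencies differ slightly at every marker.
rng = np.random.default_rng(0)
n_per_group, n_markers = 50, 500
base = rng.uniform(0.2, 0.8, n_markers)
shift = rng.normal(0.0, 0.1, n_markers)
freq_a = np.clip(base + shift, 0.05, 0.95)
freq_b = np.clip(base - shift, 0.05, 0.95)
geno = np.vstack([
    rng.binomial(2, freq_a, size=(n_per_group, n_markers)),
    rng.binomial(2, freq_b, size=(n_per_group, n_markers)),
])

scores, explained = top_axes_of_variation(geno)
print("fraction of variance on top two axes:", explained)
print("group means on axis 1:",
      scores[:n_per_group, 0].mean(), scores[n_per_group:, 0].mean())
```

Plotting the two columns of `scores` against each other gives a scatter of the same kind as the panels in Figure 1. In this simulation the two groups fall on opposite sides of the first axis, because the markers are centered and the groups are of equal size.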

