Discerning the ancestry of European Americans in genetic association studies.

Price AL, Butler J, Patterson N, Capelli C, Pascali VL, Scarnicci F, Ruiz-Linares A, Groop L, Saetta AA, Korkolopoulou P, Seligsohn U, Waliszewska A, Schirmer C, Ardlie K, Ramos A, Nemesh J, Arbeitman L, Goldstein DB, Reich D, Hirschhorn JN - PLoS Genet. (2007)

Bottom Line: Discerning the ancestry of European Americans genotyped in association studies is important in order to prevent false-positive or false-negative associations due to population stratification and to identify genetic variants whose contribution to disease risk differs across European ancestries. Here, we investigate empirical patterns of population structure in European Americans, analyzing 4,198 samples from four genome-wide association studies to show that components roughly corresponding to northwest European, southeast European, and Ashkenazi Jewish ancestry are the main sources of European American population structure. Building on this insight, we constructed a panel of 300 validated markers that are highly informative for distinguishing these ancestries.


Affiliation: Department of Genetics, Harvard Medical School, Boston, Massachusetts, United States of America. aprice@broad.mit.edu

ABSTRACT
European Americans are often treated as a homogeneous group, but in fact form a structured population due to historical immigration of diverse source populations. Discerning the ancestry of European Americans genotyped in association studies is important in order to prevent false-positive or false-negative associations due to population stratification and to identify genetic variants whose contribution to disease risk differs across European ancestries. Here, we investigate empirical patterns of population structure in European Americans, analyzing 4,198 samples from four genome-wide association studies to show that components roughly corresponding to northwest European, southeast European, and Ashkenazi Jewish ancestry are the main sources of European American population structure. Building on this insight, we constructed a panel of 300 validated markers that are highly informative for distinguishing these ancestries. We demonstrate that this panel of markers can be used to correct for stratification in association studies that do not generate dense genotype data.

Figure 4: The Top Two Axes of Variation of the Height Samples Together with European Samples. Results are based on the 299 markers from our marker panel that are unlinked to the LCT locus. Height samples are labeled according to self-reported grandparental origin: northwest European (Height-NWreport), southeast European (Height-SEreport) or four USA-born grandparents (Height-USAreport).

Mentions: Encouragingly, the panel of 300 markers detects and corrects for stratification in these 368 height samples. We applied the EIGENSTRAT program [9] with default parameters to this dataset, together with ancestral European samples, using the 299 markers unlinked to the candidate LCT locus to infer ancestry and correct for stratification (see Methods). We note that it is important to exclude markers linked to the candidate locus when inferring ancestry using a small number of markers, to avoid a loss in power when correcting for stratification [9]. A plot of the top two axes of variation is displayed in Figure 4, with height samples labeled by self-reported grandparental origin (NW Europe, SE Europe, or four USA-born grandparents) as described in the height study [1]. Unsurprisingly, nearly all Height-NWreport samples lie in cluster 1, which corresponds to northwest European ancestry. More interestingly, nearly all Height-USAreport samples also lie in cluster 1; because clusters 2 and 3 do not seem to be represented in the ancestry of USA-born grandparents of living European Americans, the contribution of these clusters to the ancestry of living European Americans may largely descend from foreign-born grandparents, implying relatively recent immigration. Finally, Height-SEreport samples lie in clusters 1, 2 and 3, indicating that self-reported ancestry does not closely track the genetic ancestry of these samples.
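The procedure described above (infer axes of variation from markers unlinked to the candidate locus, then adjust genotype and phenotype along those axes before testing) can be illustrated with a minimal Python sketch. This is not the EIGENSTRAT program itself: the function names, the simulated allele frequencies, the two-population setup and the simple frequency-based normalization are all assumptions made for illustration, and the statistic follows the general form (N - K - 1) times the squared correlation of the adjusted values.

```python
import numpy as np


def top_axes(genotypes, n_axes=2):
    """Top axes of variation for an (n_samples, n_markers) matrix of 0/1/2 counts."""
    g = np.asarray(genotypes, dtype=float)
    freq = g.mean(axis=0) / 2.0
    keep = (freq > 0.0) & (freq < 1.0)  # monomorphic markers carry no information
    g, freq = g[:, keep], freq[keep]
    z = (g - 2.0 * freq) / np.sqrt(freq * (1.0 - freq))
    # Left singular vectors of z are eigenvectors of the sample covariance z @ z.T.
    u, _, _ = np.linalg.svd(z, full_matrices=False)
    return u[:, :n_axes]


def residualize(values, axes):
    """Remove an intercept and the ancestry axes from a vector by least squares."""
    design = np.column_stack([np.ones(axes.shape[0]), axes])
    coef, *_ = np.linalg.lstsq(design, values, rcond=None)
    return values - design @ coef


def association_statistic(candidate, phenotype, axes):
    """(N - K - 1) * r^2 between ancestry-adjusted genotype and phenotype."""
    n, k = axes.shape
    g = residualize(np.asarray(candidate, dtype=float), axes)
    y = residualize(np.asarray(phenotype, dtype=float), axes)
    r = (g @ y) / np.sqrt((g @ g) * (y @ y))
    return (n - k - 1) * r ** 2


def simulate_demo(seed=0, n_per_pop=200, n_markers=299):
    """Simulated two-population data; every number here is an illustrative assumption."""
    rng = np.random.default_rng(seed)
    pop = np.repeat([0, 1], n_per_pop)
    freq0 = rng.uniform(0.2, 0.8, n_markers)
    freq1 = freq0 + rng.choice([-0.15, 0.15], n_markers)
    freqs = np.vstack([freq0, freq1])
    # Ancestry markers: the candidate marker below is deliberately not among them.
    ancestry = rng.binomial(2, freqs[pop])
    # Candidate marker differs in frequency between populations but has no effect.
    candidate = rng.binomial(2, np.where(pop == 0, 0.2, 0.8))
    # Phenotype depends on population only, so any association is spurious.
    phenotype = pop + rng.normal(0.0, 1.0, pop.size)

    axes = top_axes(ancestry, n_axes=2)
    no_axes = np.empty((pop.size, 0))  # zero axes: only the mean is removed
    uncorrected = association_statistic(candidate, phenotype, no_axes)
    corrected = association_statistic(candidate, phenotype, axes)
    return uncorrected, corrected, axes, pop


uncorrected, corrected, axes, pop = simulate_demo()
print(f"uncorrected statistic: {uncorrected:.2f}")
print(f"corrected statistic:   {corrected:.2f}")
```

In this simulated setting the candidate marker has no effect on the phenotype, so the uncorrected statistic should be inflated purely by the frequency difference between the two populations, and adjusting along the inferred axes should remove most of that inflation. Keeping the candidate marker out of the ancestry matrix mirrors the use of the 299 markers unlinked to the LCT locus.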

