Inferring human colonization history using a copying model.

Hellenthal G, Auton A, Falush D - PLoS Genet. (2008)

Bottom Line: We apply our model to the SNP data for the 53 populations of the Human Genome Diversity Project described in Conrad et al. (Nature Genetics 38, 1251-60, 2006). The results suggest novel details including: (1) the most northerly East Asian population in the sample (Yakut) has received a significant genetic contribution from the ancestors of the most northerly European one (Orcadian). (2) Native North [corrected] Americans have received ancestry from a source closely related to modern North-East Asians (Mongolians and Oroquen) that is distinct from the sources for native South [corrected] Americans, implying multiple waves of migration into the Americas. A detailed depiction of the peopling of the world is available in animated form.


Affiliation: Department of Statistics, University of Oxford, Oxford, United Kingdom.

ABSTRACT
Genome-wide scans of genetic variation can potentially provide detailed information on how modern humans colonized the world but require new methods of analysis. We introduce a statistical approach that uses Single Nucleotide Polymorphism (SNP) data to identify sharing of chromosomal segments between populations and uses the pattern of sharing to reconstruct a detailed colonization scenario. We apply our model to the SNP data for the 53 populations of the Human Genome Diversity Project described in Conrad et al. (Nature Genetics 38,1251-60, 2006). Our results are consistent with the consensus view of a single "Out-of-Africa" bottleneck and serial dilution of diversity during global colonization, including a prominent East Asian bottleneck. They also suggest novel details including: (1) the most northerly East Asian population in the sample (Yakut) has received a significant genetic contribution from the ancestors of the most northerly European one (Orcadian). (2) Native North [corrected] Americans have received ancestry from a source closely related to modern North-East Asians (Mongolians and Oroquen) that is distinct from the sources for native South [corrected] Americans, implying multiple waves of migration into the Americas. A detailed depiction of the peopling of the world is available in animated form.
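
The abstract does not spell out the model itself, so the following is only a minimal sketch of the general idea of a copying model, not the authors' implementation. It assumes a generic hidden Markov model in which a recipient haplotype is treated as a mosaic of donor haplotypes, with a fixed per-site probability of switching donor and a fixed probability of a mismatch between recipient and donor. The function name and the parameters `switch_prob` and `mismatch_prob` are illustrative assumptions.

```python
import math


def copying_log_likelihood(recipient, donors, switch_prob=0.05, mismatch_prob=0.01):
    """Log-likelihood of `recipient` as a mosaic of `donors` (scaled forward algorithm).

    All parameter values are illustrative assumptions, not estimates from the paper.
    """
    n_donors = len(donors)

    def emission(donor, site):
        # Probability of the recipient allele given the donor allele being copied.
        return 1.0 - mismatch_prob if donor[site] == recipient[site] else mismatch_prob

    # Uniform prior over which donor is copied at the first site.
    forward = [emission(donor, 0) / n_donors for donor in donors]
    log_lik = 0.0

    for site in range(1, len(recipient)):
        # Rescale to avoid numerical underflow and accumulate the log scale factor.
        scale = sum(forward)
        log_lik += math.log(scale)
        forward = [value / scale for value in forward]
        # Either keep copying the same donor or switch to a uniformly chosen donor.
        forward = [
            emission(donors[k], site)
            * ((1.0 - switch_prob) * forward[k] + switch_prob / n_donors)
            for k in range(n_donors)
        ]

    return log_lik + math.log(sum(forward))


# Toy data: two donor haplotypes coded as 0/1 alleles at six SNPs.
donors = [[0, 0, 0, 0, 1, 1], [1, 1, 1, 1, 0, 0]]
long_shared_segment = [0, 0, 0, 0, 1, 1]   # copies donor 0 along its whole length
fragmented = [0, 1, 0, 1, 0, 1]            # would need a switch or mismatch at most sites

print(copying_log_likelihood(long_shared_segment, donors))
print(copying_log_likelihood(fragmented, donors))
```

In this toy setting the haplotype that shares one long segment with a donor receives a higher likelihood than the fragmented one, which is the kind of signal a copying approach can use to detect shared chromosomal segments between populations.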

Figure 2 (pgen-1000078-g002): Simulations description and results. (a) and (c) A graphical representation of the simulation parameters. The initial colonization times for each of populations B-E are denoted with dashed lines, with the times t provided on the right in units of generations. Each rectangle represents the demography of one of populations A-E, as labeled, with the rectangle width scaled by the population size at time t. The arrows represent the sources of colonization for populations B-E, each pointing from source population to sink population, with the widths of the arrows pointing into D and E roughly proportional to the proportion of genetic material coming from that source. (b) and (d) A graphical representation of typical examples of the results of our model applied to the simulated data, showing inferred ordering and sources for each population (black arrows). The widths of the rectangles are proportional to the number of sampled individuals for each population, and the thickness of each arrow shaft indicates how many of those chromosomes act as donors for subsequent populations.

Mentions: We tested our inference method using data simulated under a coalescent model [13],[14], with individuals sampled from five populations, labelled A-E, that were generated by sequential bottlenecks (Figure 2-(a)). Parameters were guided by previous demographic estimates [15], with the first bottleneck approximately corresponding to the “Out of Africa” event. In 10 independent realisations of the same scenario (5 with simulated recombinational hotspots, 5 without), the model correctly inferred both the order in which the populations were founded and which populations gave rise to each new one (Figure 2-(b)) and did not infer any additional, spurious sources of ancestry. We then complicated the model by giving populations D and E ancestry from two sources (Figure 2-(c)). The model continued to infer the correct ordering for the formation of the populations and correctly identified the single sources for populations B and C and the two sources for population E in every case. However, in 7 of the 10 simulations, the ancestry of population D was inferred incorrectly, with the model either failing to include population A as an ancestor (as shown in Figure 2-(d)), mistakenly including population B, or both (Table S1). We conclude that, at least for relatively simple scenarios, the model provides an accurate indication of historical relationships between populations but does not always correctly identify minority sources of ancestry, in particular when admixture is ancient.
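
The two kinds of error described for population D (a true source left out, a false source included) can be made concrete with a small comparison of source sets. This is only an illustrative sketch: the function name is hypothetical, and the true source set used for D, {A, C}, is an assumption made for the example. The text states only that A is a true source of D and that B is not.

```python
def compare_sources(true_sources, inferred_sources):
    """Compare the true and inferred source populations for one sink population."""
    true_set = set(true_sources)
    inferred_set = set(inferred_sources)
    return {
        "missing": sorted(true_set - inferred_set),    # true sources the model left out
        "spurious": sorted(inferred_set - true_set),   # inferred sources that are not real
        "correct": true_set == inferred_set,
    }


# ASSUMPTION: D's true sources are taken to be A and C purely for illustration.
true_sources_d = {"A", "C"}

print(compare_sources(true_sources_d, {"C"}))       # fails to include A
print(compare_sources(true_sources_d, {"A", "B", "C"}))  # mistakenly includes B
print(compare_sources(true_sources_d, {"B", "C"}))  # both errors at once
```

Keeping missing and spurious sources separate mirrors the distinction drawn in the text, where the model could fail to include population A, mistakenly include population B, or do both.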

