Inferring demographic history from a spectrum of shared haplotype lengths.

Harris K, Nielsen R - PLoS Genet. (2013)

Bottom Line: Using the same technique, we analyze deeply sequenced trio parents from the 1000 Genomes project. The data show evidence of extensive gene flow between Africa and Europe after the time of divergence as well as substructure and gene flow among ancestral hominids. In particular, we infer that recent African-European gene flow and ancient ghost admixture into Europe are both necessary to explain the spectrum of IBS sharing in the trios, rejecting simpler models that contain less population structure.


Affiliation: Department of Mathematics, University of California Berkeley, Berkeley, CA, USA. kharris@math.berkeley.edu

ABSTRACT
There has been much recent excitement about the use of genetics to elucidate ancestral history and demography. Whole genome data from humans and other species are revealing complex stories of divergence and admixture that were left undiscovered by previous smaller data sets. A central challenge is to estimate the timing of past admixture and divergence events, for example the time at which Neanderthals exchanged genetic material with humans and the time at which modern humans left Africa. Here, we present a method for using sequence data to jointly estimate the timing and magnitude of past admixture events, along with population divergence times and changes in effective population size. We infer demography from a collection of pairwise sequence alignments by summarizing their length distribution of tracts of identity by state (IBS) and maximizing an analytic composite likelihood derived from a Markovian coalescent approximation. Recent gene flow between populations leaves behind long tracts of identity by descent (IBD), and these tracts give our method power by influencing the distribution of shared IBS tracts. In simulated data, we accurately infer the timing and strength of admixture events, population size changes, and divergence times over a variety of ancient and recent time scales. Using the same technique, we analyze deeply sequenced trio parents from the 1000 Genomes project. The data show evidence of extensive gene flow between Africa and Europe after the time of divergence as well as substructure and gene flow among ancestral hominids. In particular, we infer that recent African-European gene flow and ancient ghost admixture into Europe are both necessary to explain the spectrum of IBS sharing in the trios, rejecting simpler models that contain less population structure.
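
To make the summary statistic concrete, the following is a minimal sketch of how the IBS tract length spectrum of one pairwise alignment could be tabulated. It is an illustration, not the authors' implementation: the function names `ibs_tract_lengths` and `ibs_spectrum` are hypothetical, the two sequences are assumed to be already aligned and of equal length, and runs touching either end of the alignment are counted like interior tracts, which is a simplification.

```python
from collections import Counter


def ibs_tract_lengths(seq_a, seq_b):
    """Return the lengths of maximal runs of identical sites.

    Assumes seq_a and seq_b are aligned strings of equal length.
    Runs at the alignment ends are included (a simplifying assumption).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    lengths = []
    run = 0
    for a, b in zip(seq_a, seq_b):
        if a == b:
            run += 1          # still inside an IBS tract
        else:
            if run > 0:
                lengths.append(run)
            run = 0           # a difference ends the current tract
    if run > 0:
        lengths.append(run)   # tract that reaches the end of the alignment
    return lengths


def ibs_spectrum(seq_a, seq_b):
    """Count how many IBS tracts of each length occur in the alignment."""
    return Counter(ibs_tract_lengths(seq_a, seq_b))


# Toy alignment with differences at positions 3 and 8 (1-based).
toy_lengths = ibs_tract_lengths("ACGTACGTAC", "ACCTACGAAC")
toy_spectrum = ibs_spectrum("ACGTACGTAC", "ACCTACGAAC")
```

For the toy alignment the tracts have lengths 2, 4 and 2. A spectrum pooled over many such alignments is the kind of summary that the composite likelihood described above is fitted to.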

Figure 3 (pgen-1003521-g003). Shared IBS tracts within bottlenecked populations. As in Figure 2, each colored spectrum in Figure 3A was generated by using MS to simulate  base pairs of pairwise alignment. Both sequences are derived from the population depicted in Figure 3B that underwent a bottleneck from size  to size , the duration of the bottleneck being  generations. 1,000 generations ago, the population recovered to size 10,000. These bottlenecks leave similar frequencies of very long and very short IBS tracts because they have identical ratios of strength to duration, but they leave different signature increases compared to the no-bottleneck history in the abundance of –-base IBS tracts. In grey are the expected IBS tract spectra that we predict analytically for each simulated history.
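
One way to read the remark about identical ratios of strength to duration is through the standard coalescent: two lineages in a diploid population of constant size N coalesce at a rate of about 1/(2N) per generation, so the chance that they coalesce during a bottleneck depends only on the ratio of its duration to its size. The sketch below illustrates this with hypothetical numbers, which are not the values used in Figure 3. The function name `coalescence_prob` and the continuous-time approximation are assumptions made for the example.

```python
import math


def coalescence_prob(duration_gens, diploid_size):
    """Approximate probability that two lineages coalesce during a period
    of constant diploid size, using a rate of 1/(2N) per generation."""
    return 1.0 - math.exp(-duration_gens / (2.0 * diploid_size))


# Hypothetical bottlenecks with the same duration-to-size ratio (0.2).
short_strong = coalescence_prob(duration_gens=100, diploid_size=500)
long_weak = coalescence_prob(duration_gens=200, diploid_size=1000)
```

Both hypothetical bottlenecks give the same coalescence probability, about 0.095. The two histories still differ in how far back in time those coalescences are spread, which is consistent with the caption's observation that the spectra diverge for tracts of intermediate length.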

Mentions: In the methods section, we derive a formula for the expected length distribution of IBS tracts shared between two DNA sequences from the same population, as well as the length distribution of tracts shared between sequences from diverging populations. Our formula approximates the distribution expected under the SMC' model of Marjoram and Wall [46], which in turn approximates the coalescent with recombination. We evaluate the accuracy of the approximation by simulating data under the full coalescent with recombination and comparing the results to our analytical predictions. In general, we find that the approximations are very accurate as illustrated for two example histories in Figures 2 and 3. To create each plot in Figure 2, we simulated several gigabases of pairwise alignment between populations that split apart 2,000 generations ago and experienced a 5% strength pulse of recent admixture, plotting the IBS tract spectrum of the alignment (for more details, see section 2 of Text S1). Figure 3 was generated by simulating population bottlenecks of varying duration and intensity. In both of these scenarios the analytical approximations closely follow the distributions obtained from full coalescent simulations.

