Inferring demographic history from a spectrum of shared haplotype lengths.

Harris K, Nielsen R - PLoS Genet. (2013)

Bottom Line: Using the same technique, we analyze deeply sequenced trio parents from the 1000 Genomes project. The data show evidence of extensive gene flow between Africa and Europe after the time of divergence as well as substructure and gene flow among ancestral hominids. In particular, we infer that recent African-European gene flow and ancient ghost admixture into Europe are both necessary to explain the spectrum of IBS sharing in the trios, rejecting simpler models that contain less population structure.


Affiliation: Department of Mathematics, University of California Berkeley, Berkeley, CA, USA. kharris@math.berkeley.edu

ABSTRACT
There has been much recent excitement about the use of genetics to elucidate ancestral history and demography. Whole genome data from humans and other species are revealing complex stories of divergence and admixture that were left undiscovered by previous smaller data sets. A central challenge is to estimate the timing of past admixture and divergence events, for example the time at which Neanderthals exchanged genetic material with humans and the time at which modern humans left Africa. Here, we present a method for using sequence data to jointly estimate the timing and magnitude of past admixture events, along with population divergence times and changes in effective population size. We infer demography from a collection of pairwise sequence alignments by summarizing their length distribution of tracts of identity by state (IBS) and maximizing an analytic composite likelihood derived from a Markovian coalescent approximation. Recent gene flow between populations leaves behind long tracts of identity by descent (IBD), and these tracts give our method power by influencing the distribution of shared IBS tracts. In simulated data, we accurately infer the timing and strength of admixture events, population size changes, and divergence times over a variety of ancient and recent time scales. Using the same technique, we analyze deeply sequenced trio parents from the 1000 Genomes project. The data show evidence of extensive gene flow between Africa and Europe after the time of divergence as well as substructure and gene flow among ancestral hominids. In particular, we infer that recent African-European gene flow and ancient ghost admixture into Europe are both necessary to explain the spectrum of IBS sharing in the trios, rejecting simpler models that contain less population structure.
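The summary statistic described in the abstract, the length distribution of IBS tracts in a pairwise alignment, can be illustrated with a minimal sketch. The function names `ibs_tract_lengths` and `tract_spectrum` are hypothetical and are not taken from the authors' software. The sketch also assumes that runs touching either end of the alignment are counted like any other run and that zero-length tracts between adjacent differences are ignored; the paper's exact conventions may differ.

```python
from collections import Counter


def ibs_tract_lengths(seq_a, seq_b):
    """Return the lengths of maximal runs of identical sites in a pairwise alignment.

    Assumption for illustration: runs at the alignment edges are counted,
    and adjacent mismatches contribute no zero-length tract.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    lengths = []
    run = 0
    for a, b in zip(seq_a, seq_b):
        if a == b:
            run += 1          # extend the current identical-by-state run
        else:
            if run > 0:
                lengths.append(run)
            run = 0           # a mismatch terminates the tract
    if run > 0:
        lengths.append(run)   # close a run that reaches the end of the alignment
    return lengths


def tract_spectrum(seq_a, seq_b):
    """Map each tract length to the number of tracts of that length."""
    return Counter(ibs_tract_lengths(seq_a, seq_b))
```

For example, `tract_spectrum("AACGTTA", "AACCTTG")` finds one tract of length 3 and one of length 2. A real analysis would pool tracts over many pairwise alignments before fitting a demographic model.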

Figure 2. Spectra of IBS sharing between simulated populations that differ only in admixture time. Each of the colored tract spectra in Figure 2A was generated from base pairs of sequence alignment simulated with Hudson's MS [68]. The IBS tracts are shared between two populations of constant size 10,000 that diverged 2,000 generations ago, with one haplotype sampled from each population. 5% of the genetic material from one population is the product of a recent admixture pulse from the other population. Figure 2B illustrates the history being simulated. When the admixture occurred less than 1,000 generations ago, it noticeably increases the abundance of long IBS tracts. The gray lines in 2A are theoretical tract abundance predictions, and fit the simulated data extremely well. To smooth out noise in the simulated data, abundances are averaged over intervals with exponentially spaced endpoints.
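The smoothing step mentioned in the caption, averaging abundances over intervals with exponentially spaced endpoints, might look roughly like the sketch below. The exact endpoints used in the paper did not survive in this text, so the doubling intervals [1, 2), [2, 4), [4, 8), ... are an assumption, and `bin_spectrum` is a hypothetical name.

```python
def bin_spectrum(spectrum, max_length=None):
    """Average tract abundances over doubling intervals [1, 2), [2, 4), [4, 8), ...

    spectrum: mapping from tract length (positive int) to tract count.
    Returns a list of (lo, hi, mean_count_per_length) tuples.
    Assumption: doubling endpoints stand in for the paper's unspecified spacing.
    """
    if not spectrum:
        return []
    if max_length is None:
        max_length = max(spectrum)
    bins = []
    lo = 1
    while lo <= max_length:
        hi = lo * 2
        # Sum the counts of every length in [lo, hi), then divide by the
        # interval width to get a mean abundance per tract length.
        total = sum(spectrum.get(length, 0) for length in range(lo, hi))
        bins.append((lo, hi, total / (hi - lo)))
        lo = hi
    return bins
```

Because long tracts are rare, wide bins at large lengths pool many sparse counts into one average, which is what reduces the noise in the tail of the spectrum.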


Mentions: In the methods section, we derive a formula for the expected length distribution of IBS tracts shared between two DNA sequences from the same population, as well as the length distribution of tracts shared between sequences from diverging populations. Our formula approximates the distribution expected under the SMC' model of Marjoram and Wall [46], which in turn approximates the coalescent with recombination. We evaluate the accuracy of the approximation by simulating data under the full coalescent with recombination and comparing the results to our analytical predictions. In general, we find that the approximations are very accurate as illustrated for two example histories in Figures 2 and 3. To create each plot in Figure 2, we simulated several gigabases of pairwise alignment between populations that split apart 2,000 generations ago and experienced a 5% strength pulse of recent admixture, plotting the IBS tract spectrum of the alignment (for more details, see section 2 of Text S1). Figure 3 was generated by simulating population bottlenecks of varying duration and intensity. In both of these scenarios the analytical approximations closely follow the distributions obtained from full coalescent simulations.

