Simple models of genomic variation in human SNP density.

Sainudiin R, Clark AG, Durrett RT - BMC Genomics (2007)

Bottom Line: Descriptive hierarchical Poisson models and population-genetic coalescent mixture models are used to describe the observed variation in single-nucleotide polymorphism (SNP) density from samples of size two across the human genome. Using empirical estimates of recombination rate across the human genome and the observed SNP density distribution, we produce a maximum likelihood estimate of the genomic heterogeneity in the scaled mutation rate theta. Accounting for mutational and recombinational heterogeneities can allow for empirically sound distributions in genome scans for "outliers", when the alternative hypotheses include fundamentally historical and unobserved phenomena.


Affiliation: Department of Statistics, University of Oxford, Oxford, UK. sainudii@stats.ox.ac.uk

ABSTRACT

Background: Descriptive hierarchical Poisson models and population-genetic coalescent mixture models are used to describe the observed variation in single-nucleotide polymorphism (SNP) density from samples of size two across the human genome.

Results: Using empirical estimates of recombination rate across the human genome and the observed SNP density distribution, we produce a maximum likelihood estimate of the genomic heterogeneity in the scaled mutation rate theta. Such models produce significantly better fits to the observed SNP density distribution than those that ignore the empirically observed recombinational heterogeneities.

Conclusion: Accounting for mutational and recombinational heterogeneities can allow for empirically sound distributions in genome scans for "outliers", when the alternative hypotheses include fundamentally historical and unobserved phenomena.
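To see why recombination matters for the shape of the SNP density distribution, it may help to compare two textbook limiting cases for a sample of size two under the standard neutral coalescent. These limits are not stated on this page and are offered only as an illustrative sketch: with no recombination the number of SNPs in a window is geometric with mean θ, and with free recombination between sites it is approximately Poisson with mean θ. The function names and the value θ = 90 (borrowed from the Figure 4 caption) are choices made for this example.

```python
import math


def poisson_pmf(k, theta):
    """P(S = k) when sites are effectively independent (free recombination)."""
    return math.exp(k * math.log(theta) - theta - math.lgamma(k + 1))


def geometric_pmf(k, theta):
    """P(S = k) for two sequences sharing one coalescence time (no recombination)."""
    return (1.0 / (1.0 + theta)) * (theta / (1.0 + theta)) ** k


def mean_and_variance(pmf, theta, kmax=5000):
    """Moments from a truncated sum; kmax is assumed large enough for theta near 90."""
    mean = sum(k * pmf(k, theta) for k in range(kmax))
    var = sum((k - mean) ** 2 * pmf(k, theta) for k in range(kmax))
    return mean, var


if __name__ == "__main__":
    theta = 90.0  # illustrative value taken from the Figure 4 caption
    for name, pmf in (("Poisson", poisson_pmf), ("geometric", geometric_pmf)):
        mean, var = mean_and_variance(pmf, theta)
        print(f"{name:9s} mean={mean:.1f} variance={var:.1f}")
```

Both limits share the mean θ, but the variance is θ in the Poisson case and θ + θ² in the geometric case, so the amount of recombination in a window strongly affects how dispersed the SNP counts are expected to be.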


Figure 4: The SNP density distribution (joined gray dots), Poisson distribution with mean 90 (large gray dots), simulated distribution of SNPs with ρ = θ = 90 (gray line), and the maximum simulated likelihood estimate from the coalescent simulations with ρ and θi each drawn from a distribution (black line).

Mentions: We used Newton's method to find the maximum simulated likelihood (MSL) estimates of the two fitted parameters, 6.7 and 14.9 (MSL = -185555). We also did a least-squares fit of the observed to the predicted densities and found comparable estimates. Empirical estimates of the sex-averaged recombination rates from the deCODE and Marshfield maps were also used in a similar analysis. Comparable estimates were obtained under a reasonably good fit (MSL = -185558) with the deCODE map, whose empirical CDF resembles that of the Genethon map. However, an analysis with the Marshfield map yielded a poorer fit (MSL = -186007). Figure 4 summarizes the fits to the observed SNP data, while Figure 3 shows the marginal density of ρ from the Genethon map and the marginal density of θ under the maximum simulated likelihood estimates (6.7, 14.9), with mean, variance, and standard deviation given by 90.7, 876.1, and 29.6, respectively. Among the three coarse-scaled maps of the empirical estimates of the sex-averaged human recombination rates, the Genethon map gave the best fit to our observed SNP density distribution data.
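As a rough illustration of how heterogeneity in θ widens the SNP count distribution relative to a single Poisson, the sketch below simulates a descriptive hierarchical Poisson model. It is not the coalescent simulation used in the article. The gamma distribution for the per-window rate is an assumption made here, moment-matched to the mean (90.7) and variance (876.1) quoted above, and all function names are hypothetical.

```python
import math
import random
import statistics

THETA_MEAN = 90.7   # mean of theta quoted in the text
THETA_VAR = 876.1   # variance of theta quoted in the text

# Assumption: a gamma distribution moment-matched to the quoted mean and variance.
GAMMA_SHAPE = THETA_MEAN ** 2 / THETA_VAR
GAMMA_SCALE = THETA_VAR / THETA_MEAN


def poisson_draw(lam, rng):
    """Draw one Poisson(lam) variate by Knuth's multiplication method."""
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def simulate_counts(n_windows, rng):
    """Return (plain, mixed): counts with a fixed rate and with gamma-distributed rates."""
    plain = [poisson_draw(THETA_MEAN, rng) for _ in range(n_windows)]
    mixed = [
        poisson_draw(rng.gammavariate(GAMMA_SHAPE, GAMMA_SCALE), rng)
        for _ in range(n_windows)
    ]
    return plain, mixed


if __name__ == "__main__":
    rng = random.Random(2007)
    plain, mixed = simulate_counts(20000, rng)
    for name, counts in (("fixed rate", plain), ("gamma-mixed rate", mixed)):
        print(name, round(statistics.mean(counts), 1), round(statistics.variance(counts), 1))
```

By the law of total variance, the mixed counts have variance equal to the mean rate plus the variance of the rate, here 90.7 + 876.1 = 966.8, compared with 90.7 for the fixed-rate Poisson. Note also that the quoted standard deviation, 29.6, is the square root of the quoted variance, 876.1.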

