Haplotypes versus genotypes on pedigrees.

Kirkpatrick BB - Algorithms Mol Biol (2011)

Bottom Line: Two algorithms are introduced: an exponential-time hidden Markov model (HMM) for haplotype data where some individuals are untyped, and a linear-time algorithm for pedigrees having haplotype data for all individuals. Having haplotype data on all individuals produces better estimates. However, having several untyped individuals can drastically reduce the utility of haplotype data.


Affiliation: Electrical Engineering and Computer Sciences, University of California Berkeley, Berkeley, CA 94720-1776, USA. bbkirk@eecs.berkeley.edu.

ABSTRACT

Background: Genome sequencing will soon produce haplotype data for individuals. For pedigrees of related individuals, sequencing appears to be an attractive alternative to genotyping. However, methods for pedigree analysis with haplotype data have not yet been developed, and the computational complexity of such problems has been an open question. Furthermore, it is not clear in which scenarios haplotype data would provide better estimates than genotype data for quantities such as recombination rates.

Results: To answer these questions, a reduction is given from genotype problem instances to haplotype problem instances, and it is shown that solving the haplotype problem yields the solution to the genotype problem, up to constant factors or coefficients. The pedigree analysis problems we will consider are the likelihood, maximum probability haplotype, and minimum recombination haplotype problems.

Conclusions: Two algorithms are introduced: an exponential-time hidden Markov model (HMM) for haplotype data where some individuals are untyped, and a linear-time algorithm for pedigrees having haplotype data for all individuals. Recombination estimates from the general haplotype HMM algorithm are compared to recombination estimates produced by a genotype HMM. Having haplotype data on all individuals produces better estimates. However, having several untyped individuals can drastically reduce the utility of haplotype data.




Figure 2: Haplotype Pedigrees. Haplotyped individuals are shaded, and individuals have the same labels. For each of the genotyped individuals, i, from the previous figure, the mapping adds a nuclear family containing five new individuals labeled i0, i1, i2, i3, i4.

Mentions: Let G ⊂ I represent the set of genotyped individuals in a pedigree having individuals I and edges E. We will create a haplotype instance of the problem, with individuals H ∪ I and edges R ∪ E. To obtain the set H, we add five individuals, i0, i1, i2, i3, i4, to H for every individual i ∈ G. The set of new relationship edges, R, will connect individuals in sets H and G. Specifically, the edges stipulate that i and i0 are the parents of full-siblings i1, i2, i3, and i4 by including the edges: i0 → i1, i0 → i2, i0 → i3, i0 → i4, i → i1, i → i2, i → i3, and i → i4. We will refer to these five individuals, i0, i1, i2, i3, and i4, and their relationships with i as the proxy family for individual i. For example, the 6-individual genotype pedigree in Figure 1 becomes a 21-individual haplotype pedigree in Figure 2. In general, the mapping produces a pedigree graph with exactly 5|G| + |I| individuals and 8|G| + |E| edges.
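The proxy-family construction described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the paper's implementation: the function name `add_proxy_families`, the tuple labels `(i, k)` standing in for i0 through i4, and the small example pedigree are all assumptions made here. The example pedigree is hypothetical (it is not claimed to be the one in Figure 1); it has six individuals, three of them genotyped, so that the result has the 21 individuals mentioned in the text.

```python
from typing import Hashable, Iterable, Set, Tuple

Edge = Tuple[Hashable, Hashable]  # directed edge (parent, child)


def add_proxy_families(
    individuals: Iterable[Hashable],
    edges: Iterable[Edge],
    genotyped: Iterable[Hashable],
) -> Tuple[Set[Hashable], Set[Edge]]:
    """Return (H ∪ I, R ∪ E) for the proxy-family mapping.

    Labels (i, 0)..(i, 4) are a hypothetical encoding of i0..i4.
    """
    all_individuals = set(individuals)
    all_edges = set(edges)
    genotyped_set = set(genotyped)
    if not genotyped_set <= all_individuals:
        raise ValueError("every genotyped individual must belong to the pedigree")

    for i in genotyped_set:
        spouse = (i, 0)                            # i0, the second parent
        children = [(i, k) for k in range(1, 5)]   # i1..i4, full siblings
        all_individuals.add(spouse)
        all_individuals.update(children)
        for child in children:
            all_edges.add((spouse, child))         # i0 -> ik
            all_edges.add((i, child))              # i  -> ik
    return all_individuals, all_edges


# Hypothetical 6-individual pedigree (assumption, not taken from Figure 1):
# 1 and 2 are the parents of 3; 3 and 4 are the parents of 5 and 6.
I = {1, 2, 3, 4, 5, 6}
E = {(1, 3), (2, 3), (3, 5), (4, 5), (3, 6), (4, 6)}
G = {3, 5, 6}

all_individuals, all_edges = add_proxy_families(I, E, G)
print(len(all_individuals), len(all_edges))  # 21 30
```

The counts follow the formulas in the text: 5|G| + |I| = 5·3 + 6 = 21 individuals and 8|G| + |E| = 8·3 + 6 = 30 edges.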

