Identity-by-descent-based phasing and imputation in founder populations using graphical models.
Bottom Line: Accurate knowledge of haplotypes, the combination of alleles co-residing on a single copy of a chromosome, enables powerful gene mapping and sequence imputation methods. In this study, we present a new computational model for haplotype phasing based on pairwise sharing of haplotypes inferred to be Identical-By-Descent (IBD). We apply the Bayesian network-based model in a new phasing algorithm, called systematic long-range phasing (SLRP), that can capitalize on the close genetic relationships in isolated founder populations, and show with simulated and real genome-wide genotype data that SLRP substantially reduces the rate of phasing errors compared to previous phasing algorithms.
Affiliation: Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, United Kingdom.
Mentions: To use the model for phasing, we combine the HMMs for all pairs of individuals into a Bayesian network. The network, illustrated in Figure 1, includes observed variables g for the genotypes, hidden variables h for the diplotypes, and hidden variables p for the IBD relationships between pairs of individuals. A diplotype variable h encodes the diplotype of individual a at marker j. The distribution of the observed genotype depends essentially deterministically on the underlying diplotype, while allowing for some noise from the genotyping assay. The network also includes an IBD variable p for each SNP j and each pair of individuals a and b; it encodes the IBD relationship between the two individuals at marker j.
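As a rough illustration of how the three kinds of variables interact at a single marker, the following Python sketch computes a posterior over one individual's diplotype given the genotypes of a pair and their IBD state. It is a toy under stated assumptions, not the SLRP implementation: the uniform prior over diplotypes, the error rate `eps`, the `mismatch` weight, the encoding of the IBD state as `None` or a pair `(i, j)`, and all function names are hypothetical, and the HMM transitions between neighbouring markers are left out entirely.

```python
from itertools import product

# All ordered diplotypes at a biallelic SNP:
# (allele on haplotype 0, allele on haplotype 1), alleles coded 0/1.
DIPLOTYPES = list(product((0, 1), repeat=2))


def genotype_likelihood(g, h, eps=0.01):
    """P(g | h) under an assumed noise model: the genotype equals the
    alternate-allele count of h, except that with probability eps the
    assay reports one of the two other genotypes."""
    return 1.0 - eps if g == h[0] + h[1] else eps / 2.0


def ibd_compatibility(h_a, h_b, p, mismatch=1e-3):
    """Weight of diplotypes h_a and h_b given IBD state p (assumed encoding).
    p is None (no sharing) or (i, j), meaning haplotype i of individual a
    is IBD with haplotype j of individual b."""
    if p is None:
        return 1.0
    i, j = p
    return 1.0 - mismatch if h_a[i] == h_b[j] else mismatch


def diplotype_posterior(g_a, g_b, p, eps=0.01):
    """Posterior over individual a's diplotype at one marker, summing out
    individual b's diplotype. Assumes a uniform prior over diplotypes."""
    weights = {}
    for h_a in DIPLOTYPES:
        total = 0.0
        for h_b in DIPLOTYPES:
            total += (genotype_likelihood(g_a, h_a, eps)
                      * genotype_likelihood(g_b, h_b, eps)
                      * ibd_compatibility(h_a, h_b, p))
        weights[h_a] = total
    z = sum(weights.values())
    return {h: w / z for h, w in weights.items()}


# Individual a is heterozygous (g = 1), individual b is homozygous for the
# alternate allele (g = 2), and haplotype 0 of a is IBD with haplotype 0 of b.
posterior = diplotype_posterior(g_a=1, g_b=2, p=(0, 0))
best = max(posterior, key=posterior.get)
```

In this toy case the shared haplotype must carry the alternate allele, so the ordering `(1, 0)` receives most of the posterior mass, which is the sense in which IBD sharing resolves phase. With `p=None` the two heterozygous orderings remain equally likely.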