Global haplotype partitioning for maximal associated SNP pairs.

Katanforoush A, Sadeghi M, Pezeshk H, Elahi E - BMC Bioinformatics (2009)

Bottom Line: By this test, each SNP pair is characterized as associated, independent, or not-statistically-significant. We set limits on the maximum acceptable proportion of independent pairs within all blocks and search for the partitioning with maximal proportion of associated SNP pairs. This approach presents a native design for dimension reduction in genome-wide association studies.

Affiliation: Institute of Biochemistry and Biophysics, University of Tehran, Tehran, Iran. katanfor@ibb.ut.ac.ir

ABSTRACT

Background: Global partitioning based on pairwise associations of SNPs has not previously been used to define haplotype blocks within genomes. Here, we define an association index based on linkage disequilibrium (LD) between SNP pairs. We use Fisher's exact test to assess the statistical significance of the LD estimator. By this test, each SNP pair is characterized as associated, independent, or not-statistically-significant. We set limits on the maximum acceptable proportion of independent pairs within all blocks and search for the partitioning with the maximal proportion of associated SNP pairs. Essentially, this model reduces to a constrained optimization problem, the solution of which is obtained by iterating a dynamic programming algorithm.
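As a rough illustration of the pairwise test mentioned above, the following sketch computes a two-sided Fisher's exact p-value for the 2x2 table of haplotype counts at two biallelic SNPs. The function names, the table layout and the 0.05 threshold are assumptions made for illustration. The abstract does not state the rule that separates "independent" from "not-statistically-significant" pairs, so the sketch only flags pairs as associated or not.

```python
from math import comb


def fisher_exact_two_sided(n11, n12, n21, n22):
    """Two-sided Fisher's exact p-value for a 2x2 table of haplotype counts.

    Rows index the allele at the first SNP, columns the allele at the second.
    """
    row1, row2 = n11 + n12, n21 + n22
    col1 = n11 + n21
    denom = comb(row1 + row2, col1)

    def prob(k):
        # Hypergeometric probability of k in the top-left cell, margins fixed.
        return comb(row1, k) * comb(row2, col1 - k) / denom

    p_obs = prob(n11)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum over all tables that are no more probable than the observed one.
    total = sum(p for p in (prob(k) for k in range(lo, hi + 1))
                if p <= p_obs * (1 + 1e-9))
    return min(1.0, total)


def is_associated(table, alpha=0.05):
    """Hypothetical helper: flag a SNP pair as associated when p < alpha."""
    return fisher_exact_two_sided(*table) < alpha
```

For example, `is_associated((10, 0, 0, 10))` flags a pair whose alleles always co-occur, whereas `is_associated((5, 5, 5, 5))` does not.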

Results: In comparison with other methods, our algorithm reports blocks of larger average size. Nevertheless, the haplotype diversity within the blocks is captured by a small number of tagSNPs. Resampling HapMap haplotypes under a block-based model of recombination showed that our algorithm is robust in reproducing the same partitioning for recombinant samples. In simulations of a case-control association study aimed at mapping a single-locus trait, evaluated by a block-based statistical test, our algorithm performed better than previously reported models. Compared with other methods of haplotype block partitioning, ours performed best at detecting recombination hotspots.

Conclusion: Our proposed method divides chromosomes into regions within which allelic associations of SNP pairs are maximized. This approach presents a native design for dimension reduction in genome-wide association studies. Our results show that the pairwise allelic association of SNPs can describe various features of genomic variation, in particular recombination hotspots.
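The idea of dividing a chromosome into regions that maximize associated pairs can be sketched with a small dynamic program. This is a simplified illustration and not the authors' algorithm: it applies the limit on independent pairs to each block separately, maximizes the count (rather than the proportion) of associated pairs, and runs a single pass instead of iterating. The name `partition_blocks`, the label encoding (+1 associated, -1 independent, 0 not significant) and the default limit are all assumptions.

```python
def partition_blocks(labels, max_indep=0.1):
    """Split SNPs 0..n-1 into contiguous blocks [start, end).

    labels[i][k] is +1 (associated), -1 (independent) or 0 (not significant).
    Returns (blocks, score), where score counts associated within-block pairs.
    """
    n = len(labels)
    best = [0] * (n + 1)   # best[j]: best score for the first j SNPs
    back = [0] * (n + 1)   # back[j]: start of the last block in that solution
    for j in range(1, n + 1):
        # A single-SNP block has no pairs, so it is always feasible.
        best[j], back[j] = best[j - 1], j - 1
        assoc = indep = 0
        for i in range(j - 2, -1, -1):
            # Grow block [i, j) to the left by adding the pairs (i, k).
            for k in range(i + 1, j):
                if labels[i][k] > 0:
                    assoc += 1
                elif labels[i][k] < 0:
                    indep += 1
            pairs = (j - i) * (j - i - 1) // 2
            if indep <= max_indep * pairs and best[i] + assoc > best[j]:
                best[j], back[j] = best[i] + assoc, i
    blocks, j = [], n
    while j > 0:
        blocks.append((back[j], j))
        j = back[j]
    return blocks[::-1], best[n]
```

With five SNPs in which SNPs 0 to 2 are mutually associated, SNPs 3 and 4 are associated, and every cross pair is independent, the sketch returns the blocks `[(0, 3), (3, 5)]` with a score of 4.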


Figure 1: Robustness of haplotype block partitioning. To assess the significance of a haplotype block partitioning algorithm, assume that the given samples establish a "founder" group apart from the main population. Both algorithms A and B find the same boundaries for haplotype blocks in the "founder" sample (middle). If the population size is kept fixed, no mutation occurs, and crossovers happen only at the boundaries of the blocks, then after many generations all genotypes within the initial blocks stay the same, while the two-locus alleles of SNP pairs spanning different blocks change. As a result, Algorithm A reports different blocks, whereas Algorithm B reports the same block partitioning (right). We call a block partitioning "robust" if the method reports the same block structure for haplotypes many generations after the "founder" haplotypes.

Mentions: A simplified explanation for the existence of blocks is that recombination events in ancestral generations predominantly occurred at block boundaries, and not within blocks. As such, observed block boundaries may be taken as hotspots of recombination. Based on this model, a robust block partitioning algorithm will define the same block boundaries whether applied to data of an ancestral generation or to data of a recent generation. The preservation of boundaries by various block partitioning methods can be checked by comparing the boundaries produced at generation one with the boundaries produced some generations later (Figure 1). For this purpose, 120 HapMap 9q34.11 haplotypes were followed by simulation through ten generations, assuming a crossover probability of 0.5 at the boundaries per generation and a fixed population size. This process was repeated 500 times for each method, and the configuration of blocks obtained in each iteration was recorded to assess the robustness of the block partitioning algorithms.
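The resampling scheme described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: each child copies block segments from two parents drawn at random with replacement, switching parent with probability `p_cross` at each block boundary. The mating scheme, the function names and the seed handling are all hypothetical; only the ten generations, the crossover probability of 0.5 and the fixed population size come from the text.

```python
import random


def next_generation(haps, boundaries, p_cross, rng):
    """Produce len(haps) children; crossovers occur only at block boundaries."""
    n_snps = len(haps[0])
    edges = [0] + list(boundaries) + [n_snps]
    children = []
    for _ in range(len(haps)):          # fixed population size
        a, b = rng.choice(haps), rng.choice(haps)
        src, child = a, []
        for start, end in zip(edges, edges[1:]):
            # At each internal boundary, switch parent with probability p_cross.
            if start > 0 and rng.random() < p_cross:
                src = b if src is a else a
            child.extend(src[start:end])
        children.append(tuple(child))
    return children


def simulate(founders, boundaries, generations=10, p_cross=0.5, seed=0):
    """Follow founder haplotypes through several generations without mutation."""
    rng = random.Random(seed)
    haps = [tuple(h) for h in founders]
    for _ in range(generations):
        haps = next_generation(haps, boundaries, p_cross, rng)
    return haps
```

Because whole block segments are copied intact, every within-block haplotype in the final generation already exists among the founders, while combinations across blocks are free to change. A robust partitioning method would be expected to recover `boundaries` from the output of `simulate`.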

