Inferring haplotypes at the NAT2 locus: the computational approach.

Sabbagh A, Darlu P - BMC Genet. (2005)

Bottom Line: However, molecular haplotyping methods are labour-intensive and expensive and do not appear to be good candidates for routine clinical applications. We empirically evaluated the effectiveness of four haplotyping algorithms in predicting haplotype phases at NAT2, by comparing the results with those directly obtained through molecular haplotyping. This investigation provides a solid basis for the confident and rational use of computational methods which appear to be a good alternative to infer haplotype phases in the particular case of the NAT2 gene, where there is near complete linkage disequilibrium between polymorphic markers.


Affiliation: Unité de Recherche en Génétique Epidémiologique et Structure des Populations Humaines, INSERM U535, Villejuif, France. sabbagh@vjf.inserm.fr

ABSTRACT

Background: Numerous studies have attempted to relate genetic polymorphisms within the N-acetyltransferase 2 gene (NAT2) to interindividual differences in response to drugs or in disease susceptibility. However, genotyping of individual single-nucleotide polymorphisms (SNPs) alone may not always provide enough information to reach these goals. It is important to link SNPs in terms of haplotypes, which carry more information about the genotype-phenotype relationship. Special analytical techniques have been designed to unequivocally determine the allocation of mutations to either DNA strand. However, molecular haplotyping methods are labour-intensive and expensive and do not appear to be good candidates for routine clinical applications. A cheap and relatively straightforward alternative is the use of computational algorithms. The objective of this study was to assess the performance of the computational approach in NAT2 haplotype reconstruction from phase-unknown genotype data, for population samples of various ethnic origins.

Results: We empirically evaluated the effectiveness of four haplotyping algorithms in predicting haplotype phases at NAT2, by comparing the results with those directly obtained through molecular haplotyping. All computational methods provided remarkably accurate and reliable estimates for NAT2 haplotype frequencies and individual haplotype phases. The Bayesian algorithm implemented in the PHASE program performed the best.

Conclusion: This investigation provides a solid basis for the confident and rational use of computational methods which appear to be a good alternative to infer haplotype phases in the particular case of the NAT2 gene, where there is near complete linkage disequilibrium between polymorphic markers.


Figure 3: The change coefficient (C) as a function of haplotype frequency. The change coefficient reflects the discrepancy between haplotype frequencies deduced from phase-known data and those estimated computationally (here with the PHASE program). All haplotypes occurring in any of the five population samples are considered.

Mentions: A comparison of the haplotype frequencies determined molecularly with those estimated computationally showed very high concordance. Both the PL-EM and PHASE methods provided similarity index (IF) values very close to the maximal value of 1 in all investigated data sets (Table 5). Such high values may be explained by the fact that the IF index gives more weight to common haplotypes, whose frequencies are the most accurately estimated by computational algorithms. To investigate the effect of haplotype frequency on estimation accuracy, we plotted the change coefficient (C) against the larger of the two frequencies available for each haplotype (the maximum of the computationally estimated frequency and the phase-known frequency p0i), for all possible haplotypes with nonzero frequency estimates determined by either analysis in any of the five population samples. As shown in Figure 3, substantial percentage changes (>30%) occur only at the lowest haplotype frequencies (<0.007). Even moderate changes (range 10%–30%) occur only when the frequency estimates are <0.03, and any change in percentage value >5% concerns only haplotype frequencies <0.035. Two-thirds (67%) of the haplotype frequency estimates showed either no change or a small change (<3%). In Table 6, we compared the relative estimation accuracy of the PL-EM and PHASE programs by computing the average change coefficient for three classes of haplotype frequencies. The two methods perform similarly for haplotypes with frequencies >0.05, whereas PHASE provides more accurate estimates for rarer haplotypes.
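
The two accuracy measures discussed above can be illustrated with a minimal Python sketch. The passage does not give their formulas, so the definitions used here are assumptions: the similarity index is taken as IF = 1 - 0.5 * sum(|estimated - reference|) over all haplotypes, and the change coefficient as C = |estimated - reference| / max(estimated, reference). The function names, haplotype labels (H1 to H4) and frequencies are hypothetical and are not data from the study.

```python
from typing import Dict, Tuple

Freqs = Dict[str, float]


def similarity_index(p_ref: Freqs, p_est: Freqs) -> float:
    """Assumed definition: IF = 1 - 0.5 * sum_i |p_est_i - p_ref_i|.

    A haplotype missing from one set is treated as having frequency 0 there.
    """
    haplotypes = set(p_ref) | set(p_est)
    total = sum(abs(p_est.get(h, 0.0) - p_ref.get(h, 0.0)) for h in haplotypes)
    return 1.0 - 0.5 * total


def change_coefficients(p_ref: Freqs, p_est: Freqs) -> Dict[str, Tuple[float, float]]:
    """Assumed definition: C_i = |p_est_i - p_ref_i| / max(p_est_i, p_ref_i).

    Returns {haplotype: (larger frequency, C)} for every haplotype with a
    nonzero frequency in at least one of the two sets.
    """
    result = {}
    for h in set(p_ref) | set(p_est):
        ref = p_ref.get(h, 0.0)
        est = p_est.get(h, 0.0)
        larger = max(ref, est)
        if larger > 0.0:
            result[h] = (larger, abs(est - ref) / larger)
    return result


# Invented toy frequencies (NOT taken from the paper).
phase_known = {"H1": 0.50, "H2": 0.30, "H3": 0.19, "H4": 0.01}
estimated = {"H1": 0.50, "H2": 0.31, "H3": 0.19}

print(f"IF = {similarity_index(phase_known, estimated):.3f}")
for hap, (freq, c) in sorted(change_coefficients(phase_known, estimated).items()):
    print(f"{hap}: larger frequency = {freq:.2f}, C = {c:.1%}")
```

In this toy case the rare haplotype H4 is missed entirely by the estimate, so its change coefficient is 100%, yet IF stays at 0.99 because the index is dominated by the common haplotypes. This is the weighting effect the passage gives as the reason for the high IF values.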

