Using DNA pools for genotyping trios.

Beckman KB, Abel KJ, Braun A, Halperin E - Nucleic Acids Res. (2006)

Bottom Line: The genotyping of mother-father-child trios is a very useful tool in disease association studies, as trios eliminate population stratification effects and increase the accuracy of haplotype inference. Unfortunately, the use of trios for association studies may reduce power, since it requires the genotyping of three individuals where only four independent haplotypes are involved. We demonstrate that the error rates in the genotype calls of the proposed protocol are comparable to those of standard genotyping techniques, although the cost is reduced considerably.


Affiliation: Children's Hospital Oakland Research Institute, 5700 Martin Luther King Jr Way, Oakland, CA 94609, USA.

ABSTRACT
The genotyping of mother-father-child trios is a very useful tool in disease association studies, as trios eliminate population stratification effects and increase the accuracy of haplotype inference. Unfortunately, the use of trios for association studies may reduce power, since it requires the genotyping of three individuals where only four independent haplotypes are involved. We describe here a method for genotyping a trio using two DNA pools, thus reducing the cost of genotyping trios to that of genotyping two individuals. Furthermore, we present extensions to the method that exploit the linkage disequilibrium structure to compensate for missing data and genotyping errors. We evaluated our method on trios from CEPH pedigree 66 of the Coriell Institute. We demonstrate that the error rates in the genotype calls of the proposed protocol are comparable to those of standard genotyping techniques, although the cost is reduced considerably. The approach described is generic and it can be applied to any genotyping platform that achieves a reasonable precision of allele frequency estimates from pools of two individuals. Using this approach, future trio-based association studies may be able to increase the sample size by 50% for the same cost and thereby increase the power to detect associations.
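
The abstract does not spell out how the two pools are composed, so the following is only a minimal sketch under an assumed design: one pool holds father plus child and the other holds mother plus child, and the platform is taken to report an exact allele count (0 to 4) for each pool. The names `TRANSMITTABLE`, `mendelian_trios`, `pool_counts` and `build_decoder` are hypothetical. Under these assumptions, enumerating every Mendelian-consistent trio at a single biallelic SNP suggests why two pools can be enough to recover three genotypes.

```python
from itertools import product

# Genotypes are coded as the count of one allele: 0, 1 or 2.
# Alleles (0 or 1 copies) that a parent with a given genotype can transmit.
TRANSMITTABLE = {0: {0}, 1: {0, 1}, 2: {1}}


def mendelian_trios():
    """List every (father, mother, child) genotype triple that obeys Mendelian inheritance."""
    trios = []
    for father, mother in product(range(3), repeat=2):
        children = {a + b for a in TRANSMITTABLE[father] for b in TRANSMITTABLE[mother]}
        for child in sorted(children):
            trios.append((father, mother, child))
    return trios


def pool_counts(trio):
    """Allele counts in the two ASSUMED pools: (father + child, mother + child)."""
    father, mother, child = trio
    return (father + child, mother + child)


def build_decoder():
    """Map each pair of pool allele counts to the trios that could have produced it."""
    decoder = {}
    for trio in mendelian_trios():
        decoder.setdefault(pool_counts(trio), []).append(trio)
    return decoder


decoder = build_decoder()
# Pool readouts that are compatible with more than one trio (empty if decoding is unique).
ambiguous = {counts: trios for counts, trios in decoder.items() if len(trios) > 1}

for counts, trios in sorted(decoder.items()):
    print(counts, "->", trios)
```

In this idealised, noise-free setting each pair of pool counts maps to exactly one trio. Real pooled measurements are allele frequency estimates with uncertainty, which is where the confidence intervals and the linkage disequilibrium extensions mentioned in the abstract come in.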


Figure 3: The error rate of the predicted genotypes using multiplexing pools in trios and the greedy algorithm for missing data estimation. The x-axis is the missing data rate and the y-axis is the resulting genotyping error rate. The pools and the corresponding confidence intervals were simulated on two datasets. The first is a dataset simulated from the coalescent model, taken from the publicly available dataset of (7). The second is chromosome 22 from the 30 Yoruban trios collected by the HapMap consortium (15).

Mentions: We evaluated the performance of Triophase with missing data on the set of 30 trios of the Yoruban population available from the HapMap dataset (HapMap Consortium, 2005). We simulated confidence intervals for the pools resulting from this dataset across chromosome 22. The simulated confidence intervals depend on an ambiguous data rate p (for details see Materials and Methods). We evaluated Triophase for ambiguous data rates p ranging from 0 to 14% in 1% increments. The simulation study shows that the error rate is kept 15-fold lower than the rate of ambiguity (Figure 3). Thus, when random SNPs are genotyped at a density similar to that of the first phase of HapMap (every 5 kb on average), standard rates of missing data can be filled in with high accuracy using Triophase. We note that errors in the inference of missing data decrease the power of the TDT. This, however, is also true when missing data are inferred from data generated by standard genotyping. Evidently, the power lost due to the resulting errors is negligible compared to the power gained by the ability to type 50% more trios.
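
The two quantitative claims in this passage can be made concrete with a little arithmetic. The sketch below is illustrative only: the budget of 600 assays is a made-up number, the function names are hypothetical, and the 15-fold figure is applied as a flat ratio across the whole 0 to 14% range, which is a simplification of what Figure 3 reports rather than a model taken from the paper.

```python
def trios_per_budget(budget_assays, assays_per_trio):
    """Number of complete trios that fit in a fixed number of genotyping assays."""
    return budget_assays // assays_per_trio


def inferred_error_rate(ambiguity_rate, fold_reduction=15.0):
    """Rough error rate if errors stay `fold_reduction` times below the ambiguity rate."""
    return ambiguity_rate / fold_reduction


budget = 600  # hypothetical number of assays available per SNP
standard = trios_per_budget(budget, 3)  # three individual genotypes per trio
pooled = trios_per_budget(budget, 2)    # two pools per trio
gain = pooled / standard - 1

print(f"standard: {standard} trios, pooled: {pooled} trios, gain: {gain:.0%}")

# Ambiguous data rate p from 0 to 14% in 1% steps, as in the simulation described above.
for percent in range(0, 15):
    p = percent / 100
    print(f"p = {p:.2f} -> approximate error rate {inferred_error_rate(p):.4f}")
```

Read this way, even at the highest ambiguity rate considered (14%), the implied error rate stays below 1%, while the number of trios that can be typed for the same cost rises by half.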

