Prioritizing animals for dense genotyping in order to impute missing genotypes of sparsely genotyped animals.

Yu X, Woolliams JA, Meuwissen TH - Genet. Sel. Evol. (2014)

Bottom Line: In a real pig pedigree, the 2500 most recently born pigs of the last generation, i.e. the target animals, were used for sparse genotyping. For all criteria, MCA and MCG performed better than other selection methods, significantly so for all methods other than selection of sires with the largest numbers of offspring. Methods that choose animals that have the closest average relationship or contribution to the target population gave the lowest accuracy of imputation, in some cases worse than random selection, and should be avoided in practice.


Affiliation: Department of Animal and Aquacultural Sciences, Norwegian University of Life Sciences, PO Box 5003, Ås 1432, Norway. xijiang.yu@nmbu.no.

ABSTRACT

Background: Genotyping accounts for a substantial part of the cost of genomic selection (GS). Using both dense and sparse SNP chips, together with imputation of missing genotypes, can reduce these costs. The aim of this study was to identify the set of candidates that are most important for dense genotyping, when they are used to impute the genotypes of sparsely genotyped animals. In a real pig pedigree, the 2500 most recently born pigs of the last generation, i.e. the target animals, were used for sparse genotyping. Their missing genotypes were imputed using either Beagle or LDMIP from T densely genotyped candidates chosen from the whole pedigree. A new optimization method was derived to identify the best animals for dense genotyping, which minimized the conditional genetic variance of the target animals, using either the pedigree-based relationship matrix (MCA) or a genotypic relationship matrix based on sparse marker genotypes (MCG). These and five other methods for selecting the T animals were compared using T = 100 or 200 animals; SNP genotypes were obtained assuming an effective population size of Ne = 100 or 200, and minor allele frequency (MAF) thresholds were set to D = 0.01, 0.05 or 0.10. The performance of the methods was compared using the following criteria: call rate of true genotypes, accuracy of genotype prediction, and accuracy of genomic evaluations using the imputed genotypes.
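
As an illustration of the first two criteria, the following minimal Python sketch scores imputed genotypes against true genotypes. It assumes genotypes are coded as allele counts (0, 1, 2), that the call rate of true genotypes can be read as the proportion of genotypes imputed correctly, and that accuracy of genotype prediction is the Pearson correlation between true and imputed genotypes. These definitions, the function names and the toy genotype vectors are assumptions made for illustration and may differ from the exact definitions used in the paper.

```python
import math


def proportion_correct(true_g, imputed_g):
    """Share of genotypes whose imputed value equals the true value."""
    matches = sum(1 for t, i in zip(true_g, imputed_g) if t == i)
    return matches / len(true_g)


def pearson(x, y):
    """Pearson correlation between two equally long numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical toy data: allele counts for one animal at eight SNPs.
true_genotypes = [0, 1, 2, 1, 0, 2, 1, 1]
imputed_genotypes = [0, 1, 2, 2, 0, 2, 1, 0]

rate = proportion_correct(true_genotypes, imputed_genotypes)
accuracy = pearson(true_genotypes, imputed_genotypes)
```

The two measures can disagree: the proportion of correct genotypes treats every error equally, whereas the correlation penalizes an error of two allele copies more than an error of one.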

Results: For all criteria, MCA and MCG performed better than other selection methods, significantly so for all methods other than selection of sires with the largest numbers of offspring. Methods that choose animals that have the closest average relationship or contribution to the target population gave the lowest accuracy of imputation, in some cases worse than random selection, and should be avoided in practice.

Conclusion: Minimization of the conditional variance of the genotypes in target animals provided an effective optimization procedure for prioritizing animals for genotyping or sequencing.
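
The idea of minimizing the conditional variance of the target animals can be sketched as follows. Under a multivariate normal model with relationship matrix A, the covariance of the targets given a selected set S is A_tt − A_tS A_SS⁻¹ A_St. The sketch below sums the diagonal of that matrix over the targets and picks animals by greedy forward selection. Both the summed-variance criterion and the greedy search are assumptions chosen for brevity, not necessarily the optimization derived in the paper, and the small relationship matrix is a hypothetical example.

```python
def greedy_min_conditional_variance(A, targets, candidates, T):
    """Greedily choose up to T candidates that most reduce the summed
    conditional variance of the target animals.

    A          : square relationship matrix as a list of lists
    targets    : indices of sparsely genotyped target animals
    candidates : indices of animals eligible for dense genotyping
    """
    n = len(A)
    V = [row[:] for row in A]  # covariance conditional on animals chosen so far
    pool = list(candidates)
    chosen = []
    for _ in range(T):
        best, best_gain = None, 0.0
        for j in pool:
            if V[j][j] <= 1e-12:  # already fully explained by chosen animals
                continue
            # Reduction in summed target variance if animal j is added.
            gain = sum(V[t][j] ** 2 for t in targets) / V[j][j]
            if gain > best_gain:
                best, best_gain = j, gain
        if best is None:  # no remaining candidate reduces the variance
            break
        # Schur-complement update: condition every animal on the new one.
        col = [V[i][best] for i in range(n)]
        d = V[best][best]
        for i in range(n):
            for k in range(n):
                V[i][k] -= col[i] * col[k] / d
        chosen.append(best)
        pool.remove(best)
    remaining = sum(V[t][t] for t in targets)
    return chosen, remaining


# Hypothetical pedigree: animals 0, 1, 2 are unrelated founders; animals 3
# and 4 are half-sibs sired by 0; animal 5 is sired by 1; dams are unknown.
A = [
    [1.0, 0.0, 0.0, 0.5, 0.5, 0.0],
    [0.0, 1.0, 0.0, 0.0, 0.0, 0.5],
    [0.0, 0.0, 1.0, 0.0, 0.0, 0.0],
    [0.5, 0.0, 0.0, 1.0, 0.25, 0.0],
    [0.5, 0.0, 0.0, 0.25, 1.0, 0.0],
    [0.0, 0.5, 0.0, 0.0, 0.0, 1.0],
]
chosen, remaining = greedy_min_conditional_variance(
    A, targets=[3, 4, 5], candidates=[0, 1, 2], T=2
)
```

In this toy example the sire with two target offspring is picked first, the sire with one target offspring second, and the unrelated founder is never picked because it explains none of the target variance. Because each step conditions on the animals already chosen, a candidate that only duplicates information from an earlier pick contributes little additional gain.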


Figure 3: Regression of the GEBV accuracy on the correlation between true and imputed genotypes for method RAN with varying D when using Beagle. The three groups of points correspond to, from left to right, 50, 100, and 200 markers per Morgan; the dashed line is the regression line for all the data points and the solid lines are the local regression lines fitted within each level of D.

Mentions: Figure 3 shows all the data on accuracies of imputation and GEBV obtained with RAN. As D increased, the variance of imputation accuracy among replicates decreased faster than the variance of the accuracy of GEBV, and the regression of GEBV accuracy on imputation accuracy decreased, which suggests diminishing returns in GEBV accuracy from increasing imputation accuracy. The variance of GEBV accuracy was larger than the variance of imputation accuracy because GEBV estimation has more sources of error.
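
The two kinds of regression line described for Figure 3, one fitted to all points and one fitted within each level of D, can be sketched with ordinary least squares. The function names and the data layout are assumptions, and the points below are synthetic values generated from known straight lines purely to exercise the code; they are not results from the paper.

```python
def ols_fit(x, y):
    """Return (slope, intercept) of the least-squares line of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def fit_overall_and_by_level(points):
    """points: list of (level, imputation_accuracy, gebv_accuracy).

    Returns the fit over all points and a dict of fits per level.
    """
    overall = ols_fit([p[1] for p in points], [p[2] for p in points])
    by_level = {}
    for level, x, y in points:
        xs, ys = by_level.setdefault(level, ([], []))
        xs.append(x)
        ys.append(y)
    per_level = {lvl: ols_fit(xs, ys) for lvl, (xs, ys) in by_level.items()}
    return overall, per_level


# Synthetic points: within each hypothetical level of D the GEBV accuracy
# is an exact linear function of imputation accuracy with a known slope.
synthetic = []
for level, slope, intercept, xs in [
    (0.01, 0.6, 0.10, [0.80, 0.85, 0.90]),
    (0.10, 0.4, 0.30, [0.90, 0.94, 0.98]),
]:
    for x in xs:
        synthetic.append((level, x, intercept + slope * x))

overall_fit, level_fits = fit_overall_and_by_level(synthetic)
```

Comparing the slopes in `level_fits` across levels of D is one way to check whether the return in GEBV accuracy per unit of imputation accuracy shrinks as D increases.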


