Within- and across-breed genomic prediction using whole-genome sequence and single nucleotide polymorphism panels.

Iheshiulor OO, Woolliams JA, Yu X, Wellmann R, Meuwissen TH - Genet. Sel. Evol. (2016)

Bottom Line: The use of WGS data for within-population predictions resulted in small to large increases in accuracy for low to moderately heritable traits. While MixP outperformed SNP-BLUP at 45 QTL/Morgan, SNP-BLUP was as good as MixP when QTL density increased to 132 QTL/Morgan. Our results show that genomic predictions in numerically small cattle populations would benefit from a combination of WGS data, a multi-breed reference population, and a variable selection method.

Affiliation: Department of Animal and Aquaculture Sciences, Norwegian University of Life Sciences, 1432, Ås, Norway. oscar.iheshiulor@nmbu.no.

ABSTRACT

Background: Currently, genomic prediction in cattle is largely based on panels of about 54k single nucleotide polymorphisms (SNPs). However, with the decreasing costs of and current advances in next-generation sequencing technologies, whole-genome sequence (WGS) data on large numbers of individuals are within reach. Availability of such data provides new opportunities for genomic selection, which need to be explored.

Methods: This simulation study investigated how much predictive ability is gained by using WGS data under scenarios with QTL (quantitative trait loci) densities ranging from 45 to 132 QTL/Morgan and heritabilities ranging from 0.07 to 0.30, compared to different SNP densities, with emphasis on divergent dairy cattle breeds with small populations. The relative performances of best linear unbiased prediction (SNP-BLUP) and of a variable selection method with a mixture of two normal distributions (MixP) were also evaluated. Genomic predictions were based on within-population, across-population, and multi-breed reference populations.
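As a rough illustration of the SNP-BLUP side of this comparison, the sketch below treats SNP-BLUP as ridge regression on centered genotypes, solving (X'X + lambda I) b = X'y, where lambda is the ratio of residual variance to per-SNP effect variance. The function names (`solve`, `snp_blup`), the toy genotypes, and the value of `lam` are assumptions made for illustration only; the abstract does not describe the authors' software or parameterization, and MixP (the two-normal mixture method) is not implemented here.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def snp_blup(genotypes, phenotypes, lam):
    """Estimate SNP effects by ridge regression on column-centered genotypes.

    genotypes: list of rows, one per animal, coded 0/1/2 per SNP (toy data).
    lam: assumed ratio residual variance / SNP-effect variance.
    """
    n = len(phenotypes)
    k = len(genotypes[0])
    col_means = [sum(row[j] for row in genotypes) / n for j in range(k)]
    x = [[row[j] - col_means[j] for j in range(k)] for row in genotypes]
    y_mean = sum(phenotypes) / n
    y = [v - y_mean for v in phenotypes]
    # Left-hand side X'X + lam * I and right-hand side X'y
    lhs = [[sum(x[i][r] * x[i][c] for i in range(n)) + (lam if r == c else 0.0)
            for c in range(k)] for r in range(k)]
    rhs = [sum(x[i][r] * y[i] for i in range(n)) for r in range(k)]
    return solve(lhs, rhs)


# Toy example: two perfectly (negatively) correlated SNPs share the signal.
effects = snp_blup([[0, 2], [1, 1], [2, 0]], [0.0, 1.0, 2.0], lam=2.0)
print(effects)
```

Because every SNP receives the same shrinkage, the signal is spread evenly over correlated markers. A variable selection method such as MixP instead allows a few markers to carry large effects, which is consistent with its reported advantage at the lower QTL density.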

Results: The use of WGS data for within-population predictions resulted in small to large increases in accuracy for low to moderately heritable traits. Depending on heritability of the trait, and on SNP and QTL densities, accuracy increased by up to 31 %. The advantage of WGS data was more pronounced (7 to 92 % increase in accuracy depending on trait heritability, SNP and QTL densities, and time of divergence between populations) with a combined reference population and when using MixP. While MixP outperformed SNP-BLUP at 45 QTL/Morgan, SNP-BLUP was as good as MixP when QTL density increased to 132 QTL/Morgan.

Conclusions: Our results show that genomic predictions in numerically small cattle populations would benefit from a combination of WGS data, a multi-breed reference population, and a variable selection method.

Fig. 2: QTL variance for one of the replicates of populations A and B at 10 and 50 generations of divergence. QTL variance was calculated as 2pqa². QTL that were fixed in both populations were excluded. Pop_A and Pop_B refer to populations A and B at T = 10 or 50 generations of divergence, respectively
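The per-QTL variance in the caption can be written out in a few lines. The sketch below is a minimal illustration, assuming p is the frequency of one allele, q = 1 - p, and a is the allele substitution effect; the function names and the toy frequencies and effects are hypothetical and are not taken from the study.

```python
def qtl_variance(p, a):
    """Additive variance contributed by one biallelic QTL: 2 * p * q * a**2."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("allele frequency must lie in [0, 1]")
    q = 1.0 - p
    return 2.0 * p * q * a ** 2


def segregating_qtl_variances(freqs_a, freqs_b, effects):
    """Per-QTL variances in populations A and B, skipping QTL fixed in both."""
    rows = []
    for p_a, p_b, a in zip(freqs_a, freqs_b, effects):
        fixed_a = p_a in (0.0, 1.0)
        fixed_b = p_b in (0.0, 1.0)
        if fixed_a and fixed_b:
            continue  # excluded, as in the figure
        rows.append((qtl_variance(p_a, a), qtl_variance(p_b, a)))
    return rows


# Toy data: the second QTL is fixed in both populations and is dropped.
pairs = segregating_qtl_variances([0.5, 1.0, 0.0], [0.5, 0.0, 0.2], [1.0, 2.0, 1.0])
for var_a, var_b in pairs:
    print(var_a, var_b)
```

A QTL that is fixed in only one population still appears, with zero variance in that population, which is one way the same QTL can contribute very differently to the two breeds.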

Mentions: Allele frequencies and QTL variances (2pqa²) differed between populations (A and B). As an example, Fig. 1 shows the distribution of allele frequencies for one of the simulated replicates (scenario of 45 QTL/Morgan and T = 10 or 50) for both populations, while Figs. 2 and 3 show the QTL variances. Some SNPs and QTL were fixed in both populations (result not shown). Figure 4 shows the average LD in the WGS dataset, measured as the squared correlation (r²) between adjacent SNPs, and the persistence of LD phase of adjacent SNPs between the two populations at different times of divergence, measured as the correlation between the two populations of the phased LD, r, of marker pairs. LD ranged from ~0.36 to 0.40 at genomic distances of 0 to 50 kb, respectively, and this trend was similar in both populations. At a genomic distance of 100 kb, LD dropped to about 0.25. As expected, LD decreased further with increasing genomic distance between SNPs (Fig. 4). Persistence of r for adjacent SNPs between populations was equal to ~0.85 at 50 kb for the scenario with T = 10 and ~0.65 for T = 50. This implies that LD of very close SNPs was more persistent between the two populations at T = 10 than at T = 50. A gradual decline in r was observed with increasing genomic distance. For data3000 to data1000, r² and r were lower (especially for data1000) as a result of the decreasing SNP density and increasing inter-SNP distance (result not shown). This also affected r² and r results for data200. The average inter-SNP distances were equal to 21, 33, 50, 100 and 496 kb for the WGS data and data3000 to data200, respectively.
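The two LD statistics in this paragraph can be sketched as follows, assuming phased haplotypes coded 0/1 at each SNP. Here `ld_r` returns the signed correlation r for one marker pair (its square is r²), and `persistence_of_phase` is the Pearson correlation of those signed r values across the same marker pairs in two populations. The function names and the toy values are hypothetical; the excerpt does not say how the authors computed these quantities in practice.

```python
from math import sqrt


def ld_r(hap_x, hap_y):
    """Signed LD correlation r between two SNPs from phased 0/1 haplotypes."""
    n = len(hap_x)
    p_x = sum(hap_x) / n
    p_y = sum(hap_y) / n
    p_xy = sum(x * y for x, y in zip(hap_x, hap_y)) / n
    denom = sqrt(p_x * (1 - p_x) * p_y * (1 - p_y))
    if denom == 0:
        raise ValueError("r is undefined when a SNP is fixed")
    return (p_xy - p_x * p_y) / denom


def persistence_of_phase(r_pop_a, r_pop_b):
    """Pearson correlation of signed r across marker pairs in two populations."""
    n = len(r_pop_a)
    mean_a = sum(r_pop_a) / n
    mean_b = sum(r_pop_b) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(r_pop_a, r_pop_b))
    var_a = sum((a - mean_a) ** 2 for a in r_pop_a)
    var_b = sum((b - mean_b) ** 2 for b in r_pop_b)
    return cov / sqrt(var_a * var_b)


# Toy marker pair: alleles always co-occur, so r is 1 and r squared is 1.
r = ld_r([0, 0, 1, 1], [0, 0, 1, 1])
print(r, r ** 2)

# Toy signed r values for three marker pairs in each population.
print(persistence_of_phase([0.8, 0.4, -0.2], [0.6, 0.5, -0.1]))
```

Keeping the sign of r matters for the second statistic: two populations can both show high r² at a marker pair while having opposite linkage phase, in which case marker effects estimated in one population would mislead predictions in the other.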

