Genomic prediction in maize breeding populations with genotyping-by-sequencing.

Crossa J, Beyene Y, Kassa S, Pérez P, Hickey JM, Chen C, de los Campos G, Burgueño J, Windhausen VS, Buckler E, Jannink JL, Lopez Cruz MA, Babu R - G3 (Bethesda) (2013)

Affiliation: International Maize and Wheat Improvement Center (CIMMYT), Apdo. Postal 6-641, 06600, Mexico DF, Mexico.

ABSTRACT
Genotyping-by-sequencing (GBS) technologies have proven capacity for delivering large numbers of marker genotypes with potentially less ascertainment bias than standard single nucleotide polymorphism (SNP) arrays. Therefore, GBS has become an attractive alternative technology for genomic selection. However, the use of GBS data poses important challenges, and the accuracy of genomic prediction using GBS is currently undergoing investigation in several crops, including maize, wheat, and cassava. The main objective of this study was to evaluate various methods for incorporating GBS information and compare them with pedigree models for predicting genetic values of lines from two maize populations evaluated for different traits measured in different environments (experiments 1 and 2). Given that GBS data come with a large percentage of uncalled genotypes, we evaluated methods using nonimputed, imputed, and GBS-inferred haplotypes of different lengths (short or long). GBS and pedigree data were incorporated into statistical models using either the genomic best linear unbiased predictors (GBLUP) or the reproducing kernel Hilbert spaces (RKHS) regressions, and prediction accuracy was quantified using cross-validation methods. 
The following results were found: (1) relative to pedigree-only or marker-only models, combining pedigree and GBS data gave consistent gains in prediction accuracy; (2) in experiment 1, imputed or nonimputed GBS data gave higher predictive ability than inferred haplotypes, and in experiment 2, nonimputed GBS data and information-based imputed short and long haplotypes gave higher predictive ability than the other methods; (3) the prediction accuracy achieved using GBS data in experiment 2 is comparable to that reported by previous authors who analyzed this data set using SNP arrays; and (4) GBLUP and RKHS models combining pedigree with nonimputed and imputed GBS data provided the best prediction correlations for the three traits in experiment 1, whereas in experiment 2 RKHS predicted slightly better than GBLUP in drought-stressed environments and the two models predicted similarly in well-watered environments.
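To make the modeling pipeline in the abstract more concrete, the following is a minimal sketch of how a GBLUP-style linear kernel, an RKHS-style Gaussian kernel, and a cross-validated prediction correlation could be computed from a marker matrix. It is not the authors' implementation. The function names, the 0/1/2 coding, the replacement of uncalled genotypes by the marker mean, the bandwidth `h`, and the fixed shrinkage parameter `lam` (standing in for a variance ratio that a real analysis would estimate) are all assumptions made for illustration.

```python
import numpy as np

def center_markers(M):
    """Center an n x p marker matrix coded 0/1/2; NaN marks an uncalled genotype."""
    p = np.nanmean(M, axis=0) / 2.0        # per-marker allele frequency
    Z = M - 2.0 * p                        # centered genotypes
    Z = np.where(np.isnan(Z), 0.0, Z)      # assumption: uncalled -> marker mean
    return Z, p

def gblup_kernel(M):
    """Linear genomic relationship matrix (GBLUP-style kernel)."""
    Z, p = center_markers(M)
    return Z @ Z.T / (2.0 * np.sum(p * (1.0 - p)))

def rkhs_kernel(M, h=1.0):
    """Gaussian kernel on squared marker distances, scaled by their median."""
    Z, _ = center_markers(M)
    sq = np.sum(Z ** 2, axis=1)
    D = np.maximum(sq[:, None] + sq[None, :] - 2.0 * Z @ Z.T, 0.0)
    return np.exp(-h * D / np.median(D[np.triu_indices_from(D, k=1)]))

def predict(K, y, train, test, lam=1.0):
    """Kernel ridge prediction; lam is a hypothetical fixed shrinkage value."""
    mu = y[train].mean()
    A = K[np.ix_(train, train)] + lam * np.eye(len(train))
    alpha = np.linalg.solve(A, y[train] - mu)
    return mu + K[np.ix_(test, train)] @ alpha

def cv_correlation(K, y, k=5, lam=1.0, seed=0):
    """Correlation between observed values and k-fold cross-validated predictions."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), k)
    pred = np.empty(len(y))
    for test in folds:
        train = np.setdiff1d(np.arange(len(y)), test)
        pred[test] = predict(K, y, train, test, lam)
    return np.corrcoef(pred, y)[0, 1]
```

Because `predict` accepts any kernel matrix, a pedigree relationship matrix, where available, could in principle be passed in the same way or summed with a marker kernel; how the two sources are best weighted is a modeling choice that this sketch does not address.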


Figure B9 (a and b): Box plots of r² showing linkage disequilibrium decay at different marker distances on chromosome 9 in (A) the 504 DH maize lines (experiment 1) and (B) the 296 maize lines (experiment 2).

Mentions: Measures of LD at various genetic distances were computed by data set and chromosome. Plots of r² vs. distance by chromosome and data set are shown in Appendix B (Figures B1 through B10, each with panels a and b). The r² between adjacent markers decreased very quickly in experiment 2, as expected for maize, and the median r² reached very low values at distances of 0.5 Mb or longer. The patterns of LD in experiment 1 were very different: the average r² remained relatively high (∼0.2) even at very long distances, and there was a great deal of variability in r² even at long distances. This occurs because the association of alleles in experiment 1 is largely driven by family linkage, whereas in experiment 2 it is dominated by population LD.
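The LD decay summary described above can be sketched as follows. This is an illustrative reconstruction, not the authors' code. It assumes fully inbred lines coded 0/1 with no missing or monomorphic markers, so that r² between two markers is the squared Pearson correlation of their allele indicators. The function name `ld_decay` and the choice of distance bins are hypothetical.

```python
import numpy as np

def ld_decay(geno, pos_mb, bin_edges_mb):
    """Median pairwise r^2 per physical-distance bin on one chromosome.

    geno:         n_lines x n_markers array, inbred lines coded 0/1
                  (assumption: no missing values, every marker polymorphic)
    pos_mb:       marker positions in Mb, one per column of geno
    bin_edges_mb: increasing bin edges in Mb; bin b covers [edge b, edge b+1)
    """
    pos_mb = np.asarray(pos_mb, dtype=float)
    r2 = np.corrcoef(geno, rowvar=False) ** 2       # marker-by-marker r^2
    i, j = np.triu_indices(geno.shape[1], k=1)      # each marker pair once
    dist = np.abs(pos_mb[j] - pos_mb[i])
    pair_r2 = r2[i, j]
    which = np.digitize(dist, bin_edges_mb) - 1     # bin index for each pair
    medians = np.full(len(bin_edges_mb) - 1, np.nan)
    for b in range(len(medians)):
        in_bin = which == b
        if in_bin.any():
            medians[b] = np.median(pair_r2[in_bin])
    return medians
```

Under these assumptions, a population dominated by family linkage, as described for experiment 1, would be expected to show medians that stay well above zero in the long-distance bins, whereas a population dominated by population LD, as in experiment 2, would show medians that drop toward zero within the first few bins.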

