Genome-wide regression and prediction with the BGLR statistical package.

Pérez P, de los Campos G - Genetics (2014)

Bottom Line: The response can be continuous (censored or not) or categorical (either binary or ordinal). The algorithm is based on a Gibbs sampler with scalar updates, and the implementation takes advantage of efficient compiled C and Fortran routines. In this article we describe the methods implemented in BGLR, present examples of the use of the package, and discuss practical issues emerging in real-data analysis.


Affiliation: Socio Economía Estadística e Informática, Colegio de Postgraduados 56230, México perpdgo@colpos.mx.


Figure 6: Estimated correlations between phenotypes and predictions in testing data sets (for a total of 100 training–testing partitions) by model (pedigree on the horizontal axis and pedigree + markers on the vertical axis) and by environment (E1–E4).
© Copyright Policy - open-access

Mentions: Box 13 gives an example of an evaluation based on 100 TRN–TST partitions. In each partition two models (P, pedigree, and PM, pedigree + markers) were fitted and used to predict yield in the TST data set. This yielded 100 estimates of the prediction correlation for each of the models fitted. These estimates should be regarded as paired samples because both share a common feature: the TRN–TST partition. Several statistics can be computed to compare the two models fitted. A natural approach for testing the hypothesis H0: P and PM have the same prediction accuracy vs. HA: the prediction accuracies of models P and PM are different is to conduct a paired t-test based on the difference of the correlation coefficients. Figure 6 gives the estimated correlations for the pedigree + markers model (PM, vertical axis) vs. the pedigree-only model (P, horizontal axis) by environment. The great majority of the points lie above the 45° line, indicating that in most partitions the PM model had higher prediction accuracy than the P-only model. The paired t-test had P-values <0.001 in all environments, indicating strong evidence against H0 (P and PM have the same prediction accuracy). The code used to generate the plot in Figure 6 and to carry out the t-test is given in Box S7, File S1.
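
The paired comparison described above can be sketched in a few lines. This is not the code from Box S7, File S1 (which is not reproduced here); it is a minimal Python illustration using only the standard library. The function name `paired_t`, its argument names, and the synthetic correlations in the demo are all assumptions made for illustration and are not results from the article. The P-value uses a normal approximation to the t distribution, which is only reasonable when the number of partitions is large (for example, 100).

```python
import math
import random
import statistics


def paired_t(cor_p, cor_pm):
    """Paired t-test on per-partition prediction correlations.

    cor_p  : correlations for the pedigree-only model (P), one per partition
    cor_pm : correlations for the pedigree + markers model (PM), same order
    Both lists must refer to the same TRN-TST partitions, in the same order.
    """
    if len(cor_p) != len(cor_pm) or len(cor_p) < 2:
        raise ValueError("need two equally long lists with at least 2 pairs")

    # Work on within-partition differences, because the samples are paired.
    diffs = [pm - p for p, pm in zip(cor_p, cor_pm)]
    n = len(diffs)
    mean_diff = statistics.fmean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample SD (n - 1 denominator)
    if sd_diff == 0:
        raise ValueError("all differences are identical; t is undefined")

    t_stat = mean_diff / (sd_diff / math.sqrt(n))

    # Share of partitions in which PM beat P (points above the 45-degree line).
    frac_above = sum(d > 0 for d in diffs) / n

    # Two-sided P-value from a normal approximation (assumption: large n).
    p_approx = math.erfc(abs(t_stat) / math.sqrt(2))

    return {
        "n": n,
        "df": n - 1,
        "mean_diff": mean_diff,
        "t": t_stat,
        "frac_above": frac_above,
        "p_approx": p_approx,
    }


if __name__ == "__main__":
    # Synthetic placeholder data, NOT values from the article.
    rng = random.Random(1)
    cor_p = [rng.gauss(0.40, 0.05) for _ in range(100)]
    cor_pm = [p + rng.gauss(0.05, 0.03) for p in cor_p]
    print(paired_t(cor_p, cor_pm))
```

In a real analysis, one such test would be run per environment (E1–E4), and an exact t-distribution P-value from a statistics library would be preferable to the normal approximation used here.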

