External validation and calibration of IVFpredict: a national prospective cohort study of 130,960 in vitro fertilisation cycles.

Smith AD, Tilling K, Lawlor DA, Nelson SM - PLoS ONE (2015)

Bottom Line: IVFpredict had markedly better calibration and higher diagnostic accuracy, with calibration plot intercept of 0.040 (95% CI: 0.017-0.063) and slope of 0.932 (95% CI: 0.839-1.025) compared with 0.080 (95% CI: 0.044-0.117) and 1.419 (95% CI: 1.149-1.690) for the Templeton model. Updating the models to reflect improvements in live birth rates since the models were developed enhanced their performance, but IVFpredict remained superior. External validation in a large population cohort confirms IVFpredict has superior discrimination and calibration for informing patients, clinicians and healthcare policy makers of the probability of live birth following IVF.


Affiliation: Medical Research Council Integrative Epidemiology Unit, the University of Bristol, Bristol, United Kingdom; School of Social and Community Medicine, University of Bristol, Bristol, United Kingdom.

ABSTRACT

Background: Accurately predicting the probability of a live birth after in vitro fertilisation (IVF) is important for patients, healthcare providers and policy makers. Two prediction models (Templeton and IVFpredict) have been previously developed from UK data and are widely used internationally. The more recent of these, IVFpredict, was shown to have greater predictive power in the development dataset. The aim of this study was external validation of the two models and comparison of their predictive ability.

Methods and findings: 130,960 IVF cycles undertaken in the UK in 2008-2010 were used to validate and compare the Templeton and IVFpredict models. Discriminatory power was calculated using the area under the receiver-operator curve, and calibration was assessed using a calibration plot and the Hosmer-Lemeshow statistic. The scaled modified Brier score, with measures of reliability and resolution, was calculated to assess overall accuracy. Both models were compared after updating for current live birth rates to ensure that the average observed and predicted live birth rates were equal. The discriminative power of both models was comparable: the area under the receiver-operator curve was 0.628 (95% confidence interval (CI): 0.625-0.631) for IVFpredict and 0.616 (95% CI: 0.613-0.620) for the Templeton model. IVFpredict had markedly better calibration and higher diagnostic accuracy, with calibration plot intercept of 0.040 (95% CI: 0.017-0.063) and slope of 0.932 (95% CI: 0.839-1.025) compared with 0.080 (95% CI: 0.044-0.117) and 1.419 (95% CI: 1.149-1.690) for the Templeton model. Both models underestimated the live birth rate, but this was particularly marked in the Templeton model. Updating the models to reflect improvements in live birth rates since the models were developed enhanced their performance, but IVFpredict remained superior.

Conclusion: External validation in a large population cohort confirms IVFpredict has superior discrimination and calibration for informing patients, clinicians and healthcare policy makers of the probability of live birth following IVF.
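
The updating step mentioned in the Methods (making the average predicted live birth rate equal the average observed rate) can be illustrated with a small sketch. This is not the authors' code: it assumes the update is a constant shift on the logit scale, found here by bisection, and it uses the plain Brier score rather than the scaled modified version reported in the study. The toy predictions and outcomes are invented for illustration, and the function names are hypothetical.

```python
import math


def mean(values):
    return sum(values) / len(values)


def logit(p):
    return math.log(p / (1.0 - p))


def expit(z):
    return 1.0 / (1.0 + math.exp(-z))


def shifted(preds, delta):
    """Add a constant to every prediction on the logit scale."""
    return [expit(logit(p) + delta) for p in preds]


def intercept_shift(preds, target_rate, lo=-5.0, hi=5.0, tol=1e-10):
    """Bisection for the shift that makes the mean prediction equal target_rate.

    Assumes the required shift lies between lo and hi.
    """
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if mean(shifted(preds, mid)) < target_rate:
            lo = mid  # predictions still too low on average
        else:
            hi = mid
    return (lo + hi) / 2.0


def brier(preds, outcomes):
    """Plain Brier score: mean squared difference between prediction and outcome."""
    return mean([(p - y) ** 2 for p, y in zip(preds, outcomes)])


# Invented toy data: a model that under-predicts on average.
preds = [0.10, 0.15, 0.20, 0.25, 0.30, 0.35]
outcomes = [0, 0, 1, 0, 1, 1]

observed_rate = mean(outcomes)
delta = intercept_shift(preds, observed_rate)
updated = shifted(preds, delta)

print("shift on logit scale:", round(delta, 3))
print("Brier before:", round(brier(preds, outcomes), 4))
print("Brier after: ", round(brier(updated, outcomes), 4))
```

A shift of this kind changes only the overall level of the predictions, not their ranking, so it leaves the area under the receiver-operator curve unchanged while improving calibration-in-the-large.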


pone.0121357.g002: Calibration plot for the IVFpredict and Templeton models. Based on 130,960 IVF cycles. Hosmer-Lemeshow test statistics: p<0.001. Solid line, IVFpredict model; dashed line, Templeton model; dotted, diagonal line, perfect prediction (reference).

Mentions: Fig. 2 shows calibration plots for the IVFpredict and Templeton models, plotting observed against expected live birth rates per decile of the linear predictor of each model. Perfect calibration is depicted by the reference line in Fig. 2, which has an intercept of 0 and a slope of 1. In contrast, the IVFpredict calibration plot had an intercept of 0.040 (95% CI: 0.017–0.063) and slope of 0.932 (95% CI: 0.839–1.025), and the Templeton model calibration plot had an intercept of 0.080 (95% CI: 0.044–0.117) and slope of 1.419 (95% CI: 1.149–1.690). Both models underestimate the live birth rate: in Fig. 2 the calibration curves lie above the reference line, indicating that observed live birth rates were above those predicted. This is particularly marked in the Templeton model. The actual differences between observed and predicted live birth rates are given in S1 Table.
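
As a rough illustration of how a decile-based calibration plot yields an intercept and slope, the sketch below groups predictions into tenths, computes the mean predicted and observed rate in each, and fits a straight line through those points by ordinary least squares. This is one plausible reading of the description above, not the study's actual procedure; the data are synthetic (generated with a fixed seed, with the true rate set 0.04 above the prediction) and the function names are hypothetical.

```python
import random


def decile_calibration(predicted, observed):
    """Return (mean predicted, observed rate) for each tenth of the sorted predictions."""
    pairs = sorted(zip(predicted, observed))
    n = len(pairs)
    points = []
    for k in range(10):
        chunk = pairs[k * n // 10:(k + 1) * n // 10]
        if not chunk:
            continue  # fewer than 10 observations
        mean_pred = sum(p for p, _ in chunk) / len(chunk)
        obs_rate = sum(y for _, y in chunk) / len(chunk)
        points.append((mean_pred, obs_rate))
    return points


def fit_line(points):
    """Least-squares line of observed rate on mean predicted: (intercept, slope)."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope


# Synthetic cohort: the "model" under-predicts by 0.04 everywhere.
rng = random.Random(0)
predicted = [rng.uniform(0.05, 0.45) for _ in range(20000)]
observed = [1 if rng.random() < p + 0.04 else 0 for p in predicted]

points = decile_calibration(predicted, observed)
intercept, slope = fit_line(points)
print("intercept:", round(intercept, 3), "slope:", round(slope, 3))
```

With under-prediction of this kind, the fitted line sits above the diagonal reference line, which is the pattern described for both models in Fig. 2.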

