Statistical methods to correct for verification bias in diagnostic studies are inadequate when there are few false negatives: a simulation study.

Cronin AM, Vickers AJ - BMC Med Res Methodol (2008)

Bottom Line: A common feature of diagnostic research is that results for a diagnostic gold standard are available primarily for patients who are positive for the test under investigation. The 2.5th - 97.5th centile range constituted as much as 60% of the possible range of AUCs for some simulations. Screening programs are designed such that there are few false negatives.


Affiliation: Department of Epidemiology and Biostatistics, Memorial Sloan-Kettering Cancer Center, NY, NY 10021, USA. serioa@mskcc.org

ABSTRACT

Background: A common feature of diagnostic research is that results for a diagnostic gold standard are available primarily for patients who are positive for the test under investigation. Data from such studies are subject to what has been termed "verification bias". We evaluated statistical methods for verification bias correction when there are few false negatives.

Methods: A simulation study was conducted of a screening study subject to verification bias. We compared estimates of the area-under-the-curve (AUC) corrected for verification bias varying both the rate and mechanism of verification.

Results: In a single simulated data set, varying false negatives from 0 to 4 led to verification bias corrected AUCs ranging from 0.550 to 0.852. Excess variation associated with low numbers of false negatives was confirmed in simulation studies and by analyses of published studies that incorporated verification bias correction. The 2.5th - 97.5th centile range constituted as much as 60% of the possible range of AUCs for some simulations.
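The instability described here can be illustrated with a Begg–Greenes-style correction, which reweights disease rates among verified patients by the full size of each test-result group (assuming verification depends only on the test result). The counts below mirror the hypothetical screening example in this article, not the authors' simulation; the point is only that when verified negatives are few, changing the number of diseased patients among them by a handful swings the corrected estimate dramatically.

```python
# Hypothetical illustration: with few verified false negatives, a
# bias-corrected estimate is highly sensitive to that count.
# Counts are taken from the article's hypothetical screening example;
# this is a sketch, not the authors' simulation code.

def corrected_sensitivity(n_pos, v_pos, d_pos, n_neg, v_neg, d_neg):
    # Begg-Greenes-style reweighting: assumes verification (biopsy)
    # depends only on the test result, not on true disease status.
    tp = n_pos * d_pos / v_pos  # estimated true positives in full cohort
    fn = n_neg * d_neg / v_neg  # estimated false negatives in full cohort
    return tp / (tp + fn)

# Vary diseased patients among the 40 verified test-negatives from 0 to 4.
for fn_count in range(5):
    s = corrected_sensitivity(100, 75, 50, 400, 40, fn_count)
    print(f"{fn_count} false negatives -> corrected sensitivity {s:.3f}")
```

Here the corrected sensitivity falls from 1.000 (zero false negatives) to 0.625 (four false negatives), echoing the wide range of corrected AUCs the authors report.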

Conclusion: Screening programs are designed such that there are few false negatives. Standard statistical methods for verification bias correction are inadequate in this circumstance.

Figure 1: Example of data subject to verification bias.

Mentions: Take the case of a study of cancer screening, where the aim is to determine the relationship between results of the screening test and true disease status. Patients are screened using an imaging technology (the diagnostic test), and those with abnormal findings are recommended for biopsy (the gold standard assessment). A hypothetical example from such a screening program is shown in Figure 1. A total of 500 patients are screened, and 100 have abnormal findings. Since those with abnormal findings are strongly recommended to undergo biopsy, 75 of the 100 decide to have one, and 50 of these 75 are confirmed to have disease. Of the 400 patients with normal findings, 40 are nonetheless biopsied, and 5 are found to have disease.
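A standard verification bias correction of the kind the paper evaluates is the Begg–Greenes method, which estimates the disease probability within each test-result group from the verified patients and reweights by the full group sizes. The sketch below applies it to the Figure 1 counts; it assumes verification depends only on the test result (missing at random) and is an illustrative implementation, not the authors' code.

```python
# Begg-Greenes verification bias correction applied to the Figure 1 counts.
# Assumes verification (biopsy) depends only on the test result; an
# illustrative sketch, not the authors' implementation.

def begg_greenes(n_pos, v_pos, d_pos, n_neg, v_neg, d_neg):
    """Corrected sensitivity and specificity from partially verified data.

    n_pos/n_neg: screened patients with abnormal/normal findings
    v_pos/v_neg: number verified by biopsy in each group
    d_pos/d_neg: number of verified patients found to have disease
    """
    p_d_pos = d_pos / v_pos  # P(disease | abnormal finding), from verified
    p_d_neg = d_neg / v_neg  # P(disease | normal finding), from verified
    # Reweight by the full group sizes to undo selective verification.
    tp = n_pos * p_d_pos
    fn = n_neg * p_d_neg
    fp = n_pos * (1 - p_d_pos)
    tn = n_neg * (1 - p_d_neg)
    return tp / (tp + fn), tn / (tn + fp)

sens, spec = begg_greenes(n_pos=100, v_pos=75, d_pos=50,
                          n_neg=400, v_neg=40, d_neg=5)
print(f"corrected sensitivity = {sens:.3f}")  # 0.571
print(f"corrected specificity = {spec:.3f}")  # 0.913
```

By contrast, the naive estimates using only the 115 verified patients would be sensitivity 50/55 = 0.909 and specificity 35/60 = 0.583, showing how strongly selective verification distorts the uncorrected figures. Note that the corrected false-negative estimate rests on just 5 diseased patients among 40 verified negatives, which is exactly the small-count instability the paper investigates.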

