Threshold-free measures for assessing the performance of medical screening tests.

Yuan Y, Su W, Zhu M - Front Public Health (2015)

Bottom Line: These measures are compared using two screening test examples under rare and common disease prevalence rates. The AP is an attractive alternative to the AUC for the evaluation and comparison of medical screening tests. It could improve the effectiveness of screening programs during the planning stage.


Affiliation: School of Public Health, University of Alberta, Edmonton, AB, Canada.

ABSTRACT

Background: The area under the receiver operating characteristic curve (AUC) is frequently used as a performance measure for medical tests. It is a threshold-free measure that is independent of the disease prevalence rate. We evaluate the utility of the AUC against an alternative measure, the average positive predictive value (AP), in the setting of the many medical screening programs in which the disease has a low prevalence rate.

Methods: We define the two measures using a common notation system and show that both measures can be expressed as a weighted average of the density function of the diseased subjects. The weights for the AP include prevalence in some form, but those for the AUC do not. These measures are compared using two screening test examples under rare and common disease prevalence rates.
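As a rough sketch of what such a common notation could look like (the symbols below are assumptions chosen for illustration and are not necessarily those used in the article), suppose a higher test score X indicates disease, f_D is the score density among diseased subjects, F_N is the score distribution function among non-diseased subjects, S_D and S_N are the corresponding survival functions, and \pi is the prevalence. For continuous scores, both measures then average a weight over f_D:

```latex
\begin{align*}
\mathrm{AUC} &= \Pr(X_D > X_N) = \int F_N(x)\, f_D(x)\, dx, \\
\mathrm{PPV}(x) &= \frac{\pi\, S_D(x)}{\pi\, S_D(x) + (1-\pi)\, S_N(x)}, \\
\mathrm{AP} &= \int \mathrm{PPV}(x)\, f_D(x)\, dx .
\end{align*}
```

In this form the AUC weight F_N(x) contains no \pi, while the AP weight PPV(x) depends on \pi, which is consistent with the statement that only the AP weights involve prevalence.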

Results: The AP measures the predictive power of a test, which varies when the prevalence rate changes, unlike the AUC, which is prevalence independent. The relationship between the AP and the prevalence rate depends on the underlying screening/diagnostic test. Therefore, the AP provides relevant information to clinical researchers and regulators about how a test is likely to perform in a screening population.
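The prevalence dependence described above can be illustrated numerically. The following is a minimal sketch, not the authors' code: it assumes a hypothetical test whose scores are N(0, 1) in non-diseased subjects and N(1.5, 1) in diseased subjects, and evaluates both measures by trapezoidal integration over the diseased-score density. The function names and the value 1.5 are illustrative assumptions.

```python
import math

def norm_pdf(x, mu=0.0, sigma=1.0):
    """Normal density."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def norm_sf(x, mu=0.0, sigma=1.0):
    """Survival function P(X > x); erfc keeps precision in the upper tail."""
    return 0.5 * math.erfc((x - mu) / (sigma * math.sqrt(2.0)))

def auc_and_ap(prevalence, mu_d=1.5, lo=-8.0, hi=10.0, n_grid=20001):
    """AUC and AP for an assumed binormal test, by trapezoidal integration.

    Non-diseased scores are N(0, 1); diseased scores are N(mu_d, 1).
    Both measures are averages over the diseased-score density f_d.
    """
    step = (hi - lo) / (n_grid - 1)
    auc = 0.0
    ap = 0.0
    for i in range(n_grid):
        x = lo + i * step
        weight = 0.5 if i in (0, n_grid - 1) else 1.0  # trapezoid end points
        f_d = norm_pdf(x, mu_d)
        tpr = norm_sf(x, mu_d)   # P(score > x | diseased)
        fpr = norm_sf(x)         # P(score > x | non-diseased)
        denom = prevalence * tpr + (1.0 - prevalence) * fpr
        ppv = prevalence * tpr / denom if denom > 0.0 else 1.0
        auc += weight * (1.0 - fpr) * f_d * step  # weight has no prevalence
        ap += weight * ppv * f_d * step           # weight depends on prevalence
    return auc, ap

for prevalence in (0.01, 0.10, 0.50):
    auc, ap = auc_and_ap(prevalence)
    print(f"prevalence={prevalence:.2f}  AUC={auc:.3f}  AP={ap:.3f}")
```

Under these assumptions the printed AUC is the same on every row, while the AP rises as the prevalence increases.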

Conclusion: The AP is an attractive alternative to the AUC for the evaluation and comparison of medical screening tests. It could improve the effectiveness of screening programs during the planning stage.


Figure 2: Prostate cancer example. Histograms for biomarkers that are ranked differently by the AP and by the AUC. Red and yellow histograms represent cases and controls, respectively. Pair (A) (8355.562, 7819.751) scored similarly on the AUC-scale but very differently on the AP-scale. Pair (B) (9149.121, 5074.164) scored somewhat similarly on the AP-scale but very differently on the AUC-scale.

Mentions: To explore the implications, we selected two pairs of biomarkers: pair A (8355.562 and 7819.751), which had very similar AUC scores but very different AP scores; and pair B (9149.121 and 5074.164), which scored similarly on the AP-scale but very differently on the AUC-scale. Figure 2 displays the histograms of the raw data for the two selected pairs. Figure 3 compares the resulting ROC curves of the two pairs. Figure 3 shows clearly that the two biomarkers in pair A have qualitatively different ROC curves, yet their AUC values are very similar. For the two biomarkers in pair B, 5074.164 visibly has a larger area under its ROC curve (i.e., a larger AUC), yet the two AP values are similar.
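A pair-A-like situation can be reproduced with synthetic markers. The sketch below does not use the prostate cancer biomarker data; it assumes two hypothetical markers whose non-diseased scores are N(0, 1), with diseased scores N(1, 1) for marker "a" and N(sqrt(2.5), 4) for marker "b". These parameters are chosen only so that the two AUC values coincide, and the prevalence of 0.01 is likewise an assumption.

```python
import math

def norm_pdf(x, mu, sigma):
    """Normal density."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def norm_sf(x, mu, sigma):
    """Survival function P(X > x), computed with erfc for tail accuracy."""
    return 0.5 * math.erfc((x - mu) / (sigma * math.sqrt(2.0)))

def auc_and_ap(mu_d, sigma_d, prevalence, lo=-20.0, hi=25.0, n_grid=40001):
    """AUC and AP by trapezoidal integration over the diseased-score density.

    Non-diseased scores are assumed N(0, 1); diseased scores N(mu_d, sigma_d^2).
    """
    step = (hi - lo) / (n_grid - 1)
    auc = 0.0
    ap = 0.0
    for i in range(n_grid):
        x = lo + i * step
        weight = 0.5 if i in (0, n_grid - 1) else 1.0  # trapezoid end points
        f_d = norm_pdf(x, mu_d, sigma_d)
        tpr = norm_sf(x, mu_d, sigma_d)
        fpr = norm_sf(x, 0.0, 1.0)
        denom = prevalence * tpr + (1.0 - prevalence) * fpr
        ppv = prevalence * tpr / denom if denom > 0.0 else 1.0
        auc += weight * (1.0 - fpr) * f_d * step
        ap += weight * ppv * f_d * step
    return auc, ap

# Hypothetical markers: same AUC by construction, different score spread.
markers = {
    "a": (1.0, 1.0),             # diseased scores N(1, 1)
    "b": (math.sqrt(2.5), 2.0),  # diseased scores N(sqrt(2.5), 2^2)
}

for name, (mu_d, sigma_d) in markers.items():
    auc, ap = auc_and_ap(mu_d, sigma_d, prevalence=0.01)
    print(f"marker {name}: AUC={auc:.3f}  AP={ap:.3f}")
```

In this synthetic setting marker "b" has a heavier upper tail of diseased scores, so its high thresholds select mostly true cases and its AP is higher, even though the AUC cannot separate the two markers.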

