Design of early validation trials of biomarkers.

Normolle D, Ruffin MT, Brenner D - Cancer Inform (2005)

Bottom Line: Early-phase studies must be designed as part of a development program, considering the final use of the marker and directly informing the decision to be made at the study's conclusion. Therefore, they should test for the sensitivity and specificity that would be minimally acceptable to proceed to the next stage of development. Receiver operating characteristic (ROC) curves, while useful descriptive tools, may be misleading when evaluating tests in low-prevalence populations, because they emphasize the relationship between specificity and sensitivity in a range of specificity likely to be too low to be useful in mass screening applications.


Affiliation: Department of Radiation Oncology, University of Michigan Medical School, University of Michigan Comprehensive Cancer Center Biostatistics Unit, USA. monk@umich.edu

ABSTRACT
The design of early-phase studies of putative screening markers in clinical populations is discussed. Biological, epidemiological, statistical and computational issues all affect the design of early-phase studies of these markers, but there are frequently few or no data in hand to facilitate the design. Early-phase studies must be designed as part of a development program, considering the final use of the marker and directly informing the decision to be made at the study's conclusion. Therefore, they should test for the sensitivity and specificity that would be minimally acceptable to proceed to the next stage of development. Designing such trials requires explicit assumptions about prevalence and about false positive and false negative costs in the ultimate target population. Early discussion of these issues strengthens the development process, since enthusiasm for developing technologies is balanced by realism about the requirements of a valid population screen. Receiver operating characteristic (ROC) curves, while useful descriptive tools, may be misleading when evaluating tests in low-prevalence populations, because they emphasize the relationship between specificity and sensitivity in a range of specificity likely to be too low to be useful in mass screening applications.
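
The prevalence point is easy to make concrete. The sketch below (Python; the function name and the example operating points are illustrative assumptions, not values from the paper) computes positive predictive value from sensitivity, specificity, and prevalence via Bayes' theorem, and shows why a test that looks strong on an ROC curve can still be of little use at 1% prevalence:

```python
def ppv(sensitivity: float, specificity: float, prevalence: float) -> float:
    """Positive predictive value via Bayes' theorem:
    P(disease | positive) = sens*prev / (sens*prev + (1 - spec)*(1 - prev))."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)

# 90% sensitivity and 90% specificity look strong on an ROC curve, but at
# 1% prevalence fewer than 1 in 12 positive results are true cases:
print(ppv(0.90, 0.90, 0.01))   # ~0.083
print(ppv(0.90, 0.99, 0.01))   # ~0.476 -- specificity dominates at low prevalence
```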



Figure 3 (f3-cin-01-25). Monte Carlo simulation (10,000 replications) demonstrating the predictive accuracy of a statistical classifier based on 30 cases and 30 controls. The cross-validation estimator pools all 60 observations and recalculates the classifier leaving out, in turn, the observation to be classified. The independent test set classifier splits the cases and controls in half, using the first half for training and the second half for testing. The vertical bars connect the first and third quartiles of the observed accuracies, while the profiles connect the medians. The horizontal line is defined as in Figure 2.

Mentions: From the experiment described in Figure 2, we observe that, even when the variables to be used in the classifier are fixed in advance, the re-substitution estimator substantially over-estimates accuracy; that the over-estimation grows as the number of parameters estimated increases relative to the number of training samples; and that the cross-validation estimator achieves nearly the same accuracy as the independent test set estimator. Given a fixed number of cases and controls, should an independent test set be held out, or should the classifier be cross-validated? Figure 3 answers this question when the classifier is a linear function of the data (e.g. an LDF or logistic regression) and the variables to be used in classification are known in advance. Here, sixty observations are classified two ways: by dividing them into equal-sized training and testing sets, and by using all sixty observations in a cross-validated classifier. Because the cross-validated classifier has almost twice as many training observations available as the independent test set classifier, it is more accurate and displays less variation in accuracy. This example demonstrates that, when the variables used for prediction are known, cross-validation, where the method of classification permits it, uses the observations more efficiently than holding some aside from coefficient estimation as a test set.
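
A minimal sketch of this comparison, assuming Gaussian marker data and scikit-learn's logistic regression (the effect size, number of variables, and replication count below are illustrative assumptions, not the paper's simulation settings):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import LeaveOneOut, cross_val_score

rng = np.random.default_rng(0)
n, p, effect = 30, 5, 0.8   # per-group size, variables, mean shift (illustrative)
reps = 200                  # far fewer than the paper's 10,000 replications

resub, loocv, holdout = [], [], []
for _ in range(reps):
    # Gaussian markers; the n cases are shifted by `effect` on every variable.
    X = rng.standard_normal((2 * n, p))
    X[:n] += effect
    y = np.repeat([1, 0], n)

    clf = LogisticRegression()
    # Re-substitution: train and test on the same 60 observations.
    resub.append(clf.fit(X, y).score(X, y))
    # Leave-one-out cross-validation pooling all 60 observations.
    loocv.append(cross_val_score(clf, X, y, cv=LeaveOneOut()).mean())
    # Independent test set: first half of each group trains, second half tests.
    train = np.r_[0:n // 2, n:n + n // 2]
    test = np.r_[n // 2:n, n + n // 2:2 * n]
    holdout.append(clf.fit(X[train], y[train]).score(X[test], y[test]))

print("median accuracy -- resubstitution: %.3f  leave-one-out: %.3f  "
      "split-half test set: %.3f"
      % (np.median(resub), np.median(loocv), np.median(holdout)))
```

Under these assumed settings one should see the re-substitution median exceed the other two, while the leave-one-out estimate, trained on 59 of 60 observations in each fold, tracks the accuracy of the full-sample classifier more closely than the split-half estimate does, consistent with Figure 3.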

