A method to predict breast cancer stage using Medicare claims.

Smith GL, Shih YC, Giordano SH, Smith BD, Buchholz TA - Epidemiol Perspect Innov (2010)

Bottom Line: The equation predicting stage IV disease achieved sensitivity of 81%, specificity 89%, positive predictive value (PPV) 24%, and negative predictive value (NPV) 99%, while the equation distinguishing stage I/II from stage III disease achieved sensitivity 83%, specificity 78%, PPV 98%, and NPV 31%. Combined, the equations most accurately identified early stage disease and ascertained a sample in which 98% of patients were stage I or II. A claims-based algorithm was utilized to predict breast cancer stage, and was particularly successful when used to identify early stage disease.


Affiliation: Department of Radiation Oncology, The University of Texas MD Anderson Cancer Center, 1515 Holcombe Blvd Houston, Texas 77030, USA.

ABSTRACT

Background: In epidemiologic studies, cancer stage is an important predictor of outcomes. However, cancer stage is typically unavailable in medical insurance claims datasets, thus limiting the usefulness of such data for epidemiologic studies. Therefore, we sought to develop an algorithm to predict cancer stage based on covariates available from claims-based data.

Methods: We identified a cohort of 77,306 women age ≥ 66 years with stage I-IV breast cancer, using the Surveillance, Epidemiology, and End Results (SEER)-Medicare database. We formulated an algorithm to predict cancer stage using covariates (demographic, tumor, and treatment characteristics) obtained from claims. Logistic regression models derived prediction equations in a training set, and the equations' test characteristics (sensitivity, specificity, positive predictive value [PPV], and negative predictive value [NPV]) were calculated in a validation set.
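
The workflow described in the Methods (score each patient with a logistic prediction equation, classify by a cutoff, then compute test characteristics in a validation set) can be sketched roughly as follows. This is a minimal illustration, not the published equation: the intercept, the coefficients, the covariate names `chemotherapy_claim` and `no_surgery_claim`, the 0.5 cutoff, and the toy validation rows are all hypothetical.

```python
import math

# Hypothetical values for illustration only; NOT the published coefficients.
INTERCEPT = -3.0
COEFFICIENTS = {"chemotherapy_claim": 1.2, "no_surgery_claim": 2.0}


def predicted_probability(covariates):
    """Logistic model: p = 1 / (1 + exp(-(b0 + sum(b_i * x_i))))."""
    linear = INTERCEPT + sum(
        beta * covariates.get(name, 0) for name, beta in COEFFICIENTS.items()
    )
    return 1.0 / (1.0 + math.exp(-linear))


def classify(covariates, cutoff=0.5):
    """Predict the outcome when the probability reaches the (assumed) cutoff."""
    return predicted_probability(covariates) >= cutoff


def diagnostic_metrics(actual, predicted):
    """Sensitivity, specificity, PPV, and NPV from paired boolean labels."""
    tp = fp = tn = fn = 0
    for a, p in zip(actual, predicted):
        if a and p:
            tp += 1
        elif a and not p:
            fn += 1
        elif not a and p:
            fp += 1
        else:
            tn += 1

    def ratio(numerator, denominator):
        return numerator / denominator if denominator else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
    }


# Toy validation set: (claims covariates, gold standard label). Made-up rows.
validation = [
    ({"chemotherapy_claim": 1, "no_surgery_claim": 1}, True),
    ({"no_surgery_claim": 1}, True),
    ({"chemotherapy_claim": 1}, False),
    ({}, False),
    ({"chemotherapy_claim": 1, "no_surgery_claim": 1}, False),
]
actual = [label for _, label in validation]
predicted = [classify(row) for row, _ in validation]
metrics = diagnostic_metrics(actual, predicted)
print(metrics)
```

In practice the coefficients would come from fitting the model on the training set, and the cutoff would be chosen from the ROC curve rather than fixed at 0.5.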

Results: Of the entire sample of women diagnosed with invasive breast cancer, 51% had stage I; 26% stage II; 11% stage III; and 4% stage IV disease. The equation predicting stage IV disease achieved sensitivity of 81%, specificity 89%, positive predictive value (PPV) 24%, and negative predictive value (NPV) 99%, while the equation distinguishing stage I/II from stage III disease achieved sensitivity 83%, specificity 78%, PPV 98%, and NPV 31%. Combined, the equations most accurately identified early stage disease and ascertained a sample in which 98% of patients were stage I or II.
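
The low PPV alongside a high NPV for the stage IV equation is what one would expect for a rare outcome. As a rough check, the sketch below derives PPV and NPV from the rounded sensitivity and specificity reported above, assuming the validation-set prevalence of stage IV disease equals the 4% whole-sample figure (that assumption, and the function name `predictive_values`, are not from the source).

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) via Bayes' rule for a given outcome prevalence."""
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    false_pos = (1 - specificity) * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Reported (rounded) values for the stage IV equation; prevalence is assumed.
ppv, npv = predictive_values(sensitivity=0.81, specificity=0.89, prevalence=0.04)
print(f"PPV {ppv:.1%}, NPV {npv:.1%}")  # PPV 23.5%, NPV 99.1%
```

Because the inputs are rounded and the prevalence is assumed, the result agrees with the reported 24% and 99% only approximately.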

Conclusions: A claims-based algorithm was utilized to predict breast cancer stage, and was particularly successful when used to identify early stage disease. These prediction equations may be applied in future studies of breast cancer patients, substantially improving the utility of claims-based studies in this group. This method may similarly be employed to develop algorithms permitting claims-based epidemiologic studies of patients with other cancers.


Figure 3: Receiver Operating Curve (ROC) for equation to predict stage I-III disease.

Mentions: The prediction equations were most accurate for isolating patients with early stage disease. Specifically, after applying the two prediction equations sequentially to the validation sample to identify patients with predicted stage I/II disease, a subset of 23,285 patients was selected (of 38,653 patients, 36,417 were predicted to have non-stage IV disease, and of these patients, 23,285 were predicted to have stage I/II disease). This predicted sample actually consisted of 98% gold standard stage I/II disease (22,706 of 23,285), 2% stage III disease (549 of 23,285), and <1% stage IV disease (110 of 23,285). Of all patients with gold standard stage I/II disease (29,546 of 38,653 validation patients), 23% (6,840 of 29,546) were excluded (classified as other than stage I/II) as a result of the algorithm (4,604 from the first model and 2,236 from the second model) (Figure 2, Figure 3).
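
The percentages quoted for the sequential application can be re-derived from the counts given in the text. The short sketch below does only that arithmetic; the variable names are illustrative, and no counts beyond those stated above are used.

```python
# Counts as reported in the text for the validation sample.
predicted_stage_i_ii = 23_285        # selected after both equations
true_i_ii_in_selected = 22_706       # gold standard stage I/II among those selected
true_i_ii_total = 29_546             # gold standard stage I/II in the validation set
excluded_by_first_model = 4_604
excluded_by_second_model = 2_236

# Share of the selected sample that is truly stage I/II.
purity = true_i_ii_in_selected / predicted_stage_i_ii

# True stage I/II patients lost to the two-step selection.
excluded = true_i_ii_total - true_i_ii_in_selected
excluded_fraction = excluded / true_i_ii_total

print(f"Stage I/II share of selected sample: {purity:.0%}")      # 98%
print(f"True stage I/II patients excluded: {excluded:,}")        # 6,840
print(f"Excluded fraction: {excluded_fraction:.0%}")             # 23%
print(excluded == excluded_by_first_model + excluded_by_second_model)  # True
```

The trade-off is visible in the two ratios: the selected sample is almost entirely early stage, at the cost of dropping roughly a quarter of the true stage I/II patients.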

