Balance diagnostics for comparing the distribution of baseline covariates between treatment groups in propensity-score matched samples.

Austin PC - Stat Med (2009)

Bottom Line: Conditional on the true propensity score, treated and untreated subjects have similar distributions of observed baseline covariates. Using this approach, matched sets of treated and untreated subjects with similar values of the propensity score are formed. Inferences about treatment effect made using propensity-score matching are valid only if, in the matched sample, treated and untreated subjects have similar distributions of measured baseline covariates.


Affiliation: Institute for Clinical Evaluative Sciences, G1 06, 2075 Bayview Avenue, Toronto, Ontario, Canada M4N 3M5. peter.austin@ices.on.ca

ABSTRACT
The propensity score is a subject's probability of treatment, conditional on observed baseline covariates. Conditional on the true propensity score, treated and untreated subjects have similar distributions of observed baseline covariates. Propensity-score matching is a popular method of using the propensity score in the medical literature. Using this approach, matched sets of treated and untreated subjects with similar values of the propensity score are formed. Inferences about treatment effect made using propensity-score matching are valid only if, in the matched sample, treated and untreated subjects have similar distributions of measured baseline covariates. In this paper we discuss the following methods for assessing whether the propensity score model has been correctly specified: comparing means and prevalences of baseline characteristics using standardized differences; ratios comparing the variance of continuous covariates between treated and untreated subjects; comparison of higher order moments and interactions; five-number summaries; and graphical methods such as quantile-quantile plots, side-by-side boxplots, and non-parametric density plots for comparing the distribution of baseline covariates between treatment groups. We describe methods to determine the sampling distribution of the standardized difference when the true standardized difference is equal to zero, thereby allowing one to determine the range of standardized differences that are plausible with the propensity score model having been correctly specified. We highlight the limitations of some previously used methods for assessing the adequacy of the specification of the propensity-score model. In particular, methods based on comparing the distribution of the estimated propensity score between treated and untreated subjects are uninformative.
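As a rough illustration of the first two diagnostics listed in the abstract, the sketch below computes a standardized difference and a variance ratio for a single covariate. It assumes the commonly used definitions (difference in means, or in prevalences, divided by the square root of the average of the two group variances); the function names and the age values are hypothetical and are not taken from the paper's data.

```python
import math
import statistics


def standardized_difference(treated, untreated):
    """Standardized difference for a continuous covariate.

    Assumed definition: (mean_t - mean_u) / sqrt((var_t + var_u) / 2),
    using sample variances.
    """
    mean_t = statistics.fmean(treated)
    mean_u = statistics.fmean(untreated)
    var_t = statistics.variance(treated)
    var_u = statistics.variance(untreated)
    return (mean_t - mean_u) / math.sqrt((var_t + var_u) / 2.0)


def standardized_difference_binary(p_treated, p_untreated):
    """Standardized difference for a binary covariate, given prevalences."""
    pooled = (p_treated * (1 - p_treated) + p_untreated * (1 - p_untreated)) / 2.0
    return (p_treated - p_untreated) / math.sqrt(pooled)


def variance_ratio(treated, untreated):
    """Ratio of sample variances, treated over untreated."""
    return statistics.variance(treated) / statistics.variance(untreated)


# Hypothetical ages in a small matched sample (illustration only).
age_treated = [60, 65, 70, 75, 80]
age_untreated = [58, 64, 70, 76, 82]

print(standardized_difference(age_treated, age_untreated))
print(variance_ratio(age_treated, age_untreated))
```

In this made-up sample the two groups have the same mean age, so the standardized difference is zero, yet the variance ratio is below one because the untreated ages are more spread out. This is the kind of imbalance that a comparison of means alone would not reveal.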


Figure 4: Density plots and cumulative distribution functions for age.

Mentions: Figure 3 displays side-by-side boxplots and quantile–quantile plots for age in both the unmatched and matched samples. We considered two different propensity-score models. The first was the originally specified propensity-score model. The second was the modification of the initial model described in Section 4.2.1, in which restricted cubic splines were used to model the relationship of the six continuous variables (age, white blood count, hemoglobin, sodium, glucose, and creatinine). Figure 4 displays non-parametric density plots and empirical cumulative distribution functions comparing the distribution of age between treated and untreated subjects in both the unmatched and matched samples. In examining Figure 3, one observes that matching on the initial propensity score has diminished differences in the distribution of age between treated and untreated patients. However, minor residual differences in the upper tail of the distribution persist in the matched sample. The side-by-side boxplots indicate that, in the matched sample, the distribution of age exhibits modestly greater variability in the untreated subjects than it does in the treated subjects. These observations are confirmed by the non-parametric density plots and empirical cumulative distribution functions displayed in Figure 4. Matching on the propensity score diminished differences in the distribution of age between treated and untreated subjects. However, slight differences in the distribution of age between the two groups persisted after matching. These residual differences in age between treated and untreated subjects were less apparent when means were compared (either directly or using standardized differences). The graphical comparisons indicate that the distribution of age is essentially identical between treated and untreated subjects in the sample obtained by matching on the modified propensity score. Thus, the graphical comparisons indicate that the modified propensity-score model is preferable to the original propensity-score model.
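The distributional comparisons described above can be approximated numerically. The following is a minimal sketch, not the paper's procedure: it builds an empirical cumulative distribution function for each group, reports the largest vertical gap between the two curves (the quantity underlying the Kolmogorov-Smirnov statistic, used here purely as a descriptive companion to the plot), and prints five-number summaries of the kind a side-by-side boxplot displays. The helper names and the age values are hypothetical.

```python
import bisect
import statistics


def ecdf(sample):
    """Return a function F where F(x) is the fraction of the sample <= x."""
    ordered = sorted(sample)
    n = len(ordered)

    def F(x):
        return bisect.bisect_right(ordered, x) / n

    return F


def max_ecdf_gap(a, b):
    """Largest absolute difference between the two empirical CDFs.

    Both CDFs are step functions, so it suffices to check observed values.
    """
    Fa, Fb = ecdf(a), ecdf(b)
    return max(abs(Fa(x) - Fb(x)) for x in set(a) | set(b))


def five_number_summary(sample):
    """Minimum, lower quartile, median, upper quartile, maximum."""
    q1, median, q3 = statistics.quantiles(sample, n=4, method="inclusive")
    return (min(sample), q1, median, q3, max(sample))


# Hypothetical ages in a small matched sample (illustration only).
age_treated = [60, 65, 70, 75, 80]
age_untreated = [58, 64, 70, 76, 82]

print(max_ecdf_gap(age_treated, age_untreated))
print(five_number_summary(age_treated))
print(five_number_summary(age_untreated))
```

With these made-up values the medians agree while the quartiles and extremes of the untreated group sit further from the centre, mirroring the pattern of equal location but greater variability that the boxplots in the text are said to show.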

