PROPENSITY SCORES IN THE PRESENCE OF EFFECT MODIFICATION
Ylian S. Liem, MD, MSc, Erasmus MC - University Medical Center Rotterdam, Rotterdam, Netherlands, Wolfgang C. Winkelmayer, MD, ScD, Brigham & Women's Hospital, Harvard Medical School, Boston, MA, John B. Wong, MD, Tufts-New England Medical Center, Boston, MA, Frank Th. De Charro, PhD, Dutch End-Stage Renal Disease Registry, Rotterdam, Netherlands, and M.G. Myriam Hunink, PhD, MD, Erasmus Medical Center, Rotterdam, Netherlands.
Purpose: Comparisons of outcomes of different treatments using observational data may be biased because of non-random treatment assignment. Propensity scores are increasingly used to control for this confounding, but alternative approaches exist. Our aim was to compare a traditional multivariable-adjusted model with a propensity score-adjusted model in estimating survival of hemodialysis (HD) versus peritoneal dialysis (PD) patients.
Methods: We developed a Cox proportional hazards regression model to estimate survival on PD relative to HD using data on 16,643 patients from the Dutch End-Stage Renal Disease Registry (RENINE). The multivariable model contained dialysis modality adjusted for age, gender, primary renal disease, center and year of start of renal replacement therapy. Several interaction terms were tested. The propensity score model predicted assignment to PD from the above covariates and all possible interaction terms. We evaluated the propensity score according to the following criteria: 1) fit and discriminative power; 2) calibration; and 3) covariate balance. We then fitted a univariable Cox model, containing only dialysis modality, stratified on the propensity score, and finally compared the relative mortality risk estimates for PD versus HD from the two models.
Results: The propensity score had an R² of 0.24 and a c-statistic of 0.75. By propensity score quintile, the mean predicted (observed) probabilities of being treated with PD were 9.1% (8.8%), 21.4% (20.7%), 32.9% (34.3%), 46.3% (46.1%) and 64.6% (64.4%). All five covariates differed significantly between HD and PD patients in the total sample, but after stratification by propensity score quintile only 3 of the 25 covariate comparisons remained significant. The multivariable-adjusted model without interaction terms and the propensity score-adjusted univariable Cox model yielded similar relative mortality risk estimates for PD compared with HD (0.99 and 0.97, respectively). These models, however, assumed no effect modification. Entering interaction terms into both models again yielded comparable relative mortality risk estimates (multivariable-adjusted model: 0.43; propensity score model: 0.44).
Conclusions: Although the propensity score showed a good fit, had a good c-statistic, calibrated well and balanced the covariates, it did not alter the estimated treatment effect in the outcome model and, once interaction terms were entered, lost its advantage of better interpretability. Our study contributes to the growing literature supporting the use of traditional multivariable regression methods unless the sample size is small and outcomes are rare.
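As an illustration of the first two evaluation criteria (discriminative power and calibration by quintile), the following is a minimal sketch in Python. It is not the authors' code: the data are simulated rather than drawn from RENINE, the single covariate (age) and its coefficients are hypothetical, the true assignment probability stands in for a fitted propensity score, and the helper names `c_statistic` and `quintile_calibration` are introduced here for illustration only. The abstract does not state how the c-statistic was computed; a rank-based (Mann-Whitney) calculation is assumed.

```python
import random
from math import exp


def c_statistic(scores, treated):
    """Probability that a randomly chosen treated patient has a higher
    score than a randomly chosen untreated patient (ties count 1/2)."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # average 1-based rank within a tie group
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    n1 = sum(treated)
    n0 = len(treated) - n1
    rank_sum = sum(r for r, t in zip(ranks, treated) if t)
    return (rank_sum - n1 * (n1 + 1) / 2) / (n1 * n0)


def quintile_calibration(scores, treated):
    """Return (mean predicted score, observed treated fraction) per quintile."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    n = len(order)
    rows = []
    for q in range(5):
        idx = order[q * n // 5:(q + 1) * n // 5]
        predicted = sum(scores[i] for i in idx) / len(idx)
        observed = sum(treated[i] for i in idx) / len(idx)
        rows.append((predicted, observed))
    return rows


# Simulated cohort (assumption: NOT registry data). Younger patients are
# made more likely to receive PD; the coefficients are arbitrary.
rng = random.Random(0)
scores, treated = [], []
for _ in range(5000):
    age = rng.gauss(60, 12)
    p_pd = 1 / (1 + exp(-(-0.8 - 0.08 * (age - 60))))
    scores.append(p_pd)  # stands in for an estimated propensity score
    treated.append(1 if rng.random() < p_pd else 0)

c = c_statistic(scores, treated)
table = quintile_calibration(scores, treated)
print(f"c-statistic: {c:.2f}")
for q, (pred, obs) in enumerate(table, 1):
    print(f"quintile {q}: predicted {pred:.1%}, observed {obs:.1%}")
```

In a well-calibrated score the predicted and observed columns track each other closely within every quintile, which is the pattern reported in the Results. Covariate balance (criterion 3) would be checked separately by comparing each covariate between HD and PD patients within each quintile.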