THEORETICAL ADVANCES IN MODELING METHODS
* Finalists for the Lee B. Lusted Student Prize
Meta-analysis of individual participant data (IPD) is increasingly used to combine data from both randomized trials and observational studies. However, an important concern in IPD meta-analysis is that outcomes are often partially or completely missing for some studies. This paper aims to: i) present a multivariate Bayesian hierarchical model that jointly addresses missing data, between-study heterogeneity, and multiple mixed (continuous and discrete) outcomes; and ii) compare the relative performance of the Bayesian approach with simpler alternatives such as multiple imputation (MI) and complete-case analysis.
We use a simulation study to assess the relative performance of alternative strategies for handling missing data across different IPD meta-analysis settings. The scenarios differ in: number of studies, level of between-study heterogeneity, proportion of missing data, missingness predictors (e.g., covariates vs. outcomes), and missingness mechanisms, including missing at random (MAR) and missing not at random (MNAR). We report performance in terms of bias, root mean square error (RMSE), and confidence interval (CI) coverage for estimating treatment effects on both continuous and binary outcomes. To illustrate these approaches, we consider a meta-analysis of randomized controlled trials comparing the effectiveness of implantable cardiac devices to treat heart failure, with missing binary and continuous outcomes (mortality, functional, and quality-of-life endpoints).
Under ideal circumstances (e.g., 20 studies, low between-study heterogeneity, and MAR conditional on observed covariates), both MI and the Bayesian joint model provided unbiased estimates and CI coverage close to the nominal 95% level. With higher proportions of missing data, strong correlation between outcomes, and each outcome MAR conditional on the other endpoint, the Bayesian joint model provided estimates closer to the true values and better CI coverage than MI methods (Figure 1). When data were MNAR (but the methods assumed MAR), the Bayesian model still provided less biased estimates than MI. In the case study, inferences about the effectiveness of alternative heart failure devices differed according to method.
The Bayesian approach performed well across a wide range of settings, and provides an appropriate tool for jointly handling the missing data, between-study heterogeneity, and correlated mixed outcomes in IPD meta-analysis.
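As a stylized illustration of the simulation setup (not the authors' code), the sketch below generates IPD for several studies with a study-level random treatment effect, imposes MAR missingness on a continuous outcome through an observed covariate, and computes a naive complete-case treatment-effect estimate. All parameter values (20 studies, true effect 0.5, between-study SD 0.1, roughly 30% missingness) are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_ipd(n_studies=20, n_per_study=200, mu=0.5, tau=0.1):
    """Simulate IPD: continuous outcome y with a study-level random
    treatment effect (between-study SD tau) and a prognostic covariate x."""
    rows = []
    for s in range(n_studies):
        theta_s = rng.normal(mu, tau)           # study-specific effect
        t = rng.integers(0, 2, n_per_study)     # 1:1 randomization
        x = rng.normal(0, 1, n_per_study)       # observed baseline covariate
        y = 0.8 * x + theta_s * t + rng.normal(0, 1, n_per_study)
        rows.append(np.column_stack([np.full(n_per_study, s), t, x, y]))
    return np.vstack(rows)                      # columns: study, t, x, y

def impose_mar(data, base_rate=0.3):
    """Make y missing at random: missingness depends only on observed x."""
    x = data[:, 2]
    p_miss = 1 / (1 + np.exp(-(np.log(base_rate / (1 - base_rate)) + x)))
    out = data.copy()
    out[rng.random(len(data)) < p_miss, 3] = np.nan
    return out

data = impose_mar(simulate_ipd())
obs = data[~np.isnan(data[:, 3])]

# Naive complete-case treatment effect, pooled across studies (no shrinkage):
cc_effect = obs[obs[:, 1] == 1, 3].mean() - obs[obs[:, 1] == 0, 3].mean()
print(round(cc_effect, 3))
```

The MI and Bayesian joint-model estimators compared in the paper would replace the final pooled difference in means; this fragment only shows the data-generating and missingness mechanisms being varied.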
Methods: We simulated a two-arm randomized clinical trial and compared the performance of four different bootstrap approaches to predict average treatment costs. In the first approach, we bootstrapped the sample and ran a GLM to estimate costs for all patients regardless of discontinuation status, as is often done in the literature. In the second method, we used predicted costs from the same GLM only for missing values. In the third method, we modified the second approach by adding a random component to the predicted costs based on the fit of the GLM. Finally, we used predicted costs with a random component for all of the patients. We repeated this exercise under varying scenarios to identify factors that may influence the predicted costs (e.g., proportion of missing data, mechanism of missingness, distribution of costs, sample size, GLM link function).
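A minimal sketch of the second and third variants follows; a log-linear cost model stands in for the GLM, and the data-generating values, 25% missingness, and MCAR mechanism are illustrative assumptions, not the study's design:

```python
import numpy as np

rng = np.random.default_rng(1)

n = 500
x = rng.normal(0, 1, n)                                  # cost predictor
cost = np.exp(1.0 + 0.5 * x + rng.normal(0, 0.4, n))     # right-skewed costs
miss = rng.random(n) < 0.25                              # 25% missing (MCAR)

def bootstrap_mean_cost(x, cost, miss, add_noise, n_boot=500):
    """One bootstrap variant: fit a log-linear cost model on observed cases,
    fill in missing costs with predictions (optionally plus a resampled
    residual), and record the mean cost in each bootstrap sample."""
    means = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, len(x), len(x))            # resample patients
        xb, cb, mb = x[idx], cost[idx], miss[idx]
        beta = np.polyfit(xb[~mb], np.log(cb[~mb]), 1)   # stand-in for GLM
        resid = np.log(cb[~mb]) - np.polyval(beta, xb[~mb])
        pred_log = np.polyval(beta, xb[mb])
        if add_noise:                                    # third approach
            pred_log = pred_log + rng.choice(resid, mb.sum())
        filled = cb.copy()
        filled[mb] = np.exp(pred_log)
        means[b] = filled.mean()
    return means

det = bootstrap_mean_cost(x, cost, miss, add_noise=False)    # second approach
noisy = bootstrap_mean_cost(x, cost, miss, add_noise=True)   # third approach
print(round(det.mean(), 3), round(det.std(), 3), round(noisy.std(), 3))
```

Comparing `det.std()` and `noisy.std()` across replications is what drives the coverage differences reported below: deterministic imputation understates the variability of the imputed costs.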
Results: We compared the observed costs with predicted costs from the 4 approaches on the basis of bias and coverage. All four approaches were similar with respect to bias, but coverage was improved when a random component was included.
Conclusions: Our study enabled us to evaluate different bootstrap procedures for handling missing cost data. Approaches that used observed data where available or incorporated a random component in the predicted costs performed better under certain scenarios.
Current decision analytic modeling guidelines suggest the use of life tables for the derivation of all-cause mortality probabilities. There are two main methods for estimating mortality probabilities: the period method, which is based on data from a single year's life table, and the cohort method, which projects future mortality rates from historical life tables. This simulation study aims to identify the impact of using cohort versus period methods on the outcomes of cost-effectiveness analyses.
A simulation study was designed based on a two-state (alive/dead) Markov model that compared a hypothetical intervention against no intervention. The model was populated with age-specific all-cause mortality probabilities estimated using the period and cohort methods. Mortality and population data were extracted from the Human Mortality Database. The cohort mortality probabilities were estimated using the Lee-Carter method. The model outcomes were total costs, total life years (LY), and incremental net benefit (INB), assuming a hypothetical threshold of $50,000/LY. The proportional distance between the INBs of the two mortality estimation methods (pINB) was the outcome of each simulation. The following parameters were simultaneously varied: discount rate (0-0.07), intervention effect (relative risk of mortality: 0.5-0.9), age at intervention (birth-40 years old), duration of intervention effect (1 year/10 years/lifelong), acute/chronic intervention, and time gap between intervention start and intervention effect (immediate/in 10 years). Simulations were conducted for two countries, one with an average increase in life expectancy (Canada) and one with a rapid increase in life expectancy (Taiwan). The impact of each parameter on the pINB was measured as the proportion of total variation explained by the parameter.
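The mechanics of the two-state model can be sketched as follows. The Gompertz-style hazard, the 15% cohort mortality improvement, the intervention's relative risk and cost, and the $50,000/LY threshold are all illustrative stand-ins, not the study's calibrated inputs (which used the Human Mortality Database and Lee-Carter projections):

```python
import numpy as np

def life_years(q, rr=1.0, disc=0.03):
    """Discounted life years from annual death probabilities q in a
    two-state (alive/dead) Markov cohort model; rr scales mortality
    (rr < 1 models an effective intervention)."""
    alive, total = 1.0, 0.0
    for t, qt in enumerate(q):
        total += alive / (1 + disc) ** t        # reward at cycle start
        alive *= 1 - min(1.0, rr * qt)          # transition to death
    return total

ages = np.arange(40, 101)
period = 0.0005 * np.exp(0.09 * (ages - 40))    # Gompertz-like period rates
cohort = period * 0.85                           # stylized future improvement

# Hypothetical intervention: lifelong RR of 0.8, one-time cost of $10,000
for q, label in [(period, "period"), (cohort, "cohort")]:
    inb = 50_000 * (life_years(q, rr=0.8) - life_years(q)) - 10_000
    print(label, round(inb))
```

Swapping `period` for `cohort` in this loop is exactly the comparison whose proportional INB difference (pINB) the study tracks across parameter combinations.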
The mortality estimation method had a large impact on the pINB across the parameter combinations (0%-63.2%). The impact was greater when varying age at intervention (21% of variation explained), discount rates (8.8%), countries (7%) and time gap between intervention start and intervention effect (8%). The impact of the magnitude of the intervention effect was less pronounced.
Substantial differences in model outcomes were observed between the cohort and period methods. Given that the magnitude and the direction of the impact of mortality estimation methods on model outcomes are multifactorial, decisions on the mortality estimation method used in economic evaluations should be made after conducting sensitivity analyses using both methods.
Methods: The use of common random numbers (CRN) is a powerful variance reduction technique for simulation, and the resulting synchronization of events across counterfactual model runs allows for causal inference. However, the structure of competing events can result in bias when CRN is used. In particular, the simulation of events with multiple, mutually exclusive, non-ordered outcomes is prone to bias under CRN. Using examples from the Maternal Health Policy microsimulation model we explore the construction of various event simulations and present solutions for removing structural bias.
Results: Constructing a typical cumulative probability distribution (CPD) allows mutually exclusive events to be simulated easily. Given an inherent ordering of outcomes (e.g., severity levels of a disease), constructing the CPD in decreasing order of severity results in unbiased simulations under CRN across counterfactual scenarios at both the aggregate and individual level. However, for outcomes with no inherent ordering (e.g., type of obstetric complication, choice of contraceptive method), constructing a typical CPD results in biased outcomes at the individual level. CRN ensures that for each event, the same point in the CPD is sampled in every simulation. Therefore, only changes in the probability of an event in the CPD (such as the reduction of one type of obstetric complication) will result in an alternate outcome. However, given a fixed random number and a static order of events in the CPD, any alternate outcome is likely to be adjacent or near to the original outcome. In other words, the chance of each alternate outcome occurring is no longer proportional to its probability, but rather is highly influenced by its arbitrary position in the CPD relative to the original outcome. To correct for this structural bias we developed a method to divide and randomly allocate outcome shares within a CPD, and demonstrate how this method resolves bias at the individual event level.
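The bias and its correction can be demonstrated with a toy example (four equally likely, non-ordered outcome types; a counterfactual that reduces type 0 from 0.25 to 0.10; 1,000 shares — all hypothetical values, and a simplified reading of the share-allocation method):

```python
import numpy as np

rng = np.random.default_rng(42)
u = rng.random(100_000)                 # common random numbers, one per event

base  = np.array([0.25, 0.25, 0.25, 0.25])   # four non-ordered outcome types
treat = np.array([0.10, 0.30, 0.30, 0.30])   # intervention reduces type 0

def sample_cpd(u, probs):
    """Standard CPD lookup: index of the interval each u falls into."""
    cum = np.cumsum(probs)
    cum[-1] = 1.0                        # guard against float sums below 1
    return np.minimum(np.searchsorted(cum, u), len(probs) - 1)

# Naive CPD under CRN: every individual who loses outcome 0 lands on the
# adjacent outcome 1, never on 2 or 3 -- biased at the individual level.
naive_b, naive_t = sample_cpd(u, base), sample_cpd(u, treat)
print(np.bincount(naive_t[naive_b == 0], minlength=4))

# Fix: divide [0, 1) into shares, allocate baseline shares randomly, then
# reassign only the freed type-0 shares proportionally across types 1-3.
n_shares = 1000
base_map = np.repeat(np.arange(4), (base * n_shares).astype(int))
rng.shuffle(base_map)
treat_map = base_map.copy()
freed = rng.permutation(np.flatnonzero(base_map == 0))[:150]  # 0.15 * 1000
treat_map[freed] = np.repeat([1, 2, 3], 50)                   # +0.05 each

seg = (u * n_shares).astype(int)
fixed_b, fixed_t = base_map[seg], treat_map[seg]
print(np.bincount(fixed_t[fixed_b == 0], minlength=4))
```

The first count shows the structural bias (all switched individuals land on type 1); the second shows the switched individuals spread across types 1-3 in proportion to their probabilities, while unswitched individuals keep their original outcome, preserving CRN synchronization.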
Conclusion: The use of CRN is an important modeling technique, but care should be taken to avoid unintended consequences and ensure that structural bias does not occur. The random allocation of shares of non-ordered outcomes within a CPD is a feasible approach to address this problem.
Purpose: The ISPOR-SMDM Modeling Good Research Practices Task Force recommends the use of half-cycle correction (HCC) to model outcomes such as costs and effectiveness calculated with discrete-time state-transition models (DTSTM). However, there is still no consensus in the modeling community on why and how to perform the correction. In addition, published studies have not compared HCC methods against the true gold standard.
Objective: To provide a theoretical foundation of HCC and compare the performance of different HCC methods in reducing errors in DTSTM outcomes both mathematically and numerically.
Method: We defined six half-cycle correction methods from the numerical integration field: Riemann sums of rectangles (left, midpoint, right) or trapezoids (trapezoidal rule), the life-table method, and Simpson's composite 1/3rd and 3/8th rules. We applied these methods to a standard three-state disease-progression Markov chain to evaluate the cost-effectiveness of a hypothetical intervention. We solved the discrete- and continuous-time (our gold standard) versions of the model analytically and derived expressions for the cumulative incidence of disease, life expectancy, discounted quality-adjusted life years, discounted disease costs, and the incremental cost-effectiveness ratio.
Results: The basis for the currently recommended HCC method, correcting by half of the reward in the first and final cycles, is the trapezoidal rule. Because the standard HCC method was derived by comparison with the trapezoidal rule rather than with the gold standard, there will always be an error. We also found situations where applying the standard HCC does more harm than good. The performance of each method depends on the function being integrated. Contrary to conventional wisdom, the approximation errors need not cancel out or become insignificant when incremental outcomes are calculated. We found that a wrong decision can be made if the more accurate method is not applied (Fig. 1A). The size of the error was vastly reduced when a shorter cycle length was selected; Simpson's 1/3rd rule was the fastest method to converge to the gold standard (Fig. 1B).
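The numerical-integration view can be illustrated on a one-compartment toy trace (constant hazard 0.1, 40 one-year cycles — illustrative values, not the paper's three-state model), where the continuous-time life expectancy is available in closed form:

```python
import numpy as np

lam, T = 0.1, 40                         # constant hazard, 40 one-year cycles
t = np.arange(T + 1)
s = np.exp(-lam * t)                     # cohort trace: share alive each cycle
truth = (1 - np.exp(-lam * T)) / lam     # gold standard: integral of exp(-lam*t)

left  = s[:-1].sum()                             # no correction, cycle start
right = s[1:].sum()                              # no correction, cycle end
trap  = s.sum() - (s[0] + s[-1]) / 2             # standard HCC (trapezoidal)
simp  = (s[0] + s[-1] + 4 * s[1:-1:2].sum() + 2 * s[2:-1:2].sum()) / 3

for name, val in [("left", left), ("right", right),
                  ("trapezoid", trap), ("Simpson 1/3", simp)]:
    print(f"{name:12s} LY={val:.4f} error={val - truth:+.2e}")
```

On this trace, both uncorrected sums are off by roughly half a life year in opposite directions, the trapezoidal HCC shrinks but does not eliminate the error, and composite Simpson's 1/3rd rule reduces it by several further orders of magnitude, mirroring the convergence ordering reported above.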
Conclusion: Cumulative outcomes in DTSTMs are prone to errors that can be reduced with more accurate methods like Simpson's rules. We clarified several misconceptions and provided recommendations and algorithms for practical implementation of these methods.