ORAL ABSTRACTS: OPTIMIZING DECISION MAKING
Mark Eckman, MD, MS
University of Cincinnati, Division of General Internal Medicine and Center for Clinical Effectiveness
Professor of Medicine
Purpose: The ability of Markov Chain Monte Carlo (MCMC) Bayesian network meta-analysis (NMA) to rank treatments is one of its most appealing features; however, rankings of treatments may be misleading if they are not considered together with estimates of their relative effectiveness. The purpose of this work was to explore the robustness of the rank probabilities obtained from Bayesian NMA by calculating them under increasingly stringent thresholds for the relative effect that defines a treatment effect difference.
Method: We modified the usual rankings procedure for Bayesian NMA to require that two MCMC samples of treatment effects differ by a specified non-zero amount before one effect would be considered better than the other. On the odds ratio scale, we examined thresholds for the relative effect from 0.6 (a large difference) up to 1 (any difference). We applied this revised rankings procedure to all published systematic reviews using NMA from the field of cardiovascular medicine that had trial-level binary data available. We reran all the NMAs and, in each one, examined the effect of increasingly stringent decision thresholds on the rank probabilities of the two treatments identified as being the best.
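The thresholded ranking idea can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' code: the function name `prob_best`, the convention that lower odds ratios are better, and the rule that a treatment must beat every other treatment by the threshold to count as best are all assumptions made here, and the simulated draws are hypothetical.

```python
import numpy as np

def prob_best(log_or_samples, threshold=1.0):
    """Fraction of MCMC draws in which each treatment beats all others.

    log_or_samples: array (n_draws, n_treatments) of log odds ratios versus
    a common reference; lower is assumed to be better (fewer events).
    threshold: odds-ratio cut-off in (0, 1]. Treatment i beats treatment j
    in a draw only if OR_i / OR_j < threshold; 1.0 means any difference.
    """
    samples = np.asarray(log_or_samples, dtype=float)
    n_trt = samples.shape[1]
    # diff[d, i, j] = log(OR_i / OR_j) in draw d
    diff = samples[:, :, None] - samples[:, None, :]
    beats = diff < np.log(threshold)
    # A treatment is never compared with itself
    idx = np.arange(n_trt)
    beats[:, idx, idx] = True
    best = beats.all(axis=2)          # shape (n_draws, n_treatments)
    return best.mean(axis=0)

# Hypothetical posterior draws for three treatments (illustration only)
rng = np.random.default_rng(0)
draws = rng.normal([0.0, -0.2, -0.3], 0.15, size=(4000, 3))
for thr in (1.0, 0.8, 0.6):
    print(thr, prob_best(draws, thr).round(3))
```

With a threshold below 1, some draws have no treatment that qualifies as best, so the probabilities need not sum to 1; that shortfall is one way to read the fragility the abstract describes.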
Result: We included 14 systematic reviews, having a median of 20 randomized trials and 9 treatments. The best treatments had rank probabilities that ranged from 38% to 85.3%. The effect of increasing the stringency of the decision thresholds on the probability of a treatment being best varied across reviews, with the probability of being best changing less than 20% in the most robust settings, but decreasing to almost 0% in the least robust.
Conclusion: Rank probabilities can be fragile to changes in the decision threshold used to claim that one treatment is more effective than another. Our revised procedure that includes these thresholds in the calculation of rankings may aid their interpretation and use in clinical practice.
Figure: Example of rapidly decreasing probability of a treatment being best when clinically important thresholds are considered.
Methods: Data consisted of 1,045 cases from Johns Hopkins University who were diagnosed with localized prostate cancer between 1980 and 2015 and experienced PSA-R after RP. Of these, 52.1% received some form of ST, consisting of radiation therapy, hormone therapy, or both. We used marginal structural models (MSM) to estimate the benefit of ST administered at PSA-R while accounting for time-dependent selection into ST. The fitted MSM was then used as the basis for a simulation model that projected times to metastasis in the absence and presence of ST and time to other-cause death. Benefit was estimated as the reduction in the fraction of cases metastasizing over 10 years, and harm as the fraction of cases overtreated, in the sense that without ST they would not have reached the point of clinical metastasis in their lifetimes.
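The benefit and harm definitions above can be made concrete with a toy simulation. This is a minimal sketch, not the fitted model: it assumes constant (exponential) hazards, a proportional effect of ST on the metastasis hazard, and independence between metastasis and other-cause death; the function name and all rate values are hypothetical.

```python
import numpy as np

def benefit_and_overtreatment(rate_met, hazard_ratio, rate_death,
                              horizon=10.0, n=200_000, seed=0):
    """Simulate benefit and overtreatment under simple exponential hazards.

    rate_met: yearly metastasis hazard without ST (hypothetical input).
    hazard_ratio: multiplicative effect of ST on the metastasis hazard.
    rate_death: yearly other-cause mortality hazard (hypothetical input).
    """
    rng = np.random.default_rng(seed)
    # Common random numbers: the same draw drives both treatment scenarios
    e = rng.exponential(1.0, n)
    t_met_no_st = e / rate_met
    t_met_st = e / (rate_met * hazard_ratio)
    t_death = rng.exponential(1.0 / rate_death, n)

    # Metastasis counts only if it occurs within the horizon and before death
    met_no_st = (t_met_no_st <= horizon) & (t_met_no_st < t_death)
    met_st = (t_met_st <= horizon) & (t_met_st < t_death)

    benefit = met_no_st.mean() - met_st.mean()
    # Overtreated: would never have metastasized in their lifetime without ST
    overtreated = (t_met_no_st >= t_death).mean()
    return benefit, overtreated

benefit, overtreated = benefit_and_overtreatment(0.06, 0.41, 0.03)
print(f"benefit={benefit:.3f}, overtreated={overtreated:.3f}")
```

Under these assumptions, lowering `rate_met` (a lower-risk subgroup) shrinks the benefit and raises the overtreated fraction, which is the direction of the subgroup pattern the abstract reports.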
Results: The median follow-up time after PSA-R was 5 years and 26% of the cohort experienced metastatic progression. The hazard ratio associated with ST was 0.41 (95% CI: 0.31-0.55), indicating an almost 60% reduction in the risk of metastasis associated with ST. The projected 10-year cumulative risk of metastasis in the absence and presence of ST was 43% and 23%, respectively. The benefit of treating all cases at PSA-R was a 20 percentage point reduction in the 10-year risk of metastasis, with 30% overtreated (harm/benefit=1.5). Men with Gleason score 6 or below had 9% benefit with 61% overtreated (harm/benefit=6.95), and men with Gleason score 6 or below and 5+ years from RP to PSA-R had 5% benefit with 74% overtreated (harm/benefit=14.9).
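For readers checking the arithmetic, the overall figures follow directly from the reported percentages. The subgroup ratios recomputed from the rounded percentages come out slightly below the reported values; the gap is presumably due to rounding of the published percentages, which is an assumption here rather than something the abstract states.

```latex
\begin{align*}
\text{benefit} &= 0.43 - 0.23 = 0.20, &
\frac{\text{harm}}{\text{benefit}} &= \frac{0.30}{0.20} = 1.5 \\
\text{Gleason} \le 6: \quad & &
\frac{0.61}{0.09} &\approx 6.8 \quad (\text{reported } 6.95) \\
\text{Gleason} \le 6,\ \ge 5 \text{ years}: \quad & &
\frac{0.74}{0.05} &= 14.8 \quad (\text{reported } 14.9)
\end{align*}
```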
Conclusion: Management of PSA-R should take into account both the benefits and the harms of ST. Benefit-harm tradeoffs vary widely with patient risk. Selective ST approaches may improve harm-benefit tradeoffs relative to treating all patients at PSA-R.
Purpose: Most actors in U.S. health policy would agree that low-value care should be avoided, but value is not consistently defined. Physician groups, such as those included in the Choosing Wisely Campaign, and other researchers have created lists of services that should be avoided in value-based frameworks, but these lists have typically focused on services that both increase costs and do not show significant clinical benefit. We sought to identify low-value services that improve health but are not worth their additional costs based on cost-effectiveness evidence.
Methods: We searched the Tufts Cost-Effectiveness Analysis Registry (CEAR) online database for published cost-effectiveness studies with incremental cost-effectiveness ratios (ICERs) >$100,000/quality-adjusted life year (QALY). Our search terms included highly prevalent disease areas and widely used and/or expensive procedures (Footnote). We included cost-utility studies for the U.S. setting published between 2000 and 2014 and recorded the ICER, prevention stage, intervention type, disease classification, and quality score (ranging from 1 [worst] to 7 [best]) from the CEAR.
Results: We found 102 published cost-effectiveness studies for healthcare services in the U.S. with ICERs >$100,000/QALY, of which 67 had ICERs greater than $150,000/QALY and 39 greater than $250,000/QALY. ICERs ranged from $110,000 to $5,400,000/QALY (median of $210,000/QALY). Study quality scores ranged from 3 to 6 (median of 5.0). About half of the ICERs were for preventive services (19% primary prevention, 34% secondary prevention). The most common intervention types among these services were screening (30%), pharmaceutical (25%), and diagnostic (24%); the most common disease classifications were cancers (30%), cardiovascular diseases (18%), and infectious diseases (9%). Among the 45 low-value services on the initial Choosing Wisely Campaign list, only 5 were also on our list.
Conclusions: Reducing low-value care will require not only moving away from healthcare services that harm health (or have no significant impact on health) but also those that improve health but at a cost that is unfavorable per conventional cost-effectiveness standards. Our list aids this process by using ICERs as a systematic metric for defining value, leveraging publicly available and comprehensive cost-utility data from CEAR, and focusing on a previously unaddressed set of services (health-improving but cost-ineffective). Even with our limited search and not having access to the complete CEAR database, we found 97 potentially low-value services per cost-effectiveness standards not included in the initial Choosing Wisely Campaign list.
Method: We develop a multi-period treatment budget allocation model to evaluate the trade-offs of treatment prioritization guidelines including first-come first-served, priority to patients with the most severe disease, priority to patients with the most severe disease with age-stratification, priority to patients in order of the incremental cost-effectiveness ratio (ICER) of treatment, and a priority sequence identified through optimization to maximize population lifetime discounted net monetary benefit (NMB). For the case of hepatitis C, we compare prioritization guidelines in terms of the number of individuals treated, the number of individuals with compensated cirrhosis, the number of individuals who progress to end-stage liver disease (ESLD), population total quality-adjusted life-years (QALYs), and NMB.
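To make the ICER and NMB quantities concrete, the sketch below computes both per subgroup and allocates a fixed budget in ICER order. It is deliberately a single-period simplification and not the multi-period optimization model described above: it ignores what happens to subgroups that wait. The subgroup names, sizes, costs, QALY gains, and the willingness-to-pay value are all hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Subgroup:
    name: str
    size: int            # people awaiting treatment
    treat_cost: float    # budget consumed per person treated
    delta_cost: float    # incremental lifetime cost per person treated
    delta_qaly: float    # incremental QALYs per person treated

def icer(g):
    """Incremental cost-effectiveness ratio: cost per QALY gained."""
    return g.delta_cost / g.delta_qaly

def nmb(g, wtp):
    """Net monetary benefit per person at willingness-to-pay wtp."""
    return wtp * g.delta_qaly - g.delta_cost

def allocate(groups, budget, priority):
    """Treat subgroups in ascending order of priority(g) until funds run out."""
    remaining = budget
    treated = {}
    for g in sorted(groups, key=priority):
        n = min(g.size, int(remaining // g.treat_cost))
        treated[g.name] = n
        remaining -= n * g.treat_cost
    return treated

# Hypothetical subgroups (illustration only)
GROUPS = [
    Subgroup("mild", 1000, 50_000, 40_000, 1.0),
    Subgroup("moderate", 600, 50_000, 30_000, 2.0),
    Subgroup("severe", 300, 50_000, 20_000, 2.5),
]

by_icer = allocate(GROUPS, 25_000_000, priority=icer)
print(by_icer)
print({g.name: nmb(g, 100_000) for g in GROUPS})
```

Because the ordering here depends only on each subgroup's own ICER, it carries no information about the cost of delaying the others, which is the limitation of ICER-based prioritization that the abstract highlights.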
Result: First-come first-served treats more people at lower near-term risk of disease progression or complications. When age-stratification is included, priority to younger patients (compared to older patients with the same disease severity) results in fewer cases of disease progression and/or disease-related complications because of higher competing mortality risks faced by older patients. A guideline developed from maximizing the population lifetime discounted NMB in a multi-period framework explicitly accounts for the trade-offs in the timing of prioritizing subgroups including the consequence of requiring other specific subgroups to wait longer for treatment and the expected QALYs lost from potential disease progression. In contrast, prioritization based on ICER does not incorporate the relative consequences of waiting across subgroups. In the case of hepatitis C treatment prioritization, the optimization strategy yields the greatest population QALYs and NMB, though prioritization by disease severity prevents more cases of ESLD.
Conclusion: Explicit prioritization can improve population health outcomes. Differences in outcomes between prioritization guidelines increase when the available budget is smaller. Prioritizing on ICER does not necessarily maximize QALYs over multiple cohorts because its allocation of treatment resources to one group does not account for the optimal timing of resources for other groups. Determining the optimal prioritization guideline is important in terms of care delivery and for evaluating alternative prioritization guidelines.
Method: We developed two analogous VOI decision models for antiplatelet-treated post-MI STEMI patients receiving primary PCI, and chronologically assessed how cardiovascular risk evidence levels, drug costs, and PGx assay cost impacted the expected value of perfect information (EVPI) over time. Modeled comparators included PGx-guided therapy, concomitant clopidogrel-proton pump inhibitor therapy, and clopidogrel, prasugrel, and ticagrelor monotherapies. For clopidogrel PGx and DDI, we derived chronological estimates of the risk for major adverse cardiovascular events (MACE) from cumulative meta-analysis of observational studies specific to PCI patients. Drug costs were wholesale acquisition costs over the past 6.75 years. We assumed PGx assay costs decreased from $500 to $100 over 6.75 years. The primary outcome for both models was the EVPI per patient over 6.75 years. The secondary outcome for both models was the clinical EVPI (costs set to zero) per patient over 6.75 years.
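For reference, per-patient EVPI is conventionally estimated from probabilistic simulation output as the expected net benefit of choosing the best strategy in every simulation minus the net benefit of the strategy that is best on average. The sketch below shows that calculation only; the function name and the simulated net-benefit values are hypothetical and are not taken from the models described above.

```python
import numpy as np

def evpi(net_benefit):
    """Per-patient expected value of perfect information.

    net_benefit: array (n_simulations, n_strategies) of net monetary
    benefit for each strategy in each probabilistic simulation.
    """
    nb = np.asarray(net_benefit, dtype=float)
    value_with_perfect_info = nb.max(axis=1).mean()   # pick best per draw
    value_with_current_info = nb.mean(axis=0).max()   # pick best on average
    return value_with_perfect_info - value_with_current_info

# Hypothetical simulation output for three strategies (illustration only)
rng = np.random.default_rng(1)
sims = rng.normal([10_000, 10_200, 10_100], [800, 900, 700], size=(5000, 3))
print(round(evpi(sims), 2))
```

EVPI computed this way is never negative, and it shrinks toward zero as one strategy comes to dominate across simulations, which is consistent with the decline over time reported as evidence accumulated.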
Result: The EVPIs for the PGx and DDI decision problems were generally similar and tended to decrease over time as evidence accumulated and costs evolved, decreasing from approximately $900/patient in both analyses in 2009 to $200/patient (PGx) and $400/patient (DDI) by the end of 2015. PGx-guided antiplatelet therapy became the preferred strategy as MACE risk uncertainty decreased, while DDI became less favorable as MACE risk uncertainty decreased. There was greater simulation uncertainty (thus higher EVPI) when costs were not considered, as prasugrel (~30%) or ticagrelor (~50%) tended to provide the most net benefit after the comparatively lower cost of off-patent clopidogrel was ignored. The EVPI in both models was most impacted by: a) the 2012 expiration of clopidogrel patent protection, and b) the 2011 introduction of ticagrelor.
Conclusion: American Heart Association guidelines refrain from recommending both CYP2C19 genotyping and avoidance of proton pump inhibitors for clopidogrel patients in the absence of randomized controlled trial evidence. Conversely, clopidogrel’s FDA label recommends avoidance of both PGx and DDI drug-attenuating interactions. Our findings suggest that the evidence levels for clopidogrel PGx and DDI are similar. Chronological evidence level quantification using VOI analysis may be useful for informing pharmacogenomic testing clinical guidelines and drug labeling decisions.
Purpose: Current clinical guidelines recommend thrombolysis with alteplase (tPA) within 4.5 hours of acute ischemic stroke onset. We quantified the potential value of new research in patients treated with tPA in the 4.5-6.0 hour window after stroke onset and determined the optimal size of a future trial using value of information analysis.
Method: Expected value of partial perfect information and expected value of sample information analyses (EVPPI and EVSI) were conducted using a previously developed probabilistic acute stroke Markov model. Stroke outcome was characterized using the modified Rankin Scale (mRS), which ranges from 0 (no stroke symptoms) to 6 (death); in the model, mRS outcomes determine discounted lifetime stroke costs and quality-adjusted life years (QALYs). Data for mRS distributions in patients 4.5-6.0 hours since stroke onset for tPA (n=576) and placebo (n=543) were obtained from eight pooled RCTs, with odds ratios for good outcome (mRS 0-1) of 1.22 (95% CI: 0.92-1.61) and death (mRS 6) of 1.49 (95% CI: 1.00-2.21). We parameterized mRS outcomes for tPA and placebo using Dirichlet distributions. EVSI was quantified with net monetary benefit (assuming willingness-to-pay for health=$100,000/QALY). We calculated discounted population-level EVSI by multiplying per-person EVSI by the annual number of stroke patients in the U.S. eligible for tPA in the 4.5-6.0 hour treatment window (115,572) and assuming a 10-year timeframe of treatment use. Study costs were based on administrative costs (fixed costs=$500,000, per-person variable costs=$2,000) and the costs of tPA (per-person costs=$6,720, treatment group only).
Result: The base-case lifetime cost-effectiveness analysis showed that tPA was dominated by placebo (i.e., more costly and less effective) in this patient group. EVPPI for mRS distributions was $1,003 per person. Based on EVSI, the optimal sample size of a new trial collecting data on tPA efficacy (quantified by mRS distributions) in these patients would be 5,600 across study arms with expected population-level societal returns (EVSI minus study costs) of $68.7 million (Figure 1).
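The cost side of the "EVSI minus study costs" calculation can be reproduced from the stated inputs. The sketch below is illustrative only: it assumes equal allocation between arms (so half of participants receive tPA), a 3% annual discount rate, and discounting that starts in year 0, none of which the abstract specifies. The function names are hypothetical, and per-person EVSI is left as an input because the abstract does not report it.

```python
def trial_cost(n_total, fixed=500_000, variable=2_000, tpa=6_720,
               frac_treated=0.5):
    """Total study cost; frac_treated=0.5 assumes equal allocation."""
    return fixed + n_total * variable + n_total * frac_treated * tpa

def discounted_population(annual_n=115_572, years=10, rate=0.03):
    """Discounted number of eligible patients over the treatment timeframe.

    The 3% rate and the year-0 start are assumptions made for this sketch.
    """
    return sum(annual_n / (1 + rate) ** t for t in range(years))

def expected_net_gain(evsi_per_person, n_total):
    """Population EVSI minus study cost for a trial of size n_total."""
    return evsi_per_person * discounted_population() - trial_cost(n_total)

print(trial_cost(5_600))                 # study cost for the reported optimum
print(round(discounted_population()))    # discounted eligible population
```

Under the equal-allocation assumption, a 5,600-person trial would cost about $30.5 million, so the reported net return implies a considerably larger population-level EVSI at that sample size.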
Conclusion: Expanding research attention to the 4.5-6.0 hour time window for tPA treatment of acute ischemic stroke patients is justified, as the expected returns are substantial. Even a relatively large trial in which more information on treatment efficacy based on mRS scores is collected would represent good value for information. Results were sensitive to willingness-to-pay for health, timeframe of treatment use, and variable study cost inputs.