Category Reference for Presentations
AHE: Applied Health Economics | DEC: Decision Psychology and Shared Decision Making
HSP: Health Services and Policy Research | MET: Quantitative Methods and Theoretical Developments
* Candidate for the Lee B. Lusted Student Prize Competition
Methods: Using data from the Health Survey for England (HSE) and the National Health Measurement Study (NHMS), we develop a multivariate ordered probit (MVOP) model for the analysis of the EQ-5D responses and compare its performance against other approaches proposed in the literature, such as response mapping (e.g., multinomial logit, ML) and univariate regression models applied directly to the EQ-5D index score. Goodness-of-fit assessment of the models is carried out using the Deviance Information Criterion (DIC), while their in-sample and out-of-sample predictive abilities (crucial when developing mapping algorithms) are assessed using Bayesian proper scoring rules. Departing from measures based on the predicted mean, such as the (root) mean squared error, scoring rules instead exploit the whole posterior predictive distribution of the model, thus reflecting both central tendency and uncertainty in the prediction. The analysis is implemented within a Bayesian framework.
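As a minimal sketch of how a proper scoring rule uses the whole posterior predictive distribution rather than only its mean, consider the logarithmic score applied to hypothetical posterior predictive draws for a single EQ-5D dimension (the draws and probabilities below are invented for illustration, not the study's output):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical posterior predictive draws for one EQ-5D dimension
# (severity levels 1-3), as might come from an MCMC sampler.
draws = rng.choice([1, 2, 3], size=2000, p=[0.6, 0.3, 0.1])

def log_score(draws, observed):
    """Logarithmic scoring rule: log of the posterior predictive
    probability assigned to the observed category (higher is better).
    Unlike RMSE, it rewards the whole predictive distribution."""
    p = np.mean(draws == observed)
    return np.log(p) if p > 0 else -np.inf

# A category the model considers likely scores better than a rare one.
print(log_score(draws, observed=1), log_score(draws, observed=3))
```

Averaging such scores over held-out respondents gives the kind of out-of-sample predictive-ability figures the abstract compares.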
Results: The MVOP fits the two independent datasets better (DIC: 15,145 for the NHMS and 45,550 for the HSE) than the ML (DIC: 15,703 for the NHMS and 47,140 for the HSE) and the independent ordered probit for each dimension (DIC: 15,720 for the NHMS and 45,550 for the HSE). Assessment of their posterior predictive distributions shows that the MVOP has better coverage of the central tendency measure (in-sample validation) and better out-of-sample predictive ability (0.531 for the MVOP vs 0.513 for the independent univariate ordered probit vs 0.481 for the ML).
Conclusions: Explicit modelling of both the correlation between the responses on each of the five dimensions of the EQ-5D and the natural ordering of the severity levels within each dimension yields more accurate predictions. Modelling at the response level, rather than at the index-score level, facilitates a more generalisable assessment of the EQ-5D responses that is not confounded by the valuation set used in each country.
Purpose: Cardiac implantable electronic device (CIED) leads fail stochastically, requiring immediate implantation of new lead(s). Because the total number of concurrently implanted leads (both functioning and failed) is subject to a maximum (i.e., five leads according to current guidelines), whenever a lead fails it may be beneficial to extract this lead and/or any previously abandoned leads. Extraction, however, carries small but real life-threatening risks that increase with lead dwell time. Therefore, a tradeoff exists between maintaining space for new leads and avoiding risky extractions. Furthermore, surgical lead procedures involve a risk of infection, and if an infection occurs, all implanted leads must be extracted. Hence, choosing to leave leads in place at the time of failure may result in risky, mandatory extractions. The purpose of this study is to determine a patient-specific extraction policy that maximizes the expected lifetime of a single-chamber pacemaker patient using a Markov decision process (MDP) model.
Method: We develop an MDP model to dynamically make extraction decisions at the time of lead failures as a function of patient age and all lead ages. We also simulate this process to obtain prediction intervals on measures of interest, including the expected patient lifetime and the likelihood of CIED-related death (as opposed to natural causes). Finally, we conduct comparisons to three heuristics commonly used in practice.
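A stylized dynamic-programming sketch of the kind of recursion such an MDP solves, with the state reduced to the number of leads in place; every parameter below is an assumption invented for illustration, not a clinical input of the model:

```python
from functools import lru_cache

# Toy finite-horizon recursion: state = number of leads in place when a
# lead fails; actions are "abandon" the failed lead or "extract" it.
MAX_LEADS = 5            # assumed cap on concurrently implanted leads
HORIZON = 20             # number of lead-failure decision epochs
P_DEATH_EXTRACT = 0.02   # assumed procedural mortality of extraction
YEARS_PER_EPOCH = 4.0    # assumed mean time between lead failures

@lru_cache(maxsize=None)
def expected_life(t, n):
    """Optimal expected remaining lifetime (years) at failure epoch t
    with n leads in place (one of which has just failed)."""
    if t == HORIZON:
        return 0.0
    # Abandon: the failed lead stays and a new one is added (needs a slot).
    abandon = (YEARS_PER_EPOCH + expected_life(t + 1, n + 1)
               if n < MAX_LEADS else float("-inf"))
    # Extract: survive the procedure with prob 1 - P_DEATH_EXTRACT,
    # then replace the lead without consuming a slot.
    extract = (1 - P_DEATH_EXTRACT) * (YEARS_PER_EPOCH
                                       + expected_life(t + 1, n))
    return max(abandon, extract)

print(round(expected_life(0, 1), 2))
```

The full model additionally tracks individual lead ages, dwell-time-dependent extraction risk, and infection-triggered mandatory extractions; this sketch only shows the abandon-versus-extract tradeoff at its core.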
Results: Under the optimal policy, the extraction decision for each lead depends only on its age, the patient's age, its rank among the lead ages, and the total number of implanted leads, i.e., the decision does not depend on the exact ages of all implanted leads. Figure 1 illustrates the optimal lead maintenance policy for a specific single-chamber pacemaker patient. Compared to the heuristic policies, the optimal policy significantly decreases CIED-related deaths and increases the expected lifetime, e.g., under the policy in Figure 1, a 60-year-old patient with failed leads of ages 20, 17, 12, and 2 observes an increase (decrease) of up to 1.5 years (7%) in their expected lifetime (likelihood of CIED-related death).
Conclusion: Cardiac leads are often referred to as “the weakest link” in implantable cardiac device treatment. Despite its importance, lead maintenance varies widely from practice to practice. We develop an approach that helps clinicians make patient-specific lead extraction/abandonment decisions optimally.
Methods: EVSI measures the expected net health gains from conducting a new research study given a proposed study design. EVSI relies on a synthesis of the current evidence available on treatment efficacy and a cost-effectiveness model. Network Meta-Analysis (NMA) pools evidence on the relative efficacy of multiple competing health technologies that have been compared in Randomised Controlled Trials (RCTs) forming a connected network of comparisons. The results obtained from NMA provide a coherent basis on which to make comparisons across the entire set of treatments, and NMA is now commonly used to inform decision models to identify the most cost-effective treatment.
We describe methods to evaluate EVSI when the efficacy outcome is binary and the net benefit function is linear on the absolute probability scale. We distinguish between absolute effects (used in the decision model) and relative effects (which the RCT provides information on). The methods allow for heterogeneity in the existing NMA evidence, which forms a hierarchical prior for the result from the new study. We view this hierarchical prior structure as data so that we can obtain a posterior, given new data, in closed form. We use a Taylor series approximation to obtain the updated expectation of the net benefit given new data, without needing an inner simulation step.
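The key computational idea, that with a conjugate structure the posterior expected net benefit depends on new binary data only through a closed-form update, so EVSI needs only an outer simulation loop, can be sketched with a simple Beta-Binomial stand-in. All numbers are invented, and this plain conjugate update replaces, rather than reproduces, the paper's hierarchical NMA prior and Taylor-series step:

```python
import numpy as np

rng = np.random.default_rng(1)

# Two treatments with binary outcomes; net benefit assumed linear in the
# response probability, so E[NB | data] depends on the simulated trial
# only through the posterior mean of p. Priors are illustrative.
priors = {"A": (12.0, 8.0), "B": (10.0, 10.0)}  # Beta(alpha, beta)
n_new = 100      # proposed sample size per arm
n_outer = 5000   # outer Monte Carlo loop over future trial results

current_best = max(a / (a + b) for a, b in priors.values())

post_best = []
for _ in range(n_outer):
    means = []
    for a, b in priors.values():
        p = rng.beta(a, b)             # draw a plausible "true" p
        x = rng.binomial(n_new, p)     # simulate the new trial's data
        means.append((a + x) / (a + b + n_new))  # closed-form posterior mean
    post_best.append(max(means))       # best decision after the trial

evsi = np.mean(post_best) - current_best   # value of the proposed study
print(f"EVSI (per-patient, probability units): {evsi:.4f}")
```

Because the posterior mean is available in closed form, no inner simulation over model parameters is required, which is the same computational saving the methods above achieve in the heterogeneous NMA setting.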
Results: We illustrate the approach with an example: a network meta-analysis and cost-effectiveness analysis of 6 competing treatments for bipolar disorder, used to identify the optimal number of arms and the sample size per arm to include in a new study to inform this decision.
Conclusions: EVSI can be a valuable tool to assist in the prioritisation and optimal design of new research studies when there are multiple competing technologies.
Method: We extend existing bivariate meta-analysis methods to simultaneously synthesize multiple index tests. The proposed methods respect the natural grouping of data by studies, account for the within-study correlation (induced because tests are applied to the same participants) between the tests’ true-positive rates (TPRs) and between their false-positive rates (FPRs), and allow for between-study correlations between TPRs and FPRs (such as those induced by threshold effects). We focus mainly on algorithms in the Bayesian setting, using discrete (binomial and multinomial) likelihoods. We use as an example a meta-analysis of 11 studies on the screening accuracy of detecting Down syndrome in liveborn infants using two tests: shortened humerus (arm bone), and shortened femur (thigh bone). Secondary analyses included an additional 19 studies on shortened femur only.
Result: In the application, separate and joint meta-analyses yielded very similar estimates. For example, in models using the discrete likelihood, the summary TPR for a shortened humerus was 35.3% (95% credible interval [CrI]: 26.9 to 41.8%) with the novel method, and 37.9% (CrI: 27.7 to 50.3%) when shortened humerus was analyzed on its own. The corresponding numbers for the summary FPR were 4.9% (CrI: 2.8 to 7.5%) and 4.8% (CrI: 3.0 to 7.4%).
However, when calculating comparative accuracy, joint meta-analyses resulted in shorter credible intervals than separate meta-analyses for each test. In analyses using the discrete likelihood, the difference in the summary TPRs was 0.0% (-8.9 to 9.5%; TPR higher for shortened humerus) with the novel method versus 2.6% (-14.7 to 19.8%) with separate meta-analyses. The standard deviation of the posterior distribution of the difference in TPR with joint meta-analyses was half that with separate meta-analyses.
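The halving of the posterior SD for the difference is exactly what a positive within-study correlation does to the variance of a difference. A back-of-envelope check, with invented posterior SDs and an assumed correlation rather than the study's values:

```python
import numpy as np

def sd_of_difference(sd1, sd2, rho):
    """SD of (TPR1 - TPR2) when the two summary TPRs have posterior
    SDs sd1 and sd2 and posterior correlation rho:
    Var(d) = sd1**2 + sd2**2 - 2*rho*sd1*sd2."""
    return np.sqrt(sd1**2 + sd2**2 - 2 * rho * sd1 * sd2)

sd_h, sd_f = 0.046, 0.058   # invented posterior SDs for the two tests

joint = sd_of_difference(sd_h, sd_f, rho=0.8)     # joint model: correlated
separate = sd_of_difference(sd_h, sd_f, rho=0.0)  # separate models: rho = 0
print(joint, separate)  # the joint SD is roughly half the separate one
```

Separate meta-analyses implicitly set rho to zero, so the subtraction cannot exploit the shared within-study sampling error; the joint model can.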
Conclusion: The joint meta-analysis of multiple tests is feasible. It may be preferable to separate analyses for estimating measures of the comparative accuracy of diagnostic tests, which are of primary interest when parameterizing models that compare diagnostic strategies. Simulation and empirical analyses are needed to better define the role of the proposed methodology.
Purpose: To optimize recommendations for biopsy after mammography using random forests and maximum expected utility.
Methods: We used a dataset of 62,219 mammographic findings matched with cancer registry data to construct a random forest estimating the probability that each finding is malignant (positive). Random forests consist of an ensemble of classification trees constructed using randomly resampled data and randomly selected subsets of predictor variables, tuned to improve out-of-sample prediction. We used patient demographic risk factors, radiologist-observed standardized descriptors from the Breast Imaging-Reporting and Data System (BI-RADS) lexicon, radiologist subjective opinion (BI-RADS category 0-5, indicating increasing likelihood of malignancy), and the eventual outcome (benign/malignant) of each finding to recursively partition the data into groups with different probabilities of malignancy. We applied previously reported estimates of the utilities associated with false positives, true positives, and false negatives (relative to true negatives) to calculate the expected utility associated with different thresholds, and used the threshold that maximizes expected utility to determine the "optimal" random forest.
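A sketch of the fit-then-threshold pipeline on synthetic data, using sklearn's RandomForestClassifier as a stand-in; the utilities and every number below are placeholders, not the previously reported estimates or the study's data:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Placeholder utilities relative to a true negative (TN = 0); a missed
# cancer (FN) is costed far more heavily than an unnecessary biopsy (FP).
U_TP, U_FP, U_FN, U_TN = 0.8, -0.05, -1.0, 0.0

# Synthetic stand-in for the mammography findings (about 10% positive).
X, y = make_classification(n_samples=4000, n_features=10, weights=[0.9],
                           random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

rf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)
p_mal = rf.predict_proba(X_te)[:, 1]   # predicted P(malignant) per finding

def expected_utility(threshold):
    """Mean utility per finding if biopsy is recommended at p >= threshold."""
    biopsy = p_mal >= threshold
    return (np.sum(biopsy & (y_te == 1)) * U_TP
            + np.sum(biopsy & (y_te == 0)) * U_FP
            + np.sum(~biopsy & (y_te == 1)) * U_FN
            + np.sum(~biopsy & (y_te == 0)) * U_TN) / len(y_te)

thresholds = np.linspace(0.01, 0.99, 99)
best = max(thresholds, key=expected_utility)
print(f"utility-maximizing biopsy threshold: {best:.2f}")
```

With a heavy false-negative penalty relative to a false positive, the utility-maximizing threshold lands well below 0.5, mirroring how the study's optimal threshold (0.4%) falls below the conventional 2% operating point.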
Results: ROC curves were constructed from the BI-RADS categories assigned by the radiologists and from the predicted malignancy probabilities of the random forest (Figure 1). The radiologists' operating point is conventionally placed at BI-RADS category 3, which corresponds to the threshold above which biopsy would be recommended (approximately 2% likelihood of malignancy). The random forest improved AUC overall (0.948 vs. 0.935); at the 2% classification threshold, the forest improved sensitivity (85.4% vs. 85.3%) and specificity (97.55% vs. 88.1%). When maximum expected utility was considered, the optimal threshold of predicted malignancy for the forest was 0.4%, altering sensitivity and specificity to 88.6% and 96.3%, respectively.
Conclusion: Random forests have the potential to improve the accuracy of biopsy recommendations over standard practice. When accounting for the relative consequences of true and false positives and negatives, the threshold for recommending biopsy using a random forest differs from the regular threshold used by radiologists.