| Category Reference for Presentations | | | |
|---|---|---|---|
| AHE | Applied Health Economics | DEC | Decision Psychology and Shared Decision Making |
| HSP | Health Services, and Policy Research | MET | Quantitative Methods and Theoretical Developments |

* Candidate for the Lee B. Lusted Student Prize Competition

**Purpose:** This paper proposes a novel modelling strategy for the analysis of EQ-5D responses that recognises both the likely dependence between the five dimensions of the questionnaire at the patient level and the natural ordering of the severity levels within each dimension. We also address the key problem of choosing an appropriate summary measure of agreement between predicted and observed data when these models are used to develop mapping algorithms between patient-reported outcome measures (PROMs).

**Methods:** Using data from the Health Survey for England (HSE) and the National Health Measurement Study (NHMS), we develop a multivariate ordered probit (MVOP) model for the analysis of EQ-5D responses and compare its performance against other approaches proposed in the literature, such as response mapping (e.g., multinomial logit, ML) and univariate regression models (applied directly to the EQ-5D index score). Goodness of fit is assessed using the *Deviance Information Criterion (DIC)*, while in-sample and out-of-sample predictive abilities (crucial when developing mapping algorithms) are assessed using Bayesian *proper scoring rules*. Unlike measures based on the predicted mean, such as the (root) mean squared error, scoring rules exploit the whole posterior predictive distribution, thus reflecting both central tendency and uncertainty in the prediction. The analysis is implemented within a Bayesian framework.

**Results:** The MVOP fits the two independent datasets better (DIC: 15,145 for the NHMS and 45,550 for the HSE) than the ML (DIC: 15,703 for the NHMS and 47,140 for the HSE) and the independent ordered probit models for each dimension (DIC: 15,720 for the NHMS and 45,550 for the HSE). Assessment of the posterior predictive distributions shows that the MVOP has better coverage of the central tendency measure (in-sample validation) and better out-of-sample predictive ability (0.531 for the MVOP vs 0.513 for the independent univariate ordered probit vs 0.481 for the ML).

**Conclusions:** Explicit modelling of both the correlation between the responses on the five dimensions of the EQ-5D and the natural ordering of the severity levels within each dimension yields more accurate predictions. Modelling at the response level, rather than at the index-score level, facilitates a more generalisable assessment of EQ-5D responses that is not confounded by the valuation set used in each country.
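The contrast drawn in the Methods between mean-based error measures and proper scoring rules can be sketched with a toy example. The abstract does not say which scoring rules were used, so the logarithmic score and the ranked probability score below are illustrative choices; the three-level severity scale and both predictive distributions are hypothetical.

```python
import math

def log_score(pred_probs, observed_level):
    """Log of the probability the forecast gave to the observed level (higher is better)."""
    return math.log(pred_probs[observed_level])

def ranked_probability_score(pred_probs, observed_level):
    """Squared distance between cumulative forecast and cumulative outcome (lower is better).

    Suited to ordered categories because it penalises probability mass
    placed far from the observed level more than mass placed nearby.
    """
    cum_pred = 0.0
    score = 0.0
    for level, p in enumerate(pred_probs):
        cum_pred += p
        cum_obs = 1.0 if level >= observed_level else 0.0
        score += (cum_pred - cum_obs) ** 2
    return score

# Hypothetical predictive distributions over severity levels 0, 1, 2.
# Both have the same predicted mean level (1.0), so a squared error
# computed on the predicted mean cannot tell them apart.
sharp = [0.1, 0.8, 0.1]
vague = [0.4, 0.2, 0.4]
observed = 1

for name, dist in [("sharp", sharp), ("vague", vague)]:
    print(name, log_score(dist, observed), ranked_probability_score(dist, observed))
```

Both rules favour the sharper forecast here, whereas an error measure based only on the predicted mean would rate the two forecasts as equal.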

**Purpose:** Cardiac implantable electronic device (CIED)
leads fail stochastically, requiring the immediate implantation of one or more
new leads. Because the total number of concurrently implanted leads (both
functioning and failed) is subject to a maximum (i.e., five leads according to current
guidelines), whenever a lead fails, it may be beneficial to extract this lead
and/or any previously abandoned leads. Extraction, however, carries small but
real life-threatening risks that increase with lead dwell time. Therefore, a
tradeoff exists between maintaining space for new leads and avoiding risky
extractions. Furthermore, surgical lead procedures involve a risk of infection.
If an infection occurs, all implanted leads must be extracted. Hence, choosing
to leave leads in place at the time of failures may result in risky, mandatory
extractions. The purpose of this study is to determine a patient-specific
extraction policy to maximize the expected lifetime of a single chamber
pacemaker patient using a Markov decision process (MDP) model.

**Method:** We develop an MDP model to dynamically make
extraction decisions at the time of lead failures as a function of patient and
all lead ages. We also simulate this process to obtain prediction intervals on
measures of interest including the expected patient lifetime and the likelihood
of CIED-related deaths (as opposed to natural causes). Finally, we conduct
comparisons to three heuristics commonly used in practice.

**Results:** Under the optimal policy, the extraction decision
for each lead only depends on its age, patient age, its rank among the lead
ages and the total number of implanted leads, i.e., the decision does not
depend on the exact ages of all implanted leads. Figure 1 illustrates the optimal
lead maintenance policy for a specific, single chamber pacemaker patient. Compared
to the heuristic policies, the optimal policy significantly decreases CIED-related
deaths and increases the expected lifetime. For example, under the policy in Figure 1,
a 60-year-old patient with failed leads of ages 20, 17, 12 and 2 gains up to
1.5 years in expected lifetime and sees the likelihood of CIED-related death
fall by up to 7%.

**Conclusion:** Cardiac leads are often referred to as
“the weakest link” in implantable cardiac device treatment. Despite its
importance, lead maintenance varies widely from practice to practice. We
develop an approach that helps clinicians make patient-specific lead
extraction/abandonment decisions optimally.
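The extract-or-abandon tradeoff described above can be sketched as a very small finite-horizon MDP solved by backward induction. This is a toy under loudly stated assumptions, not the authors' model: the state is only the number of abandoned leads (the abstract's state also includes patient age and all lead ages), infection is ignored, and every constant and the risk function are hypothetical.

```python
MAX_ABANDONED = 3      # hypothetical cap on abandoned leads left in place
N_FAILURES = 8         # hypothetical number of future lead failures considered
YEARS_PER_LEAD = 6.0   # hypothetical years lived on each new functioning lead

def extraction_death_risk(n_abandoned):
    """Hypothetical risk of death when extracting the failed lead plus all abandoned leads."""
    return 0.005 * (n_abandoned + 1) ** 2

def solve_lead_mdp():
    """Backward induction over lead failures.

    Returns (values, policy), each indexed [failure index][number of abandoned leads].
    """
    value_next = [0.0] * (MAX_ABANDONED + 1)  # nothing is gained after the last failure
    values, policy = [], []
    for _ in range(N_FAILURES):
        value_now, action_now = [], []
        for n in range(MAX_ABANDONED + 1):
            # Extract everything: risk death now, then continue with no abandoned leads.
            survive = 1.0 - extraction_death_risk(n)
            q_extract = survive * (YEARS_PER_LEAD + value_next[0])
            # Abandon the failed lead: no risk now, but one fewer free slot later.
            if n < MAX_ABANDONED:
                q_abandon = YEARS_PER_LEAD + value_next[n + 1]
            else:
                q_abandon = float("-inf")  # no room left, extraction is forced
            if q_extract >= q_abandon:
                value_now.append(q_extract)
                action_now.append("extract")
            else:
                value_now.append(q_abandon)
                action_now.append("abandon")
        values.insert(0, value_now)
        policy.insert(0, action_now)
        value_next = value_now
    return values, policy

values, policy = solve_lead_mdp()
for failure_index, actions in enumerate(policy):
    print(failure_index, actions)
```

Even in this reduced form, the policy has to weigh a smaller extraction risk now against a larger forced extraction once the cap is reached, which is the tension the abstract describes.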

**Purpose:** To illustrate how Expected Value of Sample Information (EVSI) can be used to assist the prioritisation of future randomised controlled trials (RCTs) when there are multiple competing health technologies, in particular the decision as to how many arms and which technologies to include, as well as the sample size on each arm.

**Methods:** EVSI measures the expected net health gains from conducting a new research study given a proposed study design. EVSI relies on a synthesis of the current evidence available on treatment efficacy, and a cost-effectiveness model. Network meta-analysis (NMA) pools evidence on the relative efficacy of multiple competing health technologies that have been compared in RCTs forming a connected network of comparisons. The results obtained from NMA provide a coherent basis on which to make comparisons across the entire set of treatments, and NMA is now commonly used to inform decision models to identify the most cost-effective treatment.

We describe methods to evaluate EVSI when the efficacy outcome is binary and the net benefit function is linear on the absolute probability scale. We distinguish between absolute effects (used in the decision model) and relative effects (which the RCT provides information on). The methods allow for heterogeneity in the existing NMA evidence, which forms a hierarchical prior for the result from the new study. We view this hierarchical prior structure as data so that we can obtain a posterior, given new data, in closed form. We use a Taylor series approximation to obtain the updated expectation of the net benefit given new data, without needing an inner simulation step.

**Results:** We illustrate the approach using a network meta-analysis and cost-effectiveness analysis of 6 competing treatments for bipolar disorder, identifying the optimal number of arms and sample size per arm to include in a new study to inform this decision.

**Conclusions:** EVSI can be a valuable tool to assist in the prioritisation and optimal design of new research studies when there are multiple competing technologies.
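The role of a linear net benefit function in avoiding an inner simulation can be illustrated with a much simpler setting than the one in the abstract. The sketch below assumes a single new treatment with a Beta prior on its response probability, a comparator with known net benefit, and a one-arm study of `n` patients; it does not use NMA, heterogeneity, or the Taylor approximation, and all names and numbers are hypothetical. Because net benefit is linear in the probability, its posterior expectation depends only on the posterior mean, so EVSI can be computed by summing over possible study results.

```python
import math

def beta_binomial_pmf(x, n, a, b):
    """Prior predictive probability of x responders out of n under a Beta(a, b) prior."""
    log_choose = math.lgamma(n + 1) - math.lgamma(x + 1) - math.lgamma(n - x + 1)
    log_beta_post = math.lgamma(a + x) + math.lgamma(b + n - x) - math.lgamma(a + b + n)
    log_beta_prior = math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)
    return math.exp(log_choose + log_beta_post - log_beta_prior)

def evsi(n, a, b, value_per_response, cost_new, nb_comparator):
    """EVSI of observing n patients on the new treatment (hypothetical toy model)."""
    def net_benefit(p):
        # Linear in p, so E[net_benefit(p) | data] = net_benefit(E[p | data]).
        return value_per_response * p - cost_new

    # Value of deciding now, on the prior mean.
    value_now = max(net_benefit(a / (a + b)), nb_comparator)

    # Expected value of deciding after the study, averaged over possible results.
    value_with_data = 0.0
    for x in range(n + 1):
        posterior_mean = (a + x) / (a + b + n)
        best_after = max(net_benefit(posterior_mean), nb_comparator)
        value_with_data += beta_binomial_pmf(x, n, a, b) * best_after
    return value_with_data - value_now

# Hypothetical inputs: Beta(4, 6) prior, 20,000 per response, new treatment costs 7,500,
# comparator net benefit 500.
for n in (0, 20, 100):
    print(n, evsi(n, 4.0, 6.0, 20000.0, 7500.0, 500.0))
```

Comparing EVSI across candidate sample sizes, net of study costs, is the kind of calculation that would then inform the choice of design.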

**Purpose:** Existing methods for meta-analysis of diagnostic test accuracy focus primarily on a single index test rather than comparing two or more tests that have been applied to the same patients in paired designs. We develop novel methods for the joint meta-analysis of studies of diagnostic accuracy that compare two or more tests on the same participants.

**Method:** We extend existing bivariate meta-analysis methods to simultaneously synthesize multiple index tests. The proposed methods respect the natural grouping of data by studies, account for the within-study correlation (induced because tests are applied to the same participants) between the tests' true-positive rates (TPRs) and between their false-positive rates (FPRs), and allow for between-study correlations between TPRs and FPRs (such as those induced by threshold effects). We focus mainly on algorithms in the Bayesian setting, using discrete (binomial and multinomial) likelihoods. We use as an example a meta-analysis of 11 studies on the screening accuracy of detecting Down syndrome in liveborn infants using two tests: shortened humerus (arm bone) and shortened femur (thigh bone). Secondary analyses included an additional 19 studies on shortened femur only.

**Result:** In the application, separate and joint meta-analyses yielded very similar estimates. For example, in models using the discrete likelihood, the summary TPR for a shortened humerus was 35.3% (95% credible interval [CrI]: 26.9 to 41.8%) with the novel method, and 37.9% (27.7 to 50.3%) when shortened humerus was analyzed on its own. The corresponding numbers for the summary FPR were 4.9% (2.8 to 7.5%) and 4.8% (3.0 to 7.4%).

However, when calculating comparative accuracy, joint meta-analyses resulted in shorter credible intervals compared with separate meta-analyses for each test. In analyses using the discrete likelihood, the difference in the summary TPRs was 0.0% (-8.9, 9.5%; TPR higher for shortened humerus) with the novel method versus 2.6% (-14.7, 19.8%) with separate meta-analyses. The standard deviation of the posterior distribution of the difference in TPR with joint meta-analyses is half of that with separate meta-analyses.

**Conclusion:** The joint meta-analysis of multiple tests is feasible. It may be preferable to separate analyses for estimating measures of comparative accuracy of diagnostic tests, and is therefore of primary interest in parameterizing models that compare diagnostic strategies. Simulation and empirical analyses are needed to better define the role of the proposed methodology.
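One way to see why a joint analysis can give shorter intervals for a difference is the variance of a difference of two correlated quantities. The following is a simplified illustration, not the model in the abstract: the symbols $\hat\theta_1$ and $\hat\theta_2$ (summary TPRs of the two tests), their posterior standard deviations $\sigma_1$ and $\sigma_2$, and their correlation $\rho$ are assumptions introduced here.

```latex
\operatorname{Var}(\hat\theta_1 - \hat\theta_2)
  = \sigma_1^2 + \sigma_2^2 - 2\rho\,\sigma_1\sigma_2 ,
\qquad
\text{and if } \sigma_1 = \sigma_2 = \sigma:\quad
\frac{\operatorname{SD}_{\text{joint}}(\hat\theta_1 - \hat\theta_2)}
     {\operatorname{SD}_{\text{separate}}(\hat\theta_1 - \hat\theta_2)}
  = \frac{\sigma\sqrt{2(1-\rho)}}{\sigma\sqrt{2}}
  = \sqrt{1-\rho} .
```

Separate analyses implicitly set $\rho = 0$, so any positive correlation captured by a joint model shrinks the standard deviation of the difference. Under these simplifying assumptions a ratio of one half would correspond to $\rho = 0.75$; the abstract itself reports no such correlation.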

**Purpose:** To optimize recommendations for biopsy after
mammography using random forests and maximum expected utility.

**Methods:** We used a dataset of 62,219 mammographic findings
matched with cancer registry data to construct a random forest estimating the
probability that each finding is malignant (positive). Random forests consist of an ensemble of
classification trees constructed using randomly resampled data and randomly
selected subsets of predictor variables and tuned to improve out-of-sample
prediction. We used patient demographic
risk factors, radiologist-observed standardized descriptors using the Breast
Imaging-Reporting and Data System (BI-RADS) lexicon, radiologist subjective
opinion (BI-RADS category 0-5, indicating increasing likelihood of malignancy)
and the eventual outcomes (benign/malignant) of the finding to recursively
partition the data into groups with different probabilities of malignancy. We applied previously reported estimates of
utilities associated with false positives, true positives and false negatives
(relative to true negative) to calculate expected utility associated with
different thresholds and used the threshold that maximizes expected utility to
determine the “optimal” random forest.

**Results:** ROC curves were constructed from the BI-RADS
categories assigned by the radiologists and the predicted malignancy probabilities
of the random forest (Figure 1). The radiologists' operating point is usually
taken to be BI-RADS category 3, corresponding to a threshold above which
biopsy would be recommended (approximately 2% likelihood of malignancy). The
random forest improved overall AUC (0.948 vs. 0.935). At the 2% classification
threshold, the forest also improved sensitivity (85.4% vs. 85.3%) and
specificity (97.55% vs. 88.1%). When considering maximum expected utility, the
optimal threshold of predicted malignancy by the forest was 0.4%, changing sensitivity
and specificity to 88.6% and 96.3%, respectively.

**Conclusion:** Random forests have the potential to improve the
accuracy of biopsy recommendations over standard practice. When accounting for the relative consequences
of true and false positives and negatives, the threshold for recommending
biopsy using a random forest differs from the regular threshold used by radiologists.
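The step from utilities to a biopsy threshold can be sketched as follows. The utility values and predicted probabilities are hypothetical and are not the previously reported estimates used in the study; the sketch also assumes the predicted probabilities are well calibrated. Utilities are expressed relative to a true negative, which is set to zero as in the Methods.

```python
def biopsy_threshold(u_tp, u_fp, u_fn):
    """Malignancy probability above which biopsy has the higher expected utility.

    Utilities are relative to a true negative (0). Biopsy is preferred when
    p*u_tp + (1-p)*u_fp >= p*u_fn, which rearranges to the expression below.
    Assumes u_fp < 0 and u_tp > u_fn so the threshold lies between 0 and 1.
    """
    return -u_fp / (u_tp - u_fn - u_fp)

def mean_expected_utility(threshold, probs, u_tp, u_fp, u_fn):
    """Average expected utility of recommending biopsy whenever p >= threshold."""
    total = 0.0
    for p in probs:
        if p >= threshold:
            total += p * u_tp + (1 - p) * u_fp   # biopsy: true or false positive
        else:
            total += p * u_fn                    # no biopsy: false negative or true negative (0)
    return total / len(probs)

# Hypothetical utilities (relative to a true negative) and predicted probabilities.
U_TP, U_FP, U_FN = -1.0, -1.0, -20.0
predicted = [0.001, 0.01, 0.03, 0.08, 0.20, 0.60]

best = biopsy_threshold(U_TP, U_FP, U_FN)
print("threshold:", best)
for t in (0.02, best, 0.5):
    print(t, mean_expected_utility(t, predicted, U_TP, U_FP, U_FN))
```

The abstract describes evaluating expected utility across thresholds on the data; the closed-form threshold here is the analytic counterpart under the calibration assumption.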

**Purpose:** Modelers lack a simple tool to examine decision sensitivity (i.e., the change in the probability of a strategy being optimal due to parameter uncertainty). We propose multinomial logistic regression (MNR) metamodeling to reveal decision sensitivity.

**Methods:** MNR is useful in analyses where the dependent variable is categorical and not ordered. In this study, we apply MNR in a novel way to analyze the probabilistic sensitivity analysis (PSA) from a decision model in order to reveal decision sensitivity. We demonstrate our approach with a previously published decision model for treating a suspected case of herpes simplex encephalopathy. The model compares three strategies: treat everyone, biopsy, and do not treat or biopsy. We performed 10,000 PSA scenarios. For the MNR, we treated the model's input parameter values as independent variables and the optimal strategy in each iteration as the dependent variable. In this capacity the MNR is a second (meta) model. Because the regression coefficients are difficult to interpret, we report the marginal effects (ME) as a direct measure of decision sensitivity. The MEs measure the change in the probability of each strategy being optimal due to a one-unit change in each parameter. Furthermore, we developed a new score, the sum of absolute marginal effects (SAME), to combine the ME of a parameter on all the strategies, and compared our results to the expected value of partial perfect information (EVPPI).

**Results:** The probability of severe sequelae following biopsy was associated with the highest decision sensitivity. The ME of this parameter on biopsy was -0.28, indicating that the probability of biopsy being optimal decreases by 0.28 if the value of this parameter is increased by one standard deviation from its mean. Similarly, all the model parameters were ranked in importance by their ME and SAME scores. In addition, the SAME scores were highly correlated with the EVPPI (correlation coefficient = 0.97) (see Figure).

**Conclusion:** Regression analysis can be used to evaluate the impact of decision model parameters and is highly correlated with EVPPI results.
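The marginal effects and the SAME score described in the Methods can be sketched from the standard multinomial logit derivative. The coefficients below are hypothetical, not estimates from the study; the sketch evaluates the effects at a single point (all standardized inputs at zero), whereas the abstract does not state how its MEs were evaluated. Fitting the regression to PSA output is omitted.

```python
import math

def softmax(scores):
    """Convert linear predictors into strategy probabilities."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def marginal_effects(coefs, intercepts, x):
    """d P(strategy j optimal) / d x_k for a multinomial logit, evaluated at x.

    Uses dP_j/dx_k = P_j * (beta_jk - sum_m P_m * beta_mk).
    Returns a list indexed [strategy][parameter].
    """
    scores = [b0 + sum(b * xi for b, xi in zip(row, x))
              for b0, row in zip(intercepts, coefs)]
    probs = softmax(scores)
    n_params = len(x)
    avg_coef = [sum(p * row[k] for p, row in zip(probs, coefs)) for k in range(n_params)]
    return [[p * (row[k] - avg_coef[k]) for k in range(n_params)]
            for p, row in zip(probs, coefs)]

def same_scores(effects):
    """Sum of absolute marginal effects across strategies, one score per parameter."""
    return [sum(abs(row[k]) for row in effects) for k in range(len(effects[0]))]

# Hypothetical metamodel: 3 strategies x 2 standardized parameters,
# with the first strategy as the reference category (all zeros).
intercepts = [0.0, 0.4, -0.3]
coefs = [[0.0, 0.0],
         [-1.2, 0.3],
         [0.5, 0.1]]

effects = marginal_effects(coefs, intercepts, [0.0, 0.0])
print(effects)
print(same_scores(effects))
```

Because the strategy probabilities always sum to one, the marginal effects of a parameter sum to zero across strategies, which is why an absolute-value summary such as SAME is needed to rank parameters.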