Purpose: Risk factors increase the incidence and severity of many chronic diseases. While some risk factors are fixed (e.g., genotypes), exposures to other risk factors (e.g., smoking) may change and are amenable to intervention. Accurate population health estimates require modeling these time-varying risk factors – a difficult task, as few longitudinal data are available. We developed a calibration procedure to infer time-varying exposures, exploiting available cross-sectional data.
Methods: We developed a simple Markov model structure that tracks the duration of continuous risk factor exposure (e.g., years as a smoker) or lack of exposure (e.g., years as a non-smoker). Risk factor exposure increases mortality risk, and exposure duration alters the probability of reducing exposure (e.g., quitting smoking); likewise, duration without exposure alters the probability of initiating exposure (e.g., starting smoking). These probabilities can vary by age and sex. The structure is deliberately simplified to facilitate incorporation into disease models (e.g., diabetes) via feasible stratifications. As an example, we calibrate sex-specific models of smoking to 10 Indian regions defined by geography and urbanicity. Indian data on sex-, age-, and region-specific prevalence and smoking duration are derived from the Global Adult Tobacco Survey. Similarly stratified mortality rates are derived from the Sample Registration System, and age-specific smoking relative risks from the published literature. For each model, Nelder-Mead searches from 200,000 starting locations identify starting and quitting rates that minimize the difference between modeled and observed outcomes.
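To make the structure concrete, the following is a minimal sketch of a duration-tracking Markov cohort model with a squared-error calibration target. It is not the authors' implementation. The exponential rate forms, the duration cap `D_MAX`, the Gompertz-style `base_mort`, the fixed `rel_risk`, and all parameter values are hypothetical choices made for illustration, and the target here is prevalence only, whereas the abstract also calibrates to smoking duration and lets rates vary by age and sex.

```python
import math

D_MAX = 30  # assumed cap: durations of D_MAX years or more are pooled


def make_rates(params):
    """Hypothetical rate forms: each probability changes exponentially with duration."""
    s0, s1, q0, q1 = params

    def clip(p):
        return min(1.0, max(0.0, p))

    def p_start(d):
        return clip(s0 * math.exp(-s1 * d))  # d = years without exposure

    def p_quit(d):
        return clip(q0 * math.exp(-q1 * d))  # d = years of continuous exposure

    return p_start, p_quit


def base_mort(age):
    """Illustrative Gompertz-style annual mortality (not fitted to any data)."""
    return 0.0005 * math.exp(0.08 * (age - 10))


def run_cohort(p_start, p_quit, ages, rel_risk=2.0):
    """Return {age: smoking prevalence among survivors} for one cohort."""
    non = [0.0] * (D_MAX + 1)  # non[d]: alive non-smokers, d years unexposed
    smk = [0.0] * (D_MAX + 1)  # smk[d]: alive smokers, d years exposed
    non[0] = 1.0  # everyone starts unexposed
    prevalence = {}
    for age in ages:
        alive = sum(non) + sum(smk)
        prevalence[age] = sum(smk) / alive if alive > 0 else 0.0
        m_non = base_mort(age)
        m_smk = min(1.0, rel_risk * m_non)  # exposure raises mortality
        new_non = [0.0] * (D_MAX + 1)
        new_smk = [0.0] * (D_MAX + 1)
        for d in range(D_MAX + 1):
            nxt = min(d + 1, D_MAX)  # duration advances one year, up to the cap
            surv_non = non[d] * (1.0 - m_non)
            starters = surv_non * p_start(d)
            new_smk[0] += starters  # starting resets exposure duration
            new_non[nxt] += surv_non - starters
            surv_smk = smk[d] * (1.0 - m_smk)
            quitters = surv_smk * p_quit(d)
            new_non[0] += quitters  # quitting resets unexposed duration
            new_smk[nxt] += surv_smk - quitters
        non, smk = new_non, new_smk
    return prevalence


def loss(params, observed, ages):
    """Sum of squared differences between modeled and observed prevalence."""
    p_start, p_quit = make_rates(params)
    modeled = run_cohort(p_start, p_quit, ages)
    return sum((modeled[a] - observed[a]) ** 2 for a in ages)


# Synthetic "observed" prevalence generated from known parameters.
AGES = range(10, 81)
TRUE_PARAMS = (0.05, 0.10, 0.02, 0.05)
OBSERVED = run_cohort(*make_rates(TRUE_PARAMS), AGES)
```

A loss of this shape could be handed to a Nelder-Mead routine, for example `scipy.optimize.minimize` with `method="Nelder-Mead"`, and restarted from many initial parameter vectors to reduce the chance of settling in a local minimum.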
Results: Calibration yields close matches between modeled and observed outcomes for men and women in all regions. Generally, the probability of starting to smoke rises and then falls with age (peaking in the teens/early 20s for men and the early/mid 20s for women), while the probability of quitting smoking falls with age. Population life expectancy losses were 3 to 5 years for men, with greater losses in higher-prevalence regions. For women, whose smoking prevalence is 10 times lower, losses were smaller. Accounting for differential starting and quitting rates based on exposure duration is potentially important, as models without such variation produced greater estimates of life expectancy losses due to smoking.
Conclusions: Calibrating changes in rates of exposure for time-varying risk factors is feasible using widely available, population-level, cross-sectional data. Incorporating exposure-change rates can improve modeled estimates of incidence and severity of related chronic diseases.