DATA ANALYSIS WITH LARGE NUMBERS OF MISSING BODY MASS INDEX VALUES

Monday, October 24, 2011
Grand Ballroom AB (Hyatt Regency Chicago)
Poster Board # 53
(ESP) Applied Health Economics, Services, and Policy Research

Paul Kolm, PhD, Claudine Jurkovitz, MD, MPH, Zugui Zhang, PhD and James Bowen, Christiana Care Health System, Newark, DE

Purpose: Missing data present a challenge for analysis, particularly data from clinical databases and patient registries where missing values are not easily obtained after the initial data collection, or cannot be obtained at all.  Additionally, there is the issue of whether data are at least missing at random (MAR) and the implications for imputing missing values.  In this study, we assessed the relationship among cardiovascular (CV) events, chronic kidney disease (CKD) and obesity.  Specifically, we investigated the impact of missing body mass index (BMI) values in a clinical database of 36,000 patients with nearly 600,000 observations over a 10+ year period.

Method: Two parametric regression approaches, latent growth curve modeling and hierarchical Poisson regression, were used to assess whether a longitudinal decline in kidney function modifies the association between obesity and cardiovascular events.  Because the parametric models excluded patients with missing data, time series of CV events, BMI and glomerular filtration rate (GFR) were also constructed by averaging values in 30-day increments.  The CV event series was then compared with the GFR and BMI series using cross-correlation analysis and Granger causality tests.
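
As a rough illustration of the time series construction described above, the following Python sketch averages irregularly timed observations in 30-day bins and computes a lagged correlation between two binned series.  The function names (bin_means, cross_correlation) and the toy data are hypothetical and are not taken from the study.  The correlation is a simple pairwise Pearson coefficient at a given lag rather than the full cross-correlation function of a statistical package, and Granger causality testing is not shown.

```python
from collections import defaultdict
from math import sqrt


def bin_means(observations, width=30):
    """Average (day, value) pairs within consecutive `width`-day bins.

    Observations whose value is None (missing) are skipped, so every
    recorded value contributes regardless of which patient it came from.
    Returns a dict mapping bin index to the mean value in that bin.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for day, value in observations:
        if value is None:
            continue
        b = day // width
        sums[b] += value
        counts[b] += 1
    return {b: sums[b] / counts[b] for b in sums}


def cross_correlation(x, y, lag=0):
    """Pearson correlation between x[t] and y[t + lag].

    Only bins present in both series are used. Returns None when fewer
    than two pairs overlap or when either series is constant.
    """
    pairs = [(x[t], y[t + lag]) for t in x if (t + lag) in y]
    n = len(pairs)
    if n < 2:
        return None
    mx = sum(a for a, _ in pairs) / n
    my = sum(b for _, b in pairs) / n
    sxy = sum((a - mx) * (b - my) for a, b in pairs)
    sxx = sum((a - mx) ** 2 for a, _ in pairs)
    syy = sum((b - my) ** 2 for _, b in pairs)
    if sxx == 0 or syy == 0:
        return None
    return sxy / sqrt(sxx * syy)


# Toy data (hypothetical): (day since start, value) pooled over patients.
bmi_obs = [(3, 30.0), (17, None), (25, 32.0), (40, 29.0),
           (65, 31.0), (70, 33.0), (95, 35.0)]
cv_obs = [(5, 1), (12, 0), (35, 0), (50, 0),
          (61, 1), (80, 1), (92, 2), (110, 1)]

bmi_series = bin_means(bmi_obs)
cv_series = bin_means(cv_obs)
r0 = cross_correlation(bmi_series, cv_series, lag=0)
print(bmi_series, cv_series, r0)
```

Because the binning pools values across patients, an observation with a missing BMI drops out of the BMI series only; the CV event recorded for the same patient still enters the CV series.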

Result: Over 99% of the CV events were excluded from the growth curve modeling analysis because of missing BMI values.  Because of missing BMI values, Poisson models of CV events as a function of BMI at various levels of GFR included only 17% to 66% of CV events.  The time series analysis showed a significant, positive relationship between CV events and BMI for patients with normal kidney function, but no significant relationship for patients with abnormal kidney function.

Conclusion: BMI is a function of height and weight, so a missing value for either one will limit the data available for analysis.  Results from the parametric models are suspect at best because of the large number of missing values.  Creating and analyzing time series makes use of all available data without resorting to imputing missing values.  However, time series analysis ignores the within-subject nature of these data, which the parametric analyses account for.
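
The dependence of BMI on both height and weight can be made concrete with a small Python sketch.  It uses the standard definition, weight in kilograms divided by the square of height in meters; the record layout and field names (weight_kg, height_m) are assumptions for illustration and do not reflect the structure of the study database.

```python
def bmi(weight_kg, height_m):
    """Return BMI in kg/m^2, or None if either input is missing."""
    if weight_kg is None or height_m is None:
        return None
    return weight_kg / height_m ** 2


# Hypothetical records: only the first has both measurements.
records = [
    {"weight_kg": 81.0, "height_m": 1.80},
    {"weight_kg": 95.0, "height_m": None},
    {"weight_kg": None, "height_m": 1.65},
    {"weight_kg": None, "height_m": None},
]

bmi_values = [bmi(r["weight_kg"], r["height_m"]) for r in records]
usable = sum(v is not None for v in bmi_values)
print(f"{usable} of {len(records)} records have a computable BMI")
```

In this toy example three of the four records contain at least one measurement, yet only one yields a BMI, which is how missingness in either component compounds.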