Sunday, October 21, 2007
P1-7

AN INFORMATION VISUALIZATION APPROACH TO CLASSIFICATION AND ASSESSMENT OF DIABETES RISK

Christopher A. Harle, MS, Rema Padman, PhD, and Daniel B. Neill, PhD. Carnegie Mellon University, Pittsburgh, PA

Purpose: Development and evaluation of a novel approach that communicates patients' diabetes risk levels in the context of many risk factors by visualizing a large, multidimensional patient data set.

Methods: A general method was developed that allows a clinician to (i) select a set of relevant risk factor variables for exploratory analysis of disease risk in a patient population; (ii) apply dimensionality reduction techniques to extract informative two-dimensional projections of patient observations based on the selected variables; (iii) draw visual classification boundaries for risk group stratification; and (iv) plot “attracting anchors” that depict the relationships among the risk factor variables and the relevance of each. The anchors' size and location convey this information through an attraction metaphor. Two case studies were used to evaluate the proposed method: the visualization of type 2 diabetes onset risk and the visualization of heart attack risk in adults with type 2 diabetes. Visual models were instantiated using a database from the American Diabetes Association's Diabetes PHD application, which contained individual-level health information and corresponding risk predictions made by the trial-validated Archimedes model. Principal component and linear discriminant projections of the data were augmented with two- and three-class decision boundaries and “attracting anchors” for each selected risk factor. Models were evaluated in terms of their accuracy in stratifying patients according to risk and their ability to identify outlying individuals or clusters of patients.
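The four steps above can be illustrated with a minimal sketch, under several assumptions that are not taken from the abstract. The data are synthetic, the variable names (`age`, `bmi`, `glucose`, `systolic_bp`) are hypothetical, and the anchor placement simply reuses each variable's principal component loading, as in a biplot, as a stand-in for the “attracting anchors”. The boundary is a perpendicular bisector between class centroids, chosen for brevity; it is not necessarily the boundary used in the study.

```python
import numpy as np

def pca_projection(X, n_components=2):
    """Standardize each column, then project onto the top principal components."""
    mu = X.mean(axis=0)
    sigma = X.std(axis=0)
    sigma[sigma == 0] = 1.0              # guard against constant columns
    Z = (X - mu) / sigma
    # Rows of Vt are principal directions, ordered by decreasing variance.
    _, _, Vt = np.linalg.svd(Z, full_matrices=False)
    W = Vt[:n_components].T              # shape (n_features, n_components)
    return Z @ W, W

def anchors_from_loadings(W, names):
    """Assumed anchor rule: position = the variable's loading vector,
    size = the length of that vector (how strongly it shapes the plane)."""
    sizes = np.linalg.norm(W, axis=1)
    return {name: {"xy": W[i], "size": sizes[i]} for i, name in enumerate(names)}

def centroid_boundary(P, y):
    """Two-class linear boundary in the projected plane: the perpendicular
    bisector of the segment joining the class centroids.
    Points with P @ w + b > 0 fall on the class-1 side."""
    c0 = P[y == 0].mean(axis=0)
    c1 = P[y == 1].mean(axis=0)
    w = c1 - c0
    b = -(w @ (c0 + c1)) / 2.0
    return w, b

# Step (i): hypothetical risk factors; synthetic low-risk (0) and high-risk (1) groups.
rng = np.random.default_rng(0)
names = ["age", "bmi", "glucose", "systolic_bp"]
n = 200
y = np.repeat([0, 1], n // 2)
X = rng.normal(size=(n, len(names)))
X[y == 1] += np.array([1.0, 2.0, 3.0, 0.5])   # shift the high-risk group

P, W = pca_projection(X)                  # step (ii): two-dimensional projection
w, b = centroid_boundary(P, y)            # step (iii): classification boundary
anchors = anchors_from_loadings(W, names) # step (iv): one anchor per variable

pred = (P @ w + b > 0).astype(int)
accuracy = (pred == y).mean()
print(f"stratification accuracy on the synthetic data: {accuracy:.2f}")
```

Plotting `P` colored by `pred`, the line defined by `w` and `b`, and one marker per entry of `anchors` would give a rough analogue of the visual model described here. A linear discriminant projection could replace `pca_projection` without changing the other two functions.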

Results: The two-dimensional linear models approximated the Archimedes risk predictions and classified patients as well as or better than other common machine learning methods that were applied to the original high-dimensional data. The models also visually provided an overview of risk levels in the population, estimates of confidence in each patient's risk grouping, insight into relationships between risk factor variables, and identification of outlying patients. The general framework is based on computationally efficient and well-understood statistical methods, and its parameters can be modified to meet the needs of individual clinicians.

Conclusions: The proposed methods provide accurate and interpretable visual stratifications of many patients, based on high-dimensional data, according to their risk of diabetes or heart attack. The techniques may be embedded in information systems to provide interactive visual analysis tools that complement less efficient or less transparent analytic methods in supporting information processing and decision making related to diabetes prevention and management.