46 AUTOMATING CLINICAL DATA EXTRACTION TO SUPPORT COMPARATIVE EFFECTIVENESS RESEARCH

Thursday, October 18, 2012
The Atrium (Hyatt Regency)
Poster Board # 46
Health Services and Policy Research (HSP)

Erik G. Van Eaton, MD, Meliha Yetisgen-Yildiz, PhD, Allison D. Rhodes, MS, Daniel Capurro, MD, Emily Beth Devine, PhD, PharmD, MBA, Rafael Alfonso, MD, David R. Flum, MD, MPH and Peter Tarczy-Hornoch, MD, University of Washington, Seattle, WA

Purpose: The Surgical Care and Outcomes Assessment Program (SCOAP) is a clinician-led collaborative for data sharing and benchmarking to improve surgical quality across Washington State. SCOAP currently relies on manual data abstraction, which limits scalability. With funding from an Agency for Healthcare Research and Quality Enhanced Registry grant, the SCOAP automation project is electronically linking clinical databases across the state to expand the registry and conduct comparative effectiveness research (CER) through the SCOAP CER Translation Network (“SCOAP CERTAIN”). Here we report the rate of site recruitment and the feasibility of data automation.

Methods: This project installs the Amalga Unified Intelligence System™ (Microsoft Corporation, Redmond, Washington) at participating SCOAP sites and assembles the appropriate interfaces. Fourteen SCOAP hospitals representing 6 health systems were candidate sites for automation. The project was described first to the Chief Information Officer and, in a later meeting, to site Information Technology (IT) analysts. A panel of experts reviewed a spreadsheet of aggregated SCOAP variables cross-referenced to candidate Electronic Health Record (EHR) sources. Each variable was labeled as: (a) structured electronic data at all sites (e.g., laboratory values); (b) structured electronic data at some sites (e.g., lowest intraoperative body temperature); (c) machine-readable text (e.g., discharge diagnosis); or (d) not assessed/not feasible.

Results: After 18 months of negotiations and technical work, automation is underway at 3 of the 14 candidate sites. The primary barrier to the project is direct competition for IT resources from the Medicare and Medicaid EHR Incentive Programs (“Meaningful Use”). To comply, 6 of the candidate sites will entirely replace their EHR by July 3, 2012, which leaves no resources available for creating data feeds. Four data feeds yield the most SCOAP variables from the EHR: the registration system, laboratory reporting, medication orders, and transcription (e.g., discharge summaries). Structured electronic data at all sites represent 54 (7.3%) variables. Structured electronic data at some sites represent 129 (17.4%) variables. Machine-readable text represents 189 (25.5%) variables, of which 85 (11.5% of all variables) are current targets for extraction with Natural Language Processing. The remaining 368 (49.7%) were not assessed or are not feasible to automate.
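
The percentages reported above can be cross-checked with a short calculation. The sketch below is illustrative only: the counts come from the Results, but the total of 740 variables is not stated in the abstract and is assumed here to be the sum of the four categories. The names `counts`, `NLP_TARGETS`, and `pct` are hypothetical.

```python
# Counts of SCOAP variables per automation category, as reported in the Results.
counts = {
    "structured, all sites": 54,
    "structured, some sites": 129,
    "machine-readable text": 189,
    "not assessed / not feasible": 368,
}

# Subset of the machine-readable text variables targeted for NLP extraction.
NLP_TARGETS = 85

# Assumption: the four categories are exhaustive, so their sum is the total.
total = sum(counts.values())


def pct(n, denominator):
    """Return n as a percentage of denominator, rounded to one decimal."""
    return round(100 * n / denominator, 1)


shares = {label: pct(n, total) for label, n in counts.items()}
nlp_share = pct(NLP_TARGETS, total)

if __name__ == "__main__":
    for label, n in counts.items():
        print(f"{label:<30} {n:>4} {shares[label]:>5.1f}%")
    print(f"{'NLP targets (of all variables)':<30} {NLP_TARGETS:>4} {nlp_share:>5.1f}%")
    print(f"{'total':<30} {total:>4}")
```

Under that assumption, each computed share matches the reported figure, and the 11.5% for Natural Language Processing targets is a share of all variables rather than of the machine-readable text category.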

Conclusions: The SCOAP CERTAIN automation project illustrates the complexities involved in automating data flow among linked clinical databases to conduct CER. Institutional drive to participate and the availability of interface analysts are greater rate-limiting steps than technical challenges.