A NOVEL USE OF NATURAL LANGUAGE PROCESSING TO ANALYZE MEDICAL DEVICE ADVERSE EVENT REPORTS

Monday, October 25, 2010
Sheraton Hall E/F (Sheraton Centre Toronto Hotel)
Michael S. Broder, MD, MSHS, Partnership for Health Analytic Research, LLC, Beverly Hills, CA, Yang Huang, PhD, MSCS, Kaiser Permanente, Pasadena, CA and Sean O'Neill, M.Phil., Northwestern University, Santa Monica, CA

Purpose: To develop and validate a method for analyzing medical device adverse event reports using natural language processing (NLP).

Method: Analysis of the FDA's Manufacturer and User Facility Device Experience Database (MAUDE) has previously required expert reviewers. We developed a method using natural language processing (NLP) to review MAUDE reports and validated it by comparison with a published study that had used human reviewers. The corpus for this study was derived from 326 MAUDE reports, a subset of the 679 reports used in a published analysis of patterns of implantable cardioverter-defibrillator (ICD) failures. The corpus was tokenized following the Penn Treebank convention, and phrases of interest were retrieved using a Conditional Random Field classifier. We developed an ontology for device adverse event reports and annotated a training set using Knowtator, an annotation plug-in for Protégé. After two rounds of manual review and correction, the token-wise accuracy of the NLP system was 92%. To test the system’s clinical utility, we used it to map MAUDE event reports to the specific categories of clinical findings used in the published analysis of ICD failures.
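Token-wise accuracy, the metric reported above, is the fraction of tokens whose predicted label matches the manually annotated label. The following is a minimal sketch of that calculation, not the authors' evaluation code; the function name and the label strings ("O", "B-FINDING", "I-FINDING") are hypothetical and chosen only for illustration.

```python
from typing import Sequence


def token_accuracy(
    gold: Sequence[Sequence[str]],
    pred: Sequence[Sequence[str]],
) -> float:
    """Return the fraction of tokens whose predicted label equals the gold label.

    Each element of `gold` and `pred` is the label sequence for one report;
    the two must be aligned token by token.
    """
    if len(gold) != len(pred):
        raise ValueError("gold and pred must contain the same number of reports")

    total = 0
    correct = 0
    for gold_labels, pred_labels in zip(gold, pred):
        if len(gold_labels) != len(pred_labels):
            raise ValueError("label sequences must have the same length")
        total += len(gold_labels)
        correct += sum(1 for g, p in zip(gold_labels, pred_labels) if g == p)

    if total == 0:
        raise ValueError("no tokens to score")
    return correct / total


# Hypothetical labels for two short reports (illustrative only).
gold_example = [["O", "B-FINDING", "I-FINDING"], ["O"]]
pred_example = [["O", "B-FINDING", "O"], ["O"]]
example_score = token_accuracy(gold_example, pred_example)  # 3 of 4 tokens match
```

Because most tokens in a report typically fall outside any phrase of interest, token-wise accuracy can be high even when some phrases are missed, so it is best read alongside the category-level comparison in the results.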

Result:
Table: MAUDE Reports with Specific Complaints/Observations, by Data Extraction Method

Complaint/Observation                                  Published Study    NLP
                                                       (N=679), %         (n=326), %
Inappropriate Shocks                                   33                 32
Oversensing or noise (without inappropriate shocks)    14                 25
Fixation mechanism malfunction                         5                  0
High impedance                                         33                 34
Fracture                                               35                 26
Insulation defect                                      5                  3
Dislodgement, difficult positioning                    7                  11
Failure to capture, high threshold                     18                 19
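
The agreement between the two extraction methods can be summarized as the absolute difference in percentage points for each category. The sketch below is only arithmetic on the percentages in the table above; the variable and function names are hypothetical, and the calculation is not part of the original analysis.

```python
# Percentages copied from the table: published study (N=679) and NLP (n=326).
published_pct = {
    "Inappropriate Shocks": 33,
    "Oversensing or noise (without inappropriate shocks)": 14,
    "Fixation mechanism malfunction": 5,
    "High impedance": 33,
    "Fracture": 35,
    "Insulation defect": 5,
    "Dislodgement, difficult positioning": 7,
    "Failure to capture, high threshold": 18,
}
nlp_pct = {
    "Inappropriate Shocks": 32,
    "Oversensing or noise (without inappropriate shocks)": 25,
    "Fixation mechanism malfunction": 0,
    "High impedance": 34,
    "Fracture": 26,
    "Insulation defect": 3,
    "Dislodgement, difficult positioning": 11,
    "Failure to capture, high threshold": 19,
}


def abs_differences(a: dict, b: dict) -> dict:
    """Absolute difference in percentage points for each shared category."""
    return {category: abs(a[category] - b[category]) for category in a}


differences = abs_differences(published_pct, nlp_pct)

# List categories from largest to smallest disagreement.
for category, diff in sorted(differences.items(), key=lambda kv: -kv[1]):
    print(f"{category}: {diff} percentage points")
```

On these figures, three categories differ by one percentage point (inappropriate shocks, high impedance, and failure to capture), while oversensing or noise shows the largest gap at 11 points, followed by fracture at 9.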

Conclusion: Improving the assessment of biomedical device risk has become a US national priority. Methods that harness existing data sources may become increasingly useful as the country moves toward electronic data collection and storage. We developed and tested an NLP-based system for assessing adverse event reports. Our initial results demonstrate the feasibility of incorporating artificial intelligence into the ongoing monitoring of such reports. Further refinements are planned to improve accuracy and to allow processing of reports on other cardiac and non-cardiac devices.