Abstract

Narrative text reports represent a significant source of clinical data. However, the information stored in these reports is inaccessible to many automated decision support systems. Data mining techniques can assist in extracting information from narrative data. Multiple classification methods, such as rule generation, decision trees, Bayesian classifiers, and information retrieval, were used to classify a set of 200 chest X-ray reports according to the 6 clinical conditions they indicated. A general-purpose natural language processor was used to convert the narrative text into a coded form that could be used by the classification algorithms. Significant differences in performance were found between algorithms. The best-performing algorithm applied to the processor output was significantly better than information retrieval applied to raw text. Predictor variables from the coded processor output were limited to avoid overfitting. Methods that limited the predictor variables using domain knowledge performed significantly better than those that limited them using conditional probabilities of the variables in the training set. Algorithm performance was also shown to depend on training set size.

Original language: English
Pages (from-to): 455-459
Number of pages: 5
Journal: Proceedings / AMIA ... Annual Symposium. AMIA Symposium
State: Published - 1999
