Large-scale electronic health record research introduces biases compared to traditional manually curated retrospective research. We used data from a community-acquired pneumonia study for which we had a gold standard to illustrate such biases. The challenges include data inaccuracy, incompleteness, and complexity, and they can produce distorted results. We found that a naïve approach approximated the gold standard, but errors on a minority of cases shifted mortality substantially. Manual review revealed errors in both selecting and characterizing the cohort, and narrowing the cohort improved the result. Nevertheless, a significantly narrowed cohort might contain its own biases that would be difficult to estimate.

Original language: English
Pages (from-to): 48-52
Number of pages: 5
Journal: Journal of Biomedical Discovery and Collaboration
Issue number: 1
State: Published - 2011


