Reference standards, judges, and comparison subjects: Roles for experts in evaluating system performance

George Hripcsak, Adam Wilcox

Research output: Contribution to journal › Review article › peer-review

56 Scopus citations

Abstract

Medical informatics systems are often designed to perform at the level of human experts. Evaluation of the performance of these systems is often constrained by lack of reference standards, either because the appropriate response is not known or because no simple appropriate response exists. Even when performance can be assessed, it is not always clear whether the performance is sufficient or reasonable. These challenges can be addressed if an evaluator enlists the help of clinical domain experts. 1) The experts can carry out the same tasks as the system, and then their responses can be combined to generate a reference standard. 2) The experts can judge the appropriateness of system output directly. 3) The experts can serve as comparison subjects with which the system can be compared. These are separate roles that have different implications for study design, metrics, and issues of reliability and validity. Diagrams help delineate the roles of experts in complex study designs.

Original language: English
Pages (from-to): 1-15
Number of pages: 15
Journal: Journal of the American Medical Informatics Association
Volume: 9
Issue number: 1
State: Published - 2002
