TY - JOUR
T1 - Inter-Rater Reliability and Impact of Disagreements on Acute Physiology and Chronic Health Evaluation IV Mortality Predictions
AU - Simkins, Michelle
AU - Iqbal, Ayesha
AU - Gronemeyer, Audrey
AU - Konzen, Lisa
AU - White, Jason
AU - Koenig, Michael
AU - Palmer, Chris
AU - Kerby, Paul
AU - Buckman, Sara
AU - Despotovic, Vladimir
AU - Hoehner, Christine
AU - Boyle, Walter
N1 - Publisher Copyright:
© 2019 Authors. All rights reserved.
PY - 2019/10/17
Y1 - 2019/10/17
N2 - Objectives: Acute Physiology and Chronic Health Evaluation is a well-validated method to risk-adjust ICU patient outcomes. However, predictions may be affected by inter-rater reliability for manually entered elements. We evaluated inter-rater reliability for Acute Physiology and Chronic Health Evaluation IV manually entered elements among clinician abstractors and assessed the impacts of disagreements on mortality predictions. Design: Cross-sectional. Setting: Academic medical center. Subjects: Patients admitted to five adult ICUs. Interventions: None. Measurements and Main Results: Acute Physiology and Chronic Health Evaluation IV manually entered elements were abstracted from a selection of charts (n = 41) by two clinician "raters" trained in Acute Physiology and Chronic Health Evaluation IV methodology. Rater agreement (%) was determined for each manually entered element, including Acute Physiology and Chronic Health Evaluation diagnosis, Glasgow Coma Scale score, admission source, chronic conditions, elective/emergency surgery, and ventilator use. Cohen's kappa (κ) or intraclass correlation coefficient was calculated for nominal and continuous manually entered elements, respectively. The impacts of manually entered element choices on Acute Physiology and Chronic Health Evaluation IV mortality predictions were computed using published Acute Physiology and Chronic Health Evaluation IV equations, and observed-to-expected hospital mortality ratios were compared between rater groups. The majority of manually entered element inconsistency was due to disagreement in choice of Glasgow Coma Scale (63.8% agreement, 0.83 intraclass correlation coefficient), Acute Physiology and Chronic Health Evaluation diagnosis (68.3% agreement, 0.67 kappa), and admission source (90.2% agreement, 0.85 kappa). The difference in predicted mortality between raters related to Glasgow Coma Scale disagreements was significant (observed-to-expected mortality ratios for Rater 1 [1.009] vs Rater 2 [1.134]; p < 0.05). Differences related to Acute Physiology and Chronic Health Evaluation diagnosis or admission source disagreements were negligible. The new "unable to score" choice for Glasgow Coma Scale was used for 18% of Glasgow Coma Scale measurements but accounted for 63% of "major" Glasgow Coma Scale disagreements, and 50% of the overall difference in Acute Physiology and Chronic Health Evaluation-predicted mortality between raters. Conclusions: Inconsistent use among raters of the new "unable to score" choice for Glasgow Coma Scale introduced in Acute Physiology and Chronic Health Evaluation IV was responsible for important decreases in both Glasgow Coma Scale and Acute Physiology and Chronic Health Evaluation IV mortality prediction reliability in our study. A Glasgow Coma Scale algorithm we developed after the study to improve reliability related to use of this new "unable to score" choice is presented.
AB - Objectives: Acute Physiology and Chronic Health Evaluation is a well-validated method to risk-adjust ICU patient outcomes. However, predictions may be affected by inter-rater reliability for manually entered elements. We evaluated inter-rater reliability for Acute Physiology and Chronic Health Evaluation IV manually entered elements among clinician abstractors and assessed the impacts of disagreements on mortality predictions. Design: Cross-sectional. Setting: Academic medical center. Subjects: Patients admitted to five adult ICUs. Interventions: None. Measurements and Main Results: Acute Physiology and Chronic Health Evaluation IV manually entered elements were abstracted from a selection of charts (n = 41) by two clinician "raters" trained in Acute Physiology and Chronic Health Evaluation IV methodology. Rater agreement (%) was determined for each manually entered element, including Acute Physiology and Chronic Health Evaluation diagnosis, Glasgow Coma Scale score, admission source, chronic conditions, elective/emergency surgery, and ventilator use. Cohen's kappa (κ) or intraclass correlation coefficient was calculated for nominal and continuous manually entered elements, respectively. The impacts of manually entered element choices on Acute Physiology and Chronic Health Evaluation IV mortality predictions were computed using published Acute Physiology and Chronic Health Evaluation IV equations, and observed-to-expected hospital mortality ratios were compared between rater groups. The majority of manually entered element inconsistency was due to disagreement in choice of Glasgow Coma Scale (63.8% agreement, 0.83 intraclass correlation coefficient), Acute Physiology and Chronic Health Evaluation diagnosis (68.3% agreement, 0.67 kappa), and admission source (90.2% agreement, 0.85 kappa). The difference in predicted mortality between raters related to Glasgow Coma Scale disagreements was significant (observed-to-expected mortality ratios for Rater 1 [1.009] vs Rater 2 [1.134]; p < 0.05). Differences related to Acute Physiology and Chronic Health Evaluation diagnosis or admission source disagreements were negligible. The new "unable to score" choice for Glasgow Coma Scale was used for 18% of Glasgow Coma Scale measurements but accounted for 63% of "major" Glasgow Coma Scale disagreements, and 50% of the overall difference in Acute Physiology and Chronic Health Evaluation-predicted mortality between raters. Conclusions: Inconsistent use among raters of the new "unable to score" choice for Glasgow Coma Scale introduced in Acute Physiology and Chronic Health Evaluation IV was responsible for important decreases in both Glasgow Coma Scale and Acute Physiology and Chronic Health Evaluation IV mortality prediction reliability in our study. A Glasgow Coma Scale algorithm we developed after the study to improve reliability related to use of this new "unable to score" choice is presented.
KW - Acute Physiology and Chronic Health Evaluation
KW - Glasgow Coma Scale
KW - hospital mortality
KW - intensive care units
KW - outcome assessment (healthcare)
KW - predictive scoring systems
KW - reproducibility of results
KW - statistical models
KW - telemedicine/tele-intensive care unit
UR - http://www.scopus.com/inward/record.url?scp=85165328619&partnerID=8YFLogxK
U2 - 10.1097/CCE.0000000000000059
DO - 10.1097/CCE.0000000000000059
M3 - Article
AN - SCOPUS:85165328619
SN - 2639-8028
VL - 1
SP - e0059
JO - Critical Care Explorations
JF - Critical Care Explorations
IS - 10
ER -