TY - JOUR
T1 - Respiratory support status from EHR data for adult population
T2 - Classification, heuristics, and usage in predictive modeling
AU - Yu, Sean C.
AU - Hofford, MacKenzie R.
AU - Lai, Albert M.
AU - Kollef, Marin H.
AU - Payne, Philip R.O.
AU - Michelson, Andrew P.
N1 - Publisher Copyright: © 2022 The Author(s) 2022. Published by Oxford University Press on behalf of the American Medical Informatics Association. All rights reserved. For permissions, please email: [email protected].
PY - 2022/5/1
Y1 - 2022/5/1
N2 - Objective: Respiratory support status is critical in understanding patient status, but electronic health record data are often scattered, incomplete, and contradictory. Further, there has been limited work on standardizing representations for respiratory support. The objective of this work was to (1) propose a practical terminology system for respiratory support methods; (2) develop (meta-)heuristics for constructing respiratory support episodes; and (3) evaluate the utility of respiratory support information for mortality prediction. Materials and Methods: All analyses were performed using electronic health record data of COVID-19-tested, emergency department-admit, adult patients at a large, Midwestern healthcare system between March 1, 2020 and April 1, 2021. Logistic regression and XGBoost models were trained with and without respiratory support information, and performance metrics were compared. Importance of respiratory-support-based features was explored using absolute coefficient values for logistic regression and SHapley Additive exPlanations values for the XGBoost model. Results: The proposed terminology system for respiratory support methods is as follows: Low-Flow Oxygen Therapy (LFOT), High-Flow Oxygen Therapy (HFOT), Non-Invasive Mechanical Ventilation (NIMV), Invasive Mechanical Ventilation (IMV), and ExtraCorporeal Membrane Oxygenation (ECMO). The addition of respiratory support information significantly improved mortality prediction (logistic regression area under receiver operating characteristic curve, median [IQR] from 0.855 [0.852 - 0.855] to 0.881 [0.876 - 0.884]; area under precision recall curve from 0.262 [0.245 - 0.268] to 0.319 [0.313 - 0.325], both P < 0.01). The proposed generalizable, interpretable, and episodic representation had commensurate performance compared to alternate representations despite loss of granularity. Respiratory support features were among the most important in both models. Conclusion: Respiratory support information is critical in understanding patient status and can facilitate downstream analyses.
KW - electronic health records
KW - machine learning
KW - oxygen support
KW - predictive analytics
KW - respiratory support
KW - supplemental oxygen
UR - http://www.scopus.com/inward/record.url?scp=85128489006&partnerID=8YFLogxK
U2 - 10.1093/jamia/ocac005
DO - 10.1093/jamia/ocac005
M3 - Article
C2 - 35092276
AN - SCOPUS:85128489006
SN - 1067-5027
VL - 29
SP - 813
EP - 821
JO - Journal of the American Medical Informatics Association
JF - Journal of the American Medical Informatics Association
IS - 5
ER -