TY - JOUR
T1 - Predicting hospitalization of COVID-19 positive patients using clinician-guided machine learning methods
AU - Song, Wenyu
AU - Zhang, Linying
AU - Liu, Luwei
AU - Sainlaire, Michael
AU - Karvar, Mehran
AU - Kang, Min Jeoung
AU - Pullman, Avery
AU - Lipsitz, Stuart
AU - Massaro, Anthony
AU - Patil, Namrata
AU - Jasuja, Ravi
AU - Dykes, Patricia C.
N1 - Publisher Copyright:
© The Author(s) 2022. Published by Oxford University Press on behalf of the American Medical Informatics Association. All rights reserved.
PY - 2022
Y1 - 2022
AB - Objectives: The coronavirus disease 2019 (COVID-19) is a resource-intensive global pandemic. It is important for healthcare systems to identify high-risk COVID-19-positive patients who need timely health care. This study was conducted to predict the hospitalization of older adults who have tested positive for COVID-19. Methods: We screened all patients with COVID test records from 11 Mass General Brigham hospitals to identify the study population. A total of 1495 patients with age 65 and above from the outpatient setting were included in the final cohort, among which 459 patients were hospitalized. We conducted a clinician-guided, 3-stage feature selection, and phenotyping process using iterative combinations of literature review, clinician expert opinion, and electronic healthcare record data exploration. A list of 44 features, including temporal features, was generated from this process and used for model training. Four machine learning prediction models were developed, including regularized logistic regression, support vector machine, random forest, and neural network. Results: All 4 models achieved area under the receiver operating characteristic curve (AUC) greater than 0.80. Random forest achieved the best predictive performance (AUC = 0.83). Albumin, an index for nutritional status, was found to have the strongest association with hospitalization among COVID positive older adults. Conclusions: In this study, we developed 4 machine learning models for predicting general hospitalization among COVID positive older adults. We identified important clinical factors associated with hospitalization and observed temporal patterns in our study cohort. Our modeling pipeline and algorithm could potentially be used to facilitate more accurate and efficient decision support for triaging COVID positive patients.
KW - COVID-19
KW - electronic health record
KW - hospitalization
KW - machine learning
KW - temporal patterns
UR - http://www.scopus.com/inward/record.url?scp=85138447722&partnerID=8YFLogxK
U2 - 10.1093/jamia/ocac083
DO - 10.1093/jamia/ocac083
M3 - Article
C2 - 35595237
AN - SCOPUS:85138447722
SN - 1067-5027
VL - 29
SP - 1661
EP - 1667
JO - Journal of the American Medical Informatics Association
JF - Journal of the American Medical Informatics Association
IS - 10
ER -