TY - JOUR
T1 - Predicting cardiovascular health trajectories in time-series electronic health records with LSTM models
AU - Guo, Aixia
AU - Beheshti, Rahmatollah
AU - Khan, Yosef M.
AU - Langabeer, James R.
AU - Foraker, Randi E.
N1 - Publisher Copyright: © 2020, The Author(s).
PY - 2021/12
Y1 - 2021/12
N2 - Background: Cardiovascular disease (CVD) is the leading cause of death in the United States (US). Better cardiovascular health (CVH) is associated with CVD prevention. Predicting future CVH levels may help providers better manage patients’ CVH. We hypothesized that CVH measures can be predicted from previous measurements in longitudinal electronic health record (EHR) data. Methods: We used The Guideline Advantage (TGA) dataset, which contains EHR data from 70 outpatient clinics across the US. We studied predictions of 5 CVH submetrics: smoking status (SMK), body mass index (BMI), blood pressure (BP), hemoglobin A1c (A1C), and low-density lipoprotein (LDL). We applied embedding techniques and long short-term memory (LSTM) networks to predict future CVH category levels from all the previous CVH measurements of 216,445 unique patients for each CVH submetric. Results: LSTM model performance was evaluated by the area under the receiver operating characteristic curve (AUROC): the micro-average AUROC was 0.99 for SMK prediction; 0.97 for BMI; 0.84 for BP; 0.91 for A1C; and 0.93 for LDL prediction. Model performance was not improved by using all 5 submetric measures compared with using single submetric measures. Conclusions: We suggest that future CVH levels can be predicted using previous CVH measurements for each submetric, which has implications for population cardiovascular health management. Predicting patients’ future CVH levels might directly improve patients’ CVH and thus quality of life, while also indirectly decreasing the burden and cost that CVD and cancers impose on clinical health systems.
AB - Background: Cardiovascular disease (CVD) is the leading cause of death in the United States (US). Better cardiovascular health (CVH) is associated with CVD prevention. Predicting future CVH levels may help providers better manage patients’ CVH. We hypothesized that CVH measures can be predicted from previous measurements in longitudinal electronic health record (EHR) data. Methods: We used The Guideline Advantage (TGA) dataset, which contains EHR data from 70 outpatient clinics across the US. We studied predictions of 5 CVH submetrics: smoking status (SMK), body mass index (BMI), blood pressure (BP), hemoglobin A1c (A1C), and low-density lipoprotein (LDL). We applied embedding techniques and long short-term memory (LSTM) networks to predict future CVH category levels from all the previous CVH measurements of 216,445 unique patients for each CVH submetric. Results: LSTM model performance was evaluated by the area under the receiver operating characteristic curve (AUROC): the micro-average AUROC was 0.99 for SMK prediction; 0.97 for BMI; 0.84 for BP; 0.91 for A1C; and 0.93 for LDL prediction. Model performance was not improved by using all 5 submetric measures compared with using single submetric measures. Conclusions: We suggest that future CVH levels can be predicted using previous CVH measurements for each submetric, which has implications for population cardiovascular health management. Predicting patients’ future CVH levels might directly improve patients’ CVH and thus quality of life, while also indirectly decreasing the burden and cost that CVD and cancers impose on clinical health systems.
KW - CVH prediction
KW - Cardiovascular health (CVH)
KW - LSTM models
KW - Precision medicine
KW - The Guideline Advantage (TGA)
UR - http://www.scopus.com/inward/record.url?scp=85098760575&partnerID=8YFLogxK
U2 - 10.1186/s12911-020-01345-1
DO - 10.1186/s12911-020-01345-1
M3 - Article
C2 - 33407390
AN - SCOPUS:85098760575
SN - 1472-6947
VL - 21
JO - BMC Medical Informatics and Decision Making
JF - BMC Medical Informatics and Decision Making
IS - 1
M1 - 5
ER -