Background: Cardiovascular disease (CVD) is the leading cause of death in the United States (US). Better cardiovascular health (CVH) is associated with CVD prevention. Predicting future CVH levels may help providers better manage patients’ CVH. We hypothesized that CVH measures can be predicted based on previous measurements from longitudinal electronic health record (EHR) data. Methods: The Guideline Advantage (TGA) dataset was used and contained EHR data from 70 outpatient clinics across the United States (US). We studied predictions of 5 CVH submetrics: smoking status (SMK), body mass index (BMI), blood pressure (BP), hemoglobin A1c (A1C), and low-density lipoprotein (LDL). We applied embedding techniques and long short-term memory (LSTM) networks – to predict future CVH category levels from all the previous CVH measurements of 216,445 unique patients for each CVH submetric. Results: The LSTM model performance was evaluated by the area under the receiver operator curve (AUROC): the micro-average AUROC was 0.99 for SMK prediction; 0.97 for BMI; 0.84 for BP; 0.91 for A1C; and 0.93 for LDL prediction. Model performance was not improved by using all 5 submetric measures compared with using single submetric measures. Conclusions: We suggest that future CVH levels can be predicted using previous CVH measurements for each submetric, which has implications for population cardiovascular health management. Predicting patients’ future CVH levels might directly increase patient CVH health and thus quality of life, while also indirectly decreasing the burden and cost for clinical health system caused by CVD and cancers.
- CVH prediction
- Cardiovascular health (CVH)
- LSTM models
- Precision medicine
- The guideline advantage (TGA)