With the increased use of electronic medical records (EMRs), data mining on medical data has great potential to improve the quality of hospital treatment and increase the survival rate of patients. Early readmission prediction enables early intervention, which is essential to preventing serious or life-threatening events and contributes substantially to reducing healthcare costs. Existing work on predicting readmission often focuses on certain vital signs and diseases by extracting statistical features. It also fails to consider the skewness of class labels in medical data and the different costs of misclassification errors. In this paper, we leverage convolutional neural networks (CNNs) to automatically learn features from vital-sign time series, and categorical feature embedding to effectively encode feature vectors with heterogeneous clinical features, such as demographics, hospitalization history, vital signs, and laboratory tests. Both the features learnt via the CNN and the statistical features obtained via feature embedding are then fed into a multilayer perceptron (MLP) for prediction. We train the MLP with a cost-sensitive formulation to tackle the class imbalance and skewness challenge. We validate the proposed approach on two real medical datasets from Barnes-Jewish Hospital. All data are taken from historical EMR databases and reflect the kinds of data that would realistically be available to a clinical prediction system in hospitals. We find that early prediction of readmission is possible and that, compared with existing state-of-the-art methods used by hospitals, our methods perform significantly better. For example, using the general hospital wards data for 30-day readmission prediction, the area under the curve (AUC) for the proposed model was 0.70, significantly higher than that of all the baseline methods. Based on these results, a system is being deployed in hospital settings with the proposed forecasting algorithms to support treatment.
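The cost-sensitive idea mentioned above can be illustrated with a class-weighted binary cross-entropy. This is a minimal sketch only: the abstract does not give the paper's exact formulation, so the function name `cost_sensitive_log_loss` and the cost values `cost_fn` and `cost_fp` are hypothetical choices made for illustration.

```python
import math


def cost_sensitive_log_loss(y_true, p_pred, cost_fn=5.0, cost_fp=1.0, eps=1e-12):
    """Class-weighted binary cross-entropy (illustrative, not the paper's loss).

    y_true : list of 0/1 labels (1 = readmitted, the rare class).
    p_pred : list of predicted probabilities of readmission.
    cost_fn: assumed penalty weight for errors on positives (missed readmissions).
    cost_fp: assumed penalty weight for errors on negatives (false alarms).
    """
    if len(y_true) != len(p_pred):
        raise ValueError("y_true and p_pred must have the same length")
    if not y_true:
        raise ValueError("inputs must be non-empty")

    total = 0.0
    for y, p in zip(y_true, p_pred):
        # Clip probabilities so log() never sees exactly 0 or 1.
        p = min(max(p, eps), 1.0 - eps)
        if y == 1:
            total += -cost_fn * math.log(p)
        else:
            total += -cost_fp * math.log(1.0 - p)
    return total / len(y_true)


if __name__ == "__main__":
    labels = [1, 0, 0, 0]
    probs = [0.2, 0.1, 0.3, 0.05]
    print("weighted loss:", cost_sensitive_log_loss(labels, probs))
    print("unweighted loss:", cost_sensitive_log_loss(labels, probs, cost_fn=1.0))
```

With `cost_fn` larger than `cost_fp`, an equally confident mistake on a readmitted patient is penalized more heavily than one on a non-readmitted patient, which pushes a classifier trained on skewed labels away from always predicting the majority class. Setting both costs to 1 recovers the ordinary log loss.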
|Number of pages||11|
|Journal||IEEE/ACM Transactions on Computational Biology and Bioinformatics|
|State||Published - Nov 1 2018|
- Readmission prediction
- categorical feature embedding
- deep learning
- electronic medical records