TY - GEN
T1 - Trend Analysis Neural Networks for Interpretable Analysis of Longitudinal Data
AU - Yao, Zhenjie
AU - Chen, Yixin
AU - Wang, Jinwei
AU - Wu, Shouling
AU - Tu, Yanhui
AU - Zhao, Minghui
AU - Zhang, Luxia
N1 - Publisher Copyright: © 2021 IEEE.
PY - 2021
Y1 - 2021
N2 - Cohort studies are among the most commonly used study designs in medical and public health research, and they produce longitudinal data. Conventional statistical models and machine learning methods are not capable of modeling the evolution trends of the variables in longitudinal data. In this paper, we propose Trend Analysis Neural Networks (TANN), which model the evolution trends of the variables through adaptive feature learning. TANN was tested on a dataset from the Kailuan study. The task was to predict the occurrence of death within 5 years, using 3 repeated medical examinations from 2008 to 2013. TANN achieved an AUC of 0.7888, a slight improvement over conventional methods: GBDT reached 0.7824, random forests 0.7822, and logistic regression 0.7789. The experimental results show that the proposed TANN predicts death events better than conventional models. Furthermore, analyzing the weights of TANN reveals important trends in the indicators, and this trend discovery mechanism makes the model interpretable. TANN thus offers an appropriate trade-off between high performance and interpretability.
AB - Cohort studies are among the most commonly used study designs in medical and public health research, and they produce longitudinal data. Conventional statistical models and machine learning methods are not capable of modeling the evolution trends of the variables in longitudinal data. In this paper, we propose Trend Analysis Neural Networks (TANN), which model the evolution trends of the variables through adaptive feature learning. TANN was tested on a dataset from the Kailuan study. The task was to predict the occurrence of death within 5 years, using 3 repeated medical examinations from 2008 to 2013. TANN achieved an AUC of 0.7888, a slight improvement over conventional methods: GBDT reached 0.7824, random forests 0.7822, and logistic regression 0.7789. The experimental results show that the proposed TANN predicts death events better than conventional models. Furthermore, analyzing the weights of TANN reveals important trends in the indicators, and this trend discovery mechanism makes the model interpretable. TANN thus offers an appropriate trade-off between high performance and interpretability.
KW - Interpretability
KW - Longitudinal data
KW - Neural networks
KW - Trend analysis
UR - https://www.scopus.com/pages/publications/85125361154
U2 - 10.1109/BigData52589.2021.9671590
DO - 10.1109/BigData52589.2021.9671590
M3 - Conference contribution
AN - SCOPUS:85125361154
T3 - Proceedings - 2021 IEEE International Conference on Big Data, Big Data 2021
SP - 6061
EP - 6063
BT - Proceedings - 2021 IEEE International Conference on Big Data, Big Data 2021
A2 - Chen, Yixin
A2 - Ludwig, Heiko
A2 - Tu, Yicheng
A2 - Fayyad, Usama
A2 - Zhu, Xingquan
A2 - Hu, Xiaohua Tony
A2 - Byna, Suren
A2 - Liu, Xiong
A2 - Zhang, Jianping
A2 - Pan, Shirui
A2 - Papalexakis, Vagelis
A2 - Wang, Jianwu
A2 - Cuzzocrea, Alfredo
A2 - Ordonez, Carlos
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2021 IEEE International Conference on Big Data, Big Data 2021
Y2 - 15 December 2021 through 18 December 2021
ER -