TY - JOUR
T1 - Detecting abnormal electroencephalograms using deep convolutional networks
AU - van Leeuwen, K. G.
AU - Sun, H.
AU - Tabaeizadeh, M.
AU - Struck, A. F.
AU - van Putten, M. J.A.M.
AU - Westover, M. B.
N1 - Publisher Copyright: © 2018 International Federation of Clinical Neurophysiology
PY - 2019/1
Y1 - 2019/1
AB - Objectives: Electroencephalography (EEG) is a central part of the medical evaluation for patients with neurological disorders. Training an algorithm to label an EEG as normal vs abnormal seems challenging because of EEG heterogeneity and dependence on contextual factors, including age and sleep stage. Our objectives were to validate prior work on an independent data set suggesting that deep learning methods can discriminate between normal vs abnormal EEGs, to understand whether age and sleep stage information can improve discrimination, and to understand what factors lead to errors. Methods: We train a deep convolutional neural network on a heterogeneous set of 8522 routine EEGs from the Massachusetts General Hospital. We explore several strategies for optimizing model performance, including accounting for age and sleep stage. Results: The area under the receiver operating characteristic curve (AUC) on an independent test set (n = 851) is 0.917, marginally improved by including age (AUC = 0.924) and by including both age and sleep stages (AUC = 0.925), though the improvements are not statistically significant. Conclusions: The model architecture generalizes well to an independent dataset. Adding age and sleep stage to the model does not significantly improve performance. Significance: Insights learned from misclassified examples, and the minimal improvement from adding sleep stage and age, suggest fruitful directions for further research.
KW - Clinical neurophysiology
KW - Computer aided diagnosis (CAD)
KW - Convolutional neural networks (CNN)
KW - Deep learning
KW - Electroencephalograms (EEG)
KW - Epilepsy
UR - https://www.scopus.com/pages/publications/85057139469
U2 - 10.1016/j.clinph.2018.10.012
DO - 10.1016/j.clinph.2018.10.012
M3 - Article
C2 - 30481649
AN - SCOPUS:85057139469
SN - 1388-2457
VL - 130
SP - 77
EP - 84
JO - Clinical Neurophysiology
JF - Clinical Neurophysiology
IS - 1
ER -