TY - JOUR
T1 - Detecting Glaucoma in the Ocular Hypertension Study Using Deep Learning
AU - Fan, Rui
AU - Bowd, Christopher
AU - Christopher, Mark
AU - Brye, Nicole
AU - Proudfoot, James A.
AU - Rezapour, Jasmin
AU - Belghith, Akram
AU - Goldbaum, Michael H.
AU - Chuter, Benton
AU - Girkin, Christopher A.
AU - Fazio, Massimo A.
AU - Liebmann, Jeffrey M.
AU - Weinreb, Robert N.
AU - Gordon, Mae O.
AU - Kass, Michael A.
AU - Kriegman, David
AU - Zangwill, Linda M.
N1 - Publisher Copyright:
© 2022 American Medical Association. All rights reserved.
PY - 2022/4
Y1 - 2022/4
N2 - Importance: Automated deep learning (DL) analyses of fundus photographs potentially can reduce the cost and improve the efficiency of reading center assessment of end points in clinical trials. Objective: To investigate the diagnostic accuracy of DL algorithms trained on fundus photographs from the Ocular Hypertension Treatment Study (OHTS) to detect primary open-angle glaucoma (POAG). Design, Setting, and Participants: In this diagnostic study, 1636 OHTS participants from 22 sites with a mean (range) follow-up of 10.7 (0-14.3) years. A total of 66715 photographs from 3272 eyes were used to train and test a ResNet-50 model to detect the OHTS Endpoint Committee POAG determination based on optic disc (287 eyes, 3502 photographs) and/or visual field (198 eyes, 2300 visual fields) changes. Three independent test sets were used to evaluate the generalizability of the model. Main Outcomes and Measures: Areas under the receiver operating characteristic curve (AUROC) and sensitivities at fixed specificities were calculated to compare model performance. Evaluation of false-positive rates was used to determine whether the DL model detected POAG before the OHTS Endpoint Committee POAG determination. Results: A total of 1147 participants were included in the training set (661 [57.6%] female; mean age, 57.2 years; 95% CI, 56.6-57.8), 167 in the validation set (97 [58.1%] female; mean age, 57.1 years; 95% CI, 55.6-58.7), and 322 in the test set (173 [53.7%] female; mean age, 57.2 years; 95% CI, 56.1-58.2). The DL model achieved an AUROC of 0.88 (95% CI, 0.82-0.92) for the OHTS Endpoint Committee determination of optic disc or VF changes. For the OHTS end points based on optic disc changes or visual field changes, AUROCs were 0.91 (95% CI, 0.88-0.94) and 0.86 (95% CI, 0.76-0.93), respectively. False-positive rates (at 90% specificity) were higher in photographs of eyes that later developed POAG by disc or visual field (27.5% [56 of 204]) compared with eyes that did not develop POAG (11.4% [50 of 440]) during follow-up. The diagnostic accuracy of the DL model developed on the optic disc end point applied to 3 independent data sets was lower, with AUROCs ranging from 0.74 (95% CI, 0.70-0.77) to 0.79 (95% CI, 0.78-0.81). Conclusions and Relevance: The model's high diagnostic accuracy using OHTS photographs suggests that DL has the potential to standardize and automate POAG determination for clinical trials and management. In addition, the higher false-positive rate in early photographs of eyes that later developed POAG suggests that DL models detected POAG in some eyes earlier than the OHTS Endpoint Committee, reflecting the OHTS design that emphasized a high specificity for POAG determination by requiring a clinically significant change from baseline. 2022 American Medical Association. All rights reserved.
AB - Importance: Automated deep learning (DL) analyses of fundus photographs potentially can reduce the cost and improve the efficiency of reading center assessment of end points in clinical trials. Objective: To investigate the diagnostic accuracy of DL algorithms trained on fundus photographs from the Ocular Hypertension Treatment Study (OHTS) to detect primary open-angle glaucoma (POAG). Design, Setting, and Participants: In this diagnostic study, 1636 OHTS participants from 22 sites with a mean (range) follow-up of 10.7 (0-14.3) years. A total of 66715 photographs from 3272 eyes were used to train and test a ResNet-50 model to detect the OHTS Endpoint Committee POAG determination based on optic disc (287 eyes, 3502 photographs) and/or visual field (198 eyes, 2300 visual fields) changes. Three independent test sets were used to evaluate the generalizability of the model. Main Outcomes and Measures: Areas under the receiver operating characteristic curve (AUROC) and sensitivities at fixed specificities were calculated to compare model performance. Evaluation of false-positive rates was used to determine whether the DL model detected POAG before the OHTS Endpoint Committee POAG determination. Results: A total of 1147 participants were included in the training set (661 [57.6%] female; mean age, 57.2 years; 95% CI, 56.6-57.8), 167 in the validation set (97 [58.1%] female; mean age, 57.1 years; 95% CI, 55.6-58.7), and 322 in the test set (173 [53.7%] female; mean age, 57.2 years; 95% CI, 56.1-58.2). The DL model achieved an AUROC of 0.88 (95% CI, 0.82-0.92) for the OHTS Endpoint Committee determination of optic disc or VF changes. For the OHTS end points based on optic disc changes or visual field changes, AUROCs were 0.91 (95% CI, 0.88-0.94) and 0.86 (95% CI, 0.76-0.93), respectively. False-positive rates (at 90% specificity) were higher in photographs of eyes that later developed POAG by disc or visual field (27.5% [56 of 204]) compared with eyes that did not develop POAG (11.4% [50 of 440]) during follow-up. The diagnostic accuracy of the DL model developed on the optic disc end point applied to 3 independent data sets was lower, with AUROCs ranging from 0.74 (95% CI, 0.70-0.77) to 0.79 (95% CI, 0.78-0.81). Conclusions and Relevance: The model's high diagnostic accuracy using OHTS photographs suggests that DL has the potential to standardize and automate POAG determination for clinical trials and management. In addition, the higher false-positive rate in early photographs of eyes that later developed POAG suggests that DL models detected POAG in some eyes earlier than the OHTS Endpoint Committee, reflecting the OHTS design that emphasized a high specificity for POAG determination by requiring a clinically significant change from baseline. 2022 American Medical Association. All rights reserved.
UR - http://www.scopus.com/inward/record.url?scp=85127582419&partnerID=8YFLogxK
U2 - 10.1001/jamaophthalmol.2022.0244
DO - 10.1001/jamaophthalmol.2022.0244
M3 - Article
C2 - 35297959
AN - SCOPUS:85127582419
SN - 2168-6165
VL - 140
SP - 383
EP - 391
JO - JAMA Ophthalmology
JF - JAMA Ophthalmology
IS - 4
ER -