TY - JOUR
T1 - A comparison of prediction approaches for identifying prodromal Parkinson disease
AU - Warden, Mark N.
AU - Nielsen, Susan Searles
AU - Camacho-Soto, Alejandra
AU - Garnett, Roman
AU - Racette, Brad A.
N1 - Publisher Copyright: © 2021 Warden et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
PY - 2021/8
Y1 - 2021/8
AB - Identifying people with Parkinson disease during the prodromal period, including via algorithms in administrative claims data, is an important research and clinical priority. We sought to improve upon an existing penalized logistic regression model, based on diagnosis and procedure codes, by adding prescription medication data or using machine learning. Using Medicare Part D beneficiaries aged 66-90 from a population-based case-control study of incident Parkinson disease, we fit a penalized logistic regression both with and without Part D data. For comparison, we also built a predictive algorithm using a random forest classifier. In a combined approach, we introduced the probability of Parkinson disease from the random forest as a predictor in the penalized regression model. We calculated the area under the receiver operating characteristic curve (AUC) for each model. All models performed well, with AUCs ranging from 0.824 (simplest model) to 0.835 (combined approach). We conclude that medication data and random forests improve Parkinson disease prediction but are not essential.
UR - http://www.scopus.com/inward/record.url?scp=85113730177&partnerID=8YFLogxK
U2 - 10.1371/journal.pone.0256592
DO - 10.1371/journal.pone.0256592
M3 - Article
C2 - 34437600
AN - SCOPUS:85113730177
SN - 1932-6203
VL - 16
JO - PLoS ONE
JF - PLoS ONE
IS - 8
M1 - e0256592
ER -