TY - JOUR
T1 - Data Overuse in Aging Research
T2 - Emerging Issues and Potential Solutions
AU - Mroczek, Daniel K.
AU - Weston, Sara J.
AU - Graham, Eileen K.
AU - Willroth, Emily C.
N1 - Publisher Copyright: © 2021 American Psychological Association
PY - 2022
Y1 - 2022
AB - Aging and lifespan development researchers have been fortunate to have public access to many longitudinal datasets. These data are valuable and see high utilization, yet this has a considerable downside. Many of these are heavily overused. Overuse of publicly available datasets creates dependency among published research papers giving the false impression of independent contributions to knowledge by reporting the same associations over multiple papers. This is a potentially serious problem in the aging literature given the high use of a relatively small number of well-known studies. Any irregularities or sampling biases in this relatively small number of samples have outsize influence on perceived answers to key aging questions. We detail this problem, focusing on issues of dependency among studies, sampling bias and overfitting, and contradictory estimates of the same effect from the same data in independent publications. We provide solutions, including greater use of data sharing, pre-registrations, holdout samples, split-sample cross-validation, and coordinated analysis. We argue these valuable datasets are public resources that are being diminished by overuse, with parallels in environmental science. Taking a conservation perspective, we hold that these practices (pre-registration, holdout samples) can preserve data resources for future generations of researchers.
KW - Adult development and aging
KW - Coordinated analysis
KW - Data overuse
KW - Open science
KW - Replicability
UR - https://www.scopus.com/pages/publications/85108326646
U2 - 10.1037/pag0000605
DO - 10.1037/pag0000605
M3 - Article
C2 - 33914579
AN - SCOPUS:85108326646
SN - 0882-7974
VL - 37
SP - 141
EP - 147
JO - Psychology and Aging
JF - Psychology and Aging
IS - 1
ER -