TY - JOUR
T1 - A Customizable Data Quality Tool for Global Observational Research Networks
AU - Lewis, Judith
AU - Hogan, Brenna
AU - Goss, Charles
AU - Rupasinghe, Dhanushi
AU - Maruri, Fernanda
AU - Hamusonde, Kalongo
AU - Obregon, Savannah
AU - Agarwal, Mansi
AU - Jiamsakul, Awachana
AU - Turner, Megan
AU - Katona, Austin
AU - Althoff, Keri
AU - Duda, Stephany N.
PY - 2025/5/15
Y1 - 2025/5/15
N2 - Evaluating data quality is essential when combining multi-site observational clinical data for analysis. We collaborated with five research networks, representing various data approaches and workflows, to generalize an established data quality checking and report generation tool so it could be implemented more easily by other research consortia. The resulting approach reduced the need for technical expertise at user sites by leveraging the REDCap data collection software to store details about a research group, their data model, and expectations about variables (e.g., plausible numeric range, valid format and codes, date logic). The application then used the REDCap API to retrieve those details and assess a dataset's conformance to the data model, logical consistency, and completeness. Users could download reports that summarized the dataset contents and quality. The generalized Harmonist Data Toolkit was built using the freely available REDCap and R/Shiny platforms, with code available on GitHub. All five collaborating consortia found the Toolkit beneficial in detecting inconsistencies and providing informative data reports and visualizations. The Harmonist Data Toolkit fills a need for data quality and report generation solutions for consortia without local programming expertise.
AB - Evaluating data quality is essential when combining multi-site observational clinical data for analysis. We collaborated with five research networks, representing various data approaches and workflows, to generalize an established data quality checking and report generation tool so it could be implemented more easily by other research consortia. The resulting approach reduced the need for technical expertise at user sites by leveraging the REDCap data collection software to store details about a research group, their data model, and expectations about variables (e.g., plausible numeric range, valid format and codes, date logic). The application then used the REDCap API to retrieve those details and assess a dataset's conformance to the data model, logical consistency, and completeness. Users could download reports that summarized the dataset contents and quality. The generalized Harmonist Data Toolkit was built using the freely available REDCap and R/Shiny platforms, with code available on GitHub. All five collaborating consortia found the Toolkit beneficial in detecting inconsistencies and providing informative data reports and visualizations. The Harmonist Data Toolkit fills a need for data quality and report generation solutions for consortia without local programming expertise.
KW - Datasets as topic
KW - data quality
KW - observational studies
KW - software
UR - http://www.scopus.com/inward/record.url?scp=105005816825&partnerID=8YFLogxK
U2 - 10.3233/SHTI250391
DO - 10.3233/SHTI250391
M3 - Article
C2 - 40380501
AN - SCOPUS:105005816825
SN - 0926-9630
VL - 327
SP - 517
EP - 521
JO - Studies in Health Technology and Informatics
JF - Studies in Health Technology and Informatics
ER -