TY - GEN
T1 - A Customizable Data Quality Tool for Global Observational Research Networks
AU - Lewis, Judith
AU - Hogan, Brenna
AU - Goss, Charles
AU - Rupasinghe, Dhanushi
AU - Maruri, Fernanda
AU - Hamusonde, Kalongo
AU - Obregon, Savannah
AU - Agarwal, Mansi
AU - Jiamsakul, Awachana
AU - Turner, Megan
AU - Katona, Austin
AU - Althoff, Keri
AU - Duda, Stephany N.
N1 - Publisher Copyright: © 2025 The Authors.
PY - 2025/5/15
Y1 - 2025/5/15
N2 - Evaluating data quality is essential when combining multi-site observational clinical data for analysis. We collaborated with five research networks, representing various data approaches and workflows, to generalize an established data quality checking and report generation tool so it could be implemented more easily by other research consortia. The resulting approach reduced the need for technical expertise at user sites by leveraging the REDCap data collection software to store details about a research group, their data model, and expectations about variables (e.g., plausible numeric range, valid format and codes, date logic). The application then used the REDCap API to retrieve those details and assess a dataset’s conformance to the data model, logical consistency, and completeness. Users could download reports that summarized the dataset contents and quality. The generalized Harmonist Data Toolkit was built using the freely available REDCap and R/Shiny platforms, with code available on GitHub. All five collaborating consortia found the Toolkit beneficial in detecting inconsistencies and providing informative data reports and visualizations. The Harmonist Data Toolkit fills a need for data quality and report generation solutions for consortia without local programming expertise.
KW - Datasets as topic
KW - data quality
KW - observational studies
KW - software
UR - https://www.scopus.com/pages/publications/105005816825
U2 - 10.3233/SHTI250391
DO - 10.3233/SHTI250391
M3 - Conference contribution
C2 - 40380501
AN - SCOPUS:105005816825
T3 - Studies in Health Technology and Informatics
SP - 517
EP - 521
BT - Intelligent Health Systems - From Technology to Data and Knowledge, Proceedings of MIE 2025
A2 - Andrikopoulou, Elisavet
A2 - Gallos, Parisis
A2 - Arvanitis, Theodoros N.
A2 - Austin, Rosalynn
A2 - Benis, Arriel
A2 - Cornet, Ronald
A2 - Chatzistergos, Panagiotis
A2 - Dejaco, Alexander
A2 - Dusseljee-Peute, Linda
A2 - Mohasseb, Alaa
A2 - Natsiavas, Pantelis
A2 - Nakkas, Haythem
A2 - Scott, Philip
PB - IOS Press BV
T2 - 35th Medical Informatics Europe Conference, MIE 2025
Y2 - 19 May 2025 through 21 May 2025
ER -