TY - JOUR
T1 - Development of novel composite data quality scores to evaluate facility-level data quality in electronic data in Kenya
T2 - a nationwide retrospective cohort study
AU - Odeny, Beryne M.
AU - Njoroge, Anne
AU - Gloyd, Steve
AU - Hughes, James P.
AU - Wagenaar, Bradley H.
AU - Odhiambo, Jacob
AU - Nyagah, Lilly M.
AU - Manya, Ayub
AU - Oghera, Ooga Wesley
AU - Puttkammer, Nancy
N1 - Publisher Copyright: © 2023, BioMed Central Ltd., part of Springer Nature.
PY - 2023/12
Y1 - 2023/12
N2 - Background: In this evaluation, we aim to strengthen Routine Health Information Systems (RHIS) through the digitization of data quality assessment (DQA) processes. We leverage electronic data from the Kenya Health Information System (KHIS), which is based on the District Health Information System version 2 (DHIS2), to perform DQAs at scale. We provide a systematic guide to developing composite data quality scores and use these scores to assess data quality in Kenya. Methods: We evaluated 187 HIV care facilities with electronic medical records across Kenya. Using quarterly, longitudinal KHIS data from January 2011 to June 2018 (total N = 30 quarters), we extracted indicators encompassing general HIV services including services to prevent mother-to-child transmission (PMTCT). We assessed the accuracy (the extent to which data were correct and free of error) of these data using three data-driven composite scores: 1) completeness score; 2) consistency score; and 3) discrepancy score. Completeness refers to the presence of the appropriate amount of data. Consistency refers to uniformity of data across multiple indicators. Discrepancy (measured on a Z-scale) refers to the degree of alignment (or lack thereof) of data with rules that defined the possible valid values for the data. Results: A total of 5,610 unique facility-quarters were extracted from KHIS. The mean completeness score was 61.1% [standard deviation (SD) = 27%]. The mean consistency score was 80% (SD = 16.4%). The mean discrepancy score was 0.07 (SD = 0.22). A strong and positive correlation was identified between the consistency score and discrepancy score (correlation coefficient = 0.77), whereas the correlation of either score with the completeness score was low, with a correlation coefficient of -0.12 (with consistency score) and -0.36 (with discrepancy score). General HIV indicators were more complete, but less consistent and less plausible, than PMTCT indicators.
Conclusion: We observed a lack of correlation between the completeness score and the other two scores. As such, for a holistic DQA, completeness assessment should be paired with the measurement of either consistency or discrepancy to reflect distinct dimensions of data quality. Given the complexity of the discrepancy score, we recommend the simpler consistency score, since they were highly correlated. Routine use of composite scores on KHIS data could enhance efficiencies in DQA at scale as digitization of health information expands and could be applied to other health sectors beyond HIV clinics.
KW - DHIS2
KW - Data quality assessment
KW - EMRs
KW - HIV
UR - http://www.scopus.com/inward/record.url?scp=85174803248&partnerID=8YFLogxK
U2 - 10.1186/s12913-023-10133-2
DO - 10.1186/s12913-023-10133-2
M3 - Article
C2 - 37872540
AN - SCOPUS:85174803248
SN - 1472-6963
VL - 23
JO - BMC Health Services Research
JF - BMC Health Services Research
IS - 1
M1 - 1139
ER -