TY - JOUR
T1 - ComBat harmonization for radiomic features in independent phantom and lung cancer patient computed tomography datasets
AU - Mahon, R. N.
AU - Ghita, M.
AU - Hugo, G. D.
AU - Weiss, E.
N1 - Publisher Copyright: © 2020 Institute of Physics and Engineering in Medicine.
PY - 2020/1/13
Y1 - 2020/1/13
N2 - This work seeks to evaluate the combatting batch effect (ComBat) harmonization algorithm's ability to reduce the variation in radiomic features arising from different imaging protocols and independently verify published results. The Gammex computed tomography (CT) electron density phantom and Quasar body phantom were imaged using 32 different chest imaging protocols. 107 radiomic features were extracted from 15 spatially varying spherical contours between 1.5 cm and 3 cm in each of the lung300 density, lung450 density, and wood inserts. The Kolmogorov-Smirnov test was used to determine significant differences in the distribution of the features and the concordance correlation coefficient (CCC) was used to measure the repeatability of the features from each protocol variation class (kVp, pitch, etc.) before and after ComBat harmonization. P-values were corrected for multiple comparisons using the Benjamini-Hochberg-Yekutieli procedure. Finally, the ComBat algorithm was applied to human subject data using six different thorax imaging protocols with 135 patients. Spherical contours of un-irradiated lung (2 cm) and vertebral bone (1 cm) were used for radiomic feature extraction. ComBat harmonization reduced the percentage of features from significantly different distributions to 0%-2% or preserved 0% across all protocol variations for the lung300, lung450 and wood inserts. For the human subject data, ComBat harmonization reduced the percentage of significantly different features from 0%-59% for bone and 0%-19% for lung to 0% for both. This work verifies previously published results and demonstrates that ComBat harmonization is an effective means to harmonize radiomic features extracted from different imaging protocols to allow comparisons in large multi-institution datasets. Biological variation can be explicitly preserved by providing the ComBat algorithm with clinical or biological variables to protect. ComBat harmonization should be tested for its effect on predictive models.
AB - This work seeks to evaluate the combatting batch effect (ComBat) harmonization algorithm's ability to reduce the variation in radiomic features arising from different imaging protocols and independently verify published results. The Gammex computed tomography (CT) electron density phantom and Quasar body phantom were imaged using 32 different chest imaging protocols. 107 radiomic features were extracted from 15 spatially varying spherical contours between 1.5 cm and 3 cm in each of the lung300 density, lung450 density, and wood inserts. The Kolmogorov-Smirnov test was used to determine significant differences in the distribution of the features and the concordance correlation coefficient (CCC) was used to measure the repeatability of the features from each protocol variation class (kVp, pitch, etc.) before and after ComBat harmonization. P-values were corrected for multiple comparisons using the Benjamini-Hochberg-Yekutieli procedure. Finally, the ComBat algorithm was applied to human subject data using six different thorax imaging protocols with 135 patients. Spherical contours of un-irradiated lung (2 cm) and vertebral bone (1 cm) were used for radiomic feature extraction. ComBat harmonization reduced the percentage of features from significantly different distributions to 0%-2% or preserved 0% across all protocol variations for the lung300, lung450 and wood inserts. For the human subject data, ComBat harmonization reduced the percentage of significantly different features from 0%-59% for bone and 0%-19% for lung to 0% for both. This work verifies previously published results and demonstrates that ComBat harmonization is an effective means to harmonize radiomic features extracted from different imaging protocols to allow comparisons in large multi-institution datasets. Biological variation can be explicitly preserved by providing the ComBat algorithm with clinical or biological variables to protect. ComBat harmonization should be tested for its effect on predictive models.
UR - http://www.scopus.com/inward/record.url?scp=85077786630&partnerID=8YFLogxK
U2 - 10.1088/1361-6560/ab6177
DO - 10.1088/1361-6560/ab6177
M3 - Article
C2 - 31835261
AN - SCOPUS:85077786630
SN - 0031-9155
VL - 65
JO - Physics in medicine and biology
JF - Physics in medicine and biology
IS - 1
M1 - 015010
ER -