Our goal was to estimate, and compare across readers, the reproducibility of 18F-FDG PET standardized uptake value (SUV) and CT size measurements, and of changes in those measurements, in malignant tumors before and after therapy. Methods: Fifty-two tumors in 25 patients were evaluated on 18F-FDG PET/CT scans. Maximum SUVs (SUVbw max) and CT size measurements were determined for each tumor independently on pre- and posttreatment scans by 8 different readers (4 PET, 4 CT) using routine nonautomated clinical methods. Percentage changes in SUVbw max and CT size between pre- and posttreatment scans were calculated. Interobserver reproducibility of SUVbw max, CT size, and changes in these values was described by intraclass correlation coefficients (ICCs) and estimates of variance. Results: The ICC was higher for the pretreatment, posttreatment, and percentage change in SUVbw max than the ICC for the longest CT size and the 2-dimensional CT size (before treatment, 0.93, 0.72, and 0.61, respectively; after treatment, 0.91, 0.85, and 0.45, respectively; and percentage change, 0.94, 0.70, and 0.33, respectively). The variability of SUVbw max was significantly lower than the variability of the longest CT size and the 2-dimensional CT size (mean ± SD before treatment, 6.3% ± 14.2%, 16.2% ± 17.8%, and 27.5% ± 26.7%, respectively, P ≤ 0.001; and after treatment, 18.4% ± 26.8%, 35.1% ± 47.5%, and 50.9% ± 51.4%, respectively, P ≤ 0.02). The variability of percentage change in SUVbw max (16.7% ± 36.2%) was significantly lower than that for percentage change in the longest CT size (156.3% ± 157.3%, P ≤ 0.0001) and the 2-dimensional CT size (178.4% ± 546.5%, P < 0.0001). Conclusion: The interobserver reproducibility of SUVbw max for both untreated and treated tumors, and of percentage change in SUVbw max, is substantially higher than that of CT size measurements and of percentage change in CT size.
Measurements of tumor metabolism by PET should be included in trials to assess response to therapy. Although PET reproducibility was high, the variability observed in analyses of identical image sets by 4 readers indicates that automated analytic tools to assess response might be helpful to further enhance reproducibility.
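The two quantities named in the Methods can be illustrated with a minimal sketch. This is not the authors' analysis code. The abstract does not state which form of the ICC was used, so the sketch assumes ICC(2,1) (two-way random effects, absolute agreement, single reader). The function names and the tumor-by-reader layout of the input are hypothetical.

```python
def percent_change(pre, post):
    """Percentage change from the pretreatment to the posttreatment value."""
    return 100.0 * (post - pre) / pre


def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single reader.

    ASSUMPTION: the abstract does not specify the ICC form; ICC(2,1) is
    used here for illustration only.
    `ratings` is a list of rows, one row per tumor, one value per reader.
    """
    n = len(ratings)       # number of tumors
    k = len(ratings[0])    # number of readers
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Two-way ANOVA sums of squares (one observation per cell)
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)   # between tumors
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)   # between readers
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_err = ss_total - ss_rows - ss_cols                    # residual

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_err = ss_err / ((n - 1) * (k - 1))

    return (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    )
```

In such a layout, each reader's pre- and posttreatment values for a tumor would first be passed through `percent_change`, and the resulting tumor-by-reader table passed to `icc_2_1` to summarize agreement on the change itself.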
- 18F-FDG PET