TY - JOUR
T1 - Observer-study-based approaches to quantitatively evaluate the realism of synthetic medical images
AU - Liu, Ziping
AU - Wolfe, Scott
AU - Yu, Zitong
AU - Laforest, Richard
AU - Mhlanga, Joyce C.
AU - Fraum, Tyler J.
AU - Itani, Malak
AU - Dehdashti, Farrokh
AU - Siegel, Barry A.
AU - Jha, Abhinav K.
N1 - Funding Information:
Financial support for this work was provided by the National Institute of Biomedical Imaging and Bioengineering under grants R01-EB031051, R56-EB028287, and R01-EB031962. We also thank Qiye Tan for help with the initial development of the web application for conducting the observer study.
Publisher Copyright:
© 2023 The Author(s). Published on behalf of Institute of Physics and Engineering in Medicine by IOP Publishing Ltd.
PY - 2023/4/7
Y1 - 2023/4/7
N2 - Objective. Synthetic images generated by simulation studies have a well-recognized role in developing and evaluating imaging systems and methods. However, for clinically relevant development and evaluation, the synthetic images must be clinically realistic and, ideally, have the same distribution as that of clinical images. Thus, mechanisms that can quantitatively evaluate this clinical realism and, ideally, the similarity in distributions of the real and synthetic images, are much needed. Approach. We investigated two observer-study-based approaches to quantitatively evaluate the clinical realism of synthetic images. In the first approach, we presented a theoretical formalism for the use of an ideal-observer study to quantitatively evaluate the similarity in distributions between the real and synthetic images. This theoretical formalism provides a direct relationship between the area under the receiver operating characteristic curve, AUC, for an ideal observer and the distributions of real and synthetic images. The second approach is based on the use of expert-human-observer studies to quantitatively evaluate the realism of synthetic images. In this approach, we developed web-based software to conduct two-alternative forced-choice (2-AFC) experiments with expert human observers. The usability of this software was evaluated by conducting a system usability scale (SUS) survey with seven expert human readers and five observer-study designers. Further, we demonstrated the application of this software to evaluate a stochastic and physics-based image-synthesis technique for oncologic positron emission tomography (PET). In this evaluation, the 2-AFC study with our software was performed by six expert human readers, who were highly experienced in reading PET scans, with experience ranging from 7 to 40 years (median: 12 years, average: 20.4 years). Main results. In the ideal-observer-study-based approach, we theoretically demonstrated that the AUC for an ideal observer can be expressed, to an excellent approximation, by the Bhattacharyya distance between the distributions of the real and synthetic images. This relationship shows that a decrease in the ideal-observer AUC indicates a decrease in the distance between the two image distributions. Moreover, an ideal-observer AUC at its lower bound of 0.5 implies that the distributions of synthetic and real images exactly match. For the expert-human-observer-study-based approach, our software for performing the 2-AFC experiments is available at https://apps.mir.wustl.edu/twoafc. Results from the SUS survey demonstrate that the web application is very user-friendly and accessible. As a secondary finding, evaluation of a stochastic and physics-based PET image-synthesis technique using our software showed that expert human readers had limited ability to distinguish the real images from the synthetic images. Significance. This work addresses the important need for mechanisms to quantitatively evaluate the clinical realism of synthetic images. The mathematical treatment in this paper shows that quantifying the similarity in the distribution of real and synthetic images is theoretically possible by using an ideal-observer-study-based approach. Our software provides a platform for designing and performing 2-AFC experiments with human observers in a highly accessible, efficient, and secure manner. Additionally, our results on the evaluation of the stochastic and physics-based image-synthesis technique motivate the application of this technique to develop and evaluate a wide array of PET imaging methods.
KW - image quality assessment
KW - image synthesis
KW - medical imaging
KW - observer study
UR - http://www.scopus.com/inward/record.url?scp=85150752065&partnerID=8YFLogxK
U2 - 10.1088/1361-6560/acc0ce
DO - 10.1088/1361-6560/acc0ce
M3 - Article
C2 - 36863028
AN - SCOPUS:85150752065
SN - 0031-9155
VL - 68
JO - Physics in Medicine and Biology
JF - Physics in Medicine and Biology
IS - 7
M1 - 074001
ER -