Abstract

Background: Preclinical low-count positron emission tomography (LC-PET) imaging offers numerous advantages such as facilitating imaging logistics, enabling longitudinal studies of long- and short-lived isotopes as well as increasing scanner throughput. However, LC-PET is characterized by reduced photon-count levels resulting in low signal-to-noise ratio (SNR), segmentation difficulties, and quantification uncertainties. Purpose: We developed and evaluated a novel deep-learning (DL) architecture—Attention based Residual-Dilated Net (ARD-Net)—to generate standard-count PET (SC-PET) images from LC-PET images. The performance of the ARD-Net framework was evaluated for numerous low count realizations using fidelity-based qualitative metrics, task-based segmentation, and quantitative metrics. Method: Patient Derived tumor Xenograft (PDX) with tumors implanted in the mammary fat-pad were subjected to preclinical [18F]-Fluorodeoxyglucose (FDG)-PET/CT imaging. SC-PET images were derived from a 10 min static FDG-PET acquisition, 50 min post administration of FDG, and were resampled to generate four distinct LC-PET realizations corresponding to 10%, 5%, 1.6%, and 0.8% of SC-PET count-level. ARD-Net was trained and optimized using 48 preclinical FDG-PET datasets, while 16 datasets were utilized to assess performance. Further, the performance of ARD-Net was benchmarked against two leading DL-based methods (Residual UNet, RU-Net; and Dilated Network, D-Net) and non-DL methods (Non-Local Means, NLM; and Block Matching 3D Filtering, BM3D). The performance of the framework was evaluated using traditional fidelity-based image quality metrics such as Structural Similarity Index Metric (SSIM) and Normalized Root Mean Square Error (NRMSE), as well as human observer-based tumor segmentation performance (Dice Score and volume bias) and quantitative analysis of Standardized Uptake Value (SUV) measurements. Additionally, radiomics-derived features were utilized as a measure of quality assurance (QA) in comparison to true SC-PET. Finally, a performance ensemble score (EPS) was developed by integrating fidelity-based and task-based metrics. Concordance Correlation Coefficient (CCC) was utilized to determine concordance between measures. The non-parametric Friedman Test with Bonferroni correction was used to compare the performance of ARD-Net against benchmarked methods with significance at adjusted p-value ≤0.01. Results: ARD-Net-generated SC-PET images exhibited significantly better (p ≤ 0.01 post Bonferroni correction) overall image fidelity scores in terms of SSIM and NRMSE at majority of photon-count levels compared to benchmarked DL and non-DL methods. In terms of task-based quantitative accuracy evaluated by SUVMean and SUVPeak, ARD-Net exhibited less than 5% median absolute bias for SUVMean compared to true SC-PET and lower degree of variability compared to benchmarked DL and non-DL based methods in generating SC-PET. Additionally, ARD-Net-generated SC-PET images displayed higher degree of concordance to SC-PET images in terms of radiomics features compared to non-DL and other DL approaches. Finally, the ensemble score suggested that ARD-Net exhibited significantly superior performance compared to benchmarked algorithms (p ≤ 0.01 post Bonferroni correction). Conclusion: ARD-Net provides a robust framework to generate SC-PET from LC-PET images. ARD-Net generated SC-PET images exhibited superior performance compared other DL and non-DL approaches in terms of image-fidelity based metrics, task-based segmentation metrics, and minimal bias in terms of task-based quantification performance for preclinical PET imaging.

Original languageEnglish
Pages (from-to)4324-4339
Number of pages16
JournalMedical physics
Volume51
Issue number6
DOIs
StatePublished - Jun 2024

Keywords

  • deep learning
  • image denoising
  • low-count PET
  • preclinical PET imaging
  • quantitative imaging
  • radiomics
  • task-based performance

Fingerprint

Dive into the research topics of 'Deep learning generation of preclinical positron emission tomography (PET) images from low-count PET with task-based performance assessment'. Together they form a unique fingerprint.

Cite this