Diagnosing and grading gastric atrophy and intestinal metaplasia using semi-supervised deep learning on pathological images: development and validation study

Shuangshuang Fang, Zhenyu Liu, Qi Qiu, Zhenchao Tang, Yang Yang, Zhongsheng Kuang, Xiaohua Du, Shanshan Xiao, Yanyan Liu, Yuanbin Luo, Liping Gu, Li Tian, Xiaoxia Liang, Guiling Fan, Yu Zhang, Ping Zhang, Weixun Zhou, Xiuli Liu, Jie Tian, Wei Wei

Research output: Contribution to journal › Article › peer-review



Objective: Patients with gastric atrophy and intestinal metaplasia (IM) are at risk for gastric cancer, necessitating an accurate risk assessment. We aimed to establish and validate a diagnostic approach for gastric biopsy specimens that combines deep learning with the operative link for gastritis assessment (OLGA) and the operative link for gastric intestinal metaplasia assessment (OLGIM) for individual gastric cancer risk classification.
Methods: We prospectively enrolled 545 patients suspected of atrophic gastritis during endoscopy from 13 tertiary hospitals between December 22, 2017, and September 25, 2020, yielding a total of 2725 whole-slide images (WSIs). Patients were randomly divided into a training set (n = 349), an internal validation set (n = 87), and an external validation set (n = 109). Sixty patients from the external validation set were randomly selected and divided into two groups for an observer study, one assessed with the assistance of algorithm results and the other without. We proposed a semi-supervised deep learning algorithm to diagnose and grade IM and atrophy, and we compared it with the assessments of 10 pathologists. The model’s performance was evaluated based on the area under the curve (AUC), sensitivity, specificity, and weighted kappa value.
Results: The algorithm, named GasMIL, demonstrated encouraging performance in diagnosing IM (AUC 0.884, 95% CI 0.862–0.902) and atrophy (AUC 0.877, 95% CI 0.855–0.897) in the external test set. In the observer study, GasMIL achieved 80% sensitivity, 85% specificity, a weighted kappa value of 0.61, and an AUC of 0.953, surpassing the performance of all 10 pathologists in diagnosing atrophy. Ranked alongside the 10 pathologists, GasMIL’s AUC placed second in OLGA (0.729, 95% CI 0.625–0.833) and fifth in OLGIM (0.792, 95% CI 0.688–0.896). With the assistance of GasMIL, pathologists demonstrated improved AUC (p = 0.013), sensitivity (p = 0.014), and weighted kappa (p = 0.016) in diagnosing IM, and improved specificity (p = 0.007) in diagnosing atrophy compared to pathologists working alone.
Conclusion: GasMIL shows the best overall performance in diagnosing IM and atrophy when compared to pathologists, significantly enhancing their diagnostic capabilities.
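
For readers less familiar with the evaluation metrics named in the Methods (AUC, sensitivity, specificity, and weighted kappa), the following is a minimal illustrative sketch in plain Python. It is not the study's code. The function names and the toy labels are hypothetical, and the choice of linear weights for kappa is an assumption, since the abstract does not state which weighting scheme was used.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """Binary labels: 1 = lesion present, 0 = lesion absent."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive case scores higher
    than a random negative case (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def weighted_kappa(a: Sequence[int], b: Sequence[int], n_classes: int) -> float:
    """Cohen's kappa with linear weights (an assumption) for ordinal
    grades coded 0 .. n_classes - 1, e.g. two raters' stage assignments."""
    n = len(a)
    observed = [[0.0] * n_classes for _ in range(n_classes)]
    for i, j in zip(a, b):
        observed[i][j] += 1.0 / n
    row = [sum(observed[i]) for i in range(n_classes)]
    col = [sum(observed[i][j] for i in range(n_classes)) for j in range(n_classes)]

    def weight(i: int, j: int) -> float:
        # Disagreement penalty grows with the distance between grades.
        return abs(i - j) / (n_classes - 1)

    cells = [(i, j) for i in range(n_classes) for j in range(n_classes)]
    disagree_obs = sum(weight(i, j) * observed[i][j] for i, j in cells)
    disagree_exp = sum(weight(i, j) * row[i] * col[j] for i, j in cells)
    return 1.0 - disagree_obs / disagree_exp


# Toy example with made-up labels (not data from the study).
truth = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
calls = [1, 1, 1, 1, 0, 0, 0, 0, 0, 1]
sens, spec = sensitivity_specificity(truth, calls)
print(f"sensitivity={sens:.2f} specificity={spec:.2f}")
```

Sensitivity and specificity depend on a fixed decision threshold, AUC summarizes ranking quality across all thresholds, and weighted kappa credits near-miss agreement on ordinal grades, which is why the abstract reports them side by side.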

Original language: English
Pages (from-to): 343-354
Number of pages: 12
Journal: Gastric Cancer
Issue number: 2
State: Published - Mar 2024


  • Atrophic gastritis
  • Diagnose
  • Semi-supervised deep learning
  • The operative link for gastric intestinal metaplasia assessment
  • The operative link for gastritis assessment

