TY - JOUR
T1 - Complexity of ballooned hepatocyte feature recognition
T2 - Defining a training atlas for artificial intelligence-based imaging in NAFLD
AU - Brunt, Elizabeth M.
AU - Clouston, Andrew D.
AU - Goodman, Zachary
AU - Guy, Cynthia
AU - Kleiner, David E.
AU - Lackner, Carolin
AU - Tiniakos, Dina G.
AU - Wee, Aileen
AU - Yeh, Matthew
AU - Leow, Wei Qiang
AU - Chng, Elaine
AU - Ren, Yayun
AU - Goh, George Boon Bee
AU - Powell, Elizabeth E.
AU - Rinella, Mary
AU - Sanyal, Arun J.
AU - Neuschwander-Tetri, Brent
AU - Younossi, Zobair
AU - Charlton, Michael
AU - Ratziu, Vlad
AU - Harrison, Stephen A.
AU - Tai, Dean
AU - Anstee, Quentin M.
N1 - Funding Information:
This study has been supported by Histoindex Pte Ltd (DT, EC, YR); the Newcastle NIHR Biomedical Research Centre (QMA); the Intramural Research Program of the NIH, National Cancer Institute (DEK); and the LITMUS (Liver Investigation: Testing Marker Utility in Steatohepatitis) consortium funded by the Innovative Medicines Initiative (IMI2) Program of the European Union under Grant Agreement 777377; this Joint Undertaking receives support from the European Union’s Horizon 2020 research and innovation programme and EFPIA (QMA, CL, DGT, VR, SH, DT).
Publisher Copyright:
© 2022
PY - 2022/5
Y1 - 2022/5
N2 - Background & Aims: Histologically assessed hepatocyte ballooning is a key feature discriminating non-alcoholic steatohepatitis (NASH) from steatosis (NAFL). Reliable identification underpins patient inclusion in clinical trials and serves as a key regulatory-approved surrogate endpoint for drug efficacy. High inter/intra-observer variation in ballooning measured using the NASH CRN semi-quantitative score has been reported, yet no actionable solutions have been proposed. Methods: A focused evaluation of hepatocyte ballooning recognition was conducted. Digitized slides were evaluated by 9 internationally recognized expert liver pathologists on 2 separate occasions: each pathologist independently marked every ballooned hepatocyte and later provided an overall non-NASH NAFL/NASH assessment. Interobserver variation was assessed and a ‘concordance atlas’ of ballooned hepatocytes generated to train second harmonic generation/two-photon excitation fluorescence imaging-based artificial intelligence (AI). Results: The Fleiss kappa statistic for overall interobserver agreement for presence/absence of ballooning was 0.197 (95% CI 0.094–0.300), rising to 0.362 (0.258–0.465) with a ≥5-cell threshold. However, the intraclass correlation coefficient for consistency was higher (0.718 [0.511–0.900]), indicating ‘moderate’ agreement on ballooning burden. 133 ballooned cells were identified using a ≥5/9 majority to train AI ballooning detection (AI-pathologist pairwise concordance 19–42%, comparable to inter-pathologist pairwise concordance of 8–75%). AI quantified change in ballooned cell burden in response to therapy in a separate slide set. Conclusions: The substantial divergence in hepatocyte ballooning identified amongst expert hepatopathologists suggests that ballooning is a spectrum, too subjective for its presence or complete absence to be unequivocally determined as a trial endpoint. A concordance atlas may be used to train AI assistive technologies that reproducibly quantify ballooned hepatocytes and standardize assessment of therapeutic efficacy. This atlas serves as a reference standard for ongoing work to refine how ballooning is classified by both pathologists and AI. Lay summary: For the first time, we show that, even amongst expert hepatopathologists, there is poor agreement regarding the number of ballooned hepatocytes seen on the same digitized histology images. This has important implications as the presence of ballooning is needed to establish the diagnosis of non-alcoholic steatohepatitis (NASH), and its unequivocal absence is one of the key requirements to show ‘NASH resolution’ to support drug efficacy in clinical trials. Artificial intelligence-based approaches may provide a more reliable way to assess the range of injury recorded as “hepatocyte ballooning”.
KW - Artificial intelligence
KW - Ballooning
KW - Histology
KW - Machine learning
KW - NAFLD
KW - NASH
KW - Nonalcoholic fatty liver disease
KW - Nonalcoholic steatohepatitis
UR - http://www.scopus.com/inward/record.url?scp=85125436557&partnerID=8YFLogxK
U2 - 10.1016/j.jhep.2022.01.011
DO - 10.1016/j.jhep.2022.01.011
M3 - Article
C2 - 35090960
AN - SCOPUS:85125436557
SN - 0168-8278
VL - 76
SP - 1030
EP - 1041
JO - Journal of Hepatology
JF - Journal of Hepatology
IS - 5
ER -