TY - JOUR
T1 - The foundational capabilities of large language models in predicting postoperative risks using clinical notes
AU - Alba, Charles
AU - Xue, Bing
AU - Abraham, Joanna
AU - Kannampallil, Thomas
AU - Lu, Chenyang
N1 - Publisher Copyright: © The Author(s) 2025.
PY - 2025/12
Y1 - 2025/12
N2 - Clinical notes recorded during a patient’s perioperative journey hold immense informational value. Advances in large language models (LLMs) offer opportunities to harness this value. Using 84,875 preoperative notes and their associated surgical cases from 2018 to 2021, we examine the performance of LLMs in predicting six postoperative risks using various fine-tuning strategies. Pretrained LLMs outperformed traditional word embeddings by absolute margins of 38.3% in AUROC and 33.2% in AUPRC. Self-supervised fine-tuning further improved AUROC and AUPRC by 3.2% and 1.5%, respectively. Incorporating labels into training further increased AUROC by 1.8% and AUPRC by 2%. The highest performance was achieved with a unified foundation model, with improvements of 3.6% in AUROC and 2.6% in AUPRC compared to self-supervision. These results highlight the foundational capabilities of LLMs in predicting postoperative risks, which could prove beneficial when deployed for perioperative care.
AB - Clinical notes recorded during a patient’s perioperative journey hold immense informational value. Advances in large language models (LLMs) offer opportunities to harness this value. Using 84,875 preoperative notes and their associated surgical cases from 2018 to 2021, we examine the performance of LLMs in predicting six postoperative risks using various fine-tuning strategies. Pretrained LLMs outperformed traditional word embeddings by absolute margins of 38.3% in AUROC and 33.2% in AUPRC. Self-supervised fine-tuning further improved AUROC and AUPRC by 3.2% and 1.5%, respectively. Incorporating labels into training further increased AUROC by 1.8% and AUPRC by 2%. The highest performance was achieved with a unified foundation model, with improvements of 3.6% in AUROC and 2.6% in AUPRC compared to self-supervision. These results highlight the foundational capabilities of LLMs in predicting postoperative risks, which could prove beneficial when deployed for perioperative care.
UR - http://www.scopus.com/inward/record.url?scp=85218339500&partnerID=8YFLogxK
U2 - 10.1038/s41746-025-01489-2
DO - 10.1038/s41746-025-01489-2
M3 - Article
C2 - 39934379
AN - SCOPUS:85218339500
SN - 2398-6352
VL - 8
JO - npj Digital Medicine
JF - npj Digital Medicine
IS - 1
M1 - 95
ER -