Predicting 30-Day Postoperative Mortality and American Society of Anesthesiologists Physical Status Using Retrieval-Augmented Large Language Models: Development and Validation Study

Bibliographic Details
Published in: Journal of Medical Internet Research vol. 27 (2025), p. e75052
Main Author: Ying-Hao, Chen
Other Authors: Shanq-Jang Ruan, Pei-fu, Chen
Published:
Gunther Eysenbach MD MPH, Associate Professor
Subjects:
Online Access: Citation/Abstract
Full Text + Graphics
Full Text - PDF

MARC

LEADER 00000nab a2200000uu 4500
001 3222369277
003 UK-CbPIL
022 |a 1438-8871 
024 7 |a 10.2196/75052  |2 doi 
035 |a 3222369277 
045 2 |b d20250101  |b d20251231 
100 1 |a Ying-Hao, Chen 
245 1 |a Predicting 30-Day Postoperative Mortality and American Society of Anesthesiologists Physical Status Using Retrieval-Augmented Large Language Models: Development and Validation Study 
260 |b Gunther Eysenbach MD MPH, Associate Professor  |c 2025 
513 |a Journal Article 
520 3 |a Background: Accurately assessing perioperative risk is critical for informed surgical planning and patient safety. However, current prediction models often rely on structured data and overlook the nuanced clinical reasoning embedded in free-text preoperative notes. Recent advances in large language models (LLMs) have opened opportunities for harnessing unstructured clinical data, yet their application in perioperative prediction remains limited by concerns about factual accuracy. Retrieval-augmented generation (RAG) offers a promising solution, enhancing LLM performance by grounding outputs in domain-specific knowledge sources, potentially improving both predictive accuracy and clinical interpretability. Objective: This study aimed to investigate whether integrating LLMs with RAG can improve the prediction of 30-day postoperative mortality and American Society of Anesthesiologists (ASA) physical status classification using unstructured preoperative clinical notes. Methods: We conducted a retrospective cohort study using 24,491 medical records from a tertiary medical center, including preoperative anesthesia assessments, discharge summaries, and surgical information. To extract clinical insights from free-text data, we used the LLaMA 3.1-8B language model with RAG, using MedEmbed for text embedding and Miller’s Anesthesia as the primary retrieval source. We evaluated model performance under various configurations, including embedding models, chunk sizes, and few-shot prompting. Machine learning (ML) models, including random forest, support vector machines (SVM), Extreme Gradient Boosting (XGBoost), and logistic regression, were trained on structured features as baselines. Results: A total of 520 (2.1%) patients experienced in-hospital 30-day postoperative mortality. The ASA physical status distribution was as follows: class I: 535 (2.2%); class II: 15,272 (62.4%); class III: 8024 (32.8%); class IV: 606 (2.5%); and class V: 54 (0.22%). 
For 30-day postoperative mortality prediction, the LLaMA-RAG model achieved an F1-score of 0.4663 (95% CI 0.4654-0.4672), versus 0.2369 (95% CI 0.2341-0.2397) without few-shot prompting, 0.0879 (95% CI 0.0717-0.1041) without RAG, and 0.0436 (95% CI 0.0292-0.0580) without either few-shot prompting or RAG. Among ML models, XGBoost scored 0.4459 (95% CI 0.4176-0.4742); random forest, 0.3953 (95% CI 0.3791-0.4115); logistic regression, 0.2720 (95% CI 0.2647-0.2793); and SVM, 0.2474 (95% CI 0.2275-0.2673). For ASA classification, LLaMA-RAG achieved a micro F1-score of 0.8409 (95% CI 0.8238-0.8551) versus 0.6546 (95% CI 0.6430-0.6796) without few-shot prompting, 0.6340 (95% CI 0.6157-0.6535) without RAG, and 0.4238 (95% CI 0.3952-0.4490) without either few-shot prompting or RAG. In comparison, XGBoost achieved 0.8273 (95% CI 0.8209-0.8498); logistic regression, 0.7940 (95% CI 0.7671-0.7950); random forest, 0.7847 (95% CI 0.7637-0.7868); and SVM, 0.7697 (95% CI 0.7637-0.7697). Notably, the model demonstrated exceptional sensitivity in identifying rare but high-risk cases, such as ASA Class 5 patients and postoperative deaths. Conclusions: The LLaMA-RAG model significantly improved the prediction of postoperative mortality and ASA classification, especially for rare high-risk cases. By grounding outputs in domain knowledge, retrieval-augmented generation enhanced both accuracy and prompt-driven interpretability over ML and ablation models, highlighting its promise for real-world clinical decision support. 
610 4 |a American Society of Anesthesiologists 
653 |a Exercise 
653 |a Diabetes 
653 |a Discharge summaries 
653 |a Mortality 
653 |a Vital signs 
653 |a Medical personnel 
653 |a Risk assessment 
653 |a Retrieval 
653 |a Anesthesia 
653 |a Cohort analysis 
653 |a Potassium 
653 |a Blood & organ donations 
653 |a Performance evaluation 
653 |a Chronic obstructive pulmonary disease 
653 |a Augmentation 
653 |a Oxygen saturation 
653 |a Prediction models 
653 |a Patients 
653 |a Hemoglobin 
653 |a Classification 
653 |a Surgery 
653 |a Clinical decision making 
653 |a Heart attacks 
653 |a Perioperative care 
653 |a Validation studies 
653 |a Blood pressure 
653 |a Hypertension 
653 |a Decision making 
653 |a Body temperature 
653 |a Hospitals 
653 |a High risk 
653 |a Natural language processing 
653 |a Medical records 
653 |a Large language models 
653 |a Decision support systems 
653 |a Accuracy 
653 |a Heart rate 
653 |a Machine learning 
653 |a Risk 
653 |a Data 
653 |a Deaths 
653 |a Medical decision making 
653 |a Language 
653 |a Language modeling 
653 |a Machinery 
700 1 |a Shanq-Jang Ruan 
700 1 |a Pei-fu, Chen 
773 0 |t Journal of Medical Internet Research  |g vol. 27 (2025), p. e75052 
786 0 |d ProQuest  |t Library Science Database 
856 4 1 |3 Citation/Abstract  |u https://www.proquest.com/docview/3222369277/abstract/embedded/7BTGNMKEMPT1V9Z2?source=fedsrch 
856 4 0 |3 Full Text + Graphics  |u https://www.proquest.com/docview/3222369277/fulltextwithgraphics/embedded/7BTGNMKEMPT1V9Z2?source=fedsrch 
856 4 0 |3 Full Text - PDF  |u https://www.proquest.com/docview/3222369277/fulltextPDF/embedded/7BTGNMKEMPT1V9Z2?source=fedsrch