Investigating the Performance of Retrieval-Augmented Generation and Domain-Specific Fine-Tuning for the Development of AI-Driven Knowledge-Based Systems
| Published in: | Machine Learning and Knowledge Extraction vol. 7, no. 1 (2025), p. 15 |
|---|---|
| Main Author: | Lakatos, Róbert |
| Other Authors: | Pollner, Péter; Hajdu, András; Joó, Tamás |
| Published: | MDPI AG |
| Subjects: | |
| Online Access: | Citation/Abstract; Full Text + Graphics; Full Text - PDF |
MARC
| LEADER | 00000nab a2200000uu 4500 | ||
|---|---|---|---|
| 001 | 3181641098 | ||
| 003 | UK-CbPIL | ||
| 022 | |a 2504-4990 | ||
| 024 | 7 | |a 10.3390/make7010015 |2 doi | |
| 035 | |a 3181641098 | ||
| 045 | 2 | |b d20250101 |b d20250331 | |
| 100 | 1 | |a Lakatos, Róbert |u Department of Data Science and Visualization, Faculty of Informatics, University of Debrecen, 4032 Debrecen, Hungary; <email>hajdu.andras@inf.unideb.hu</email>; Doctoral School of Informatics, University of Debrecen, 4032 Debrecen, Hungary; Neumann Technology Platform, Neumann Nonprofit Ltd., 1074 Budapest, Hungary | |
| 245 | 1 | |a Investigating the Performance of Retrieval-Augmented Generation and Domain-Specific Fine-Tuning for the Development of AI-Driven Knowledge-Based Systems | |
| 260 | |b MDPI AG |c 2025 | ||
| 513 | |a Journal Article | ||
| 520 | 3 | |a Generative large language models (LLMs) have revolutionized the development of knowledge-based systems, enabling new possibilities in applications such as ChatGPT, Bing, and Gemini. Two key strategies for domain adaptation in these systems are Domain-Specific Fine-Tuning (DFT) and Retrieval-Augmented Generation (RAG). In this study, we evaluate the performance of RAG and DFT on several LLM architectures, including GPT-J-6B, OPT-6.7B, LLaMA, and LLaMA-2. We use the ROUGE, BLEU, and METEOR scores to evaluate the performance of the models. We also measure the performance of the models with a cosine similarity-based Coverage Score (CS) of our own design. Our results, based on experiments across multiple datasets, show that RAG-based systems consistently outperform those fine-tuned with DFT. Specifically, RAG models outperform DFT by an average of 17% in ROUGE, 13% in BLEU, and 36% in CS. At the same time, DFT achieves only a modest advantage in METEOR, suggesting slightly better creative capabilities. We also highlight the challenges of integrating RAG with DFT, as such integration can lead to performance degradation. Furthermore, we propose a simplified RAG-based architecture that maximizes efficiency and reduces hallucination, underscoring the advantages of RAG in building reliable, domain-adapted knowledge systems. | |
| 651 | 4 | |a United States--US | |
| 653 | |a Language | ||
| 653 | |a Work stations | ||
| 653 | |a Accuracy | ||
| 653 | |a Datasets | ||
| 653 | |a Performance evaluation | ||
| 653 | |a Large language models | ||
| 653 | |a Knowledge | ||
| 653 | |a Benchmarks | ||
| 653 | |a Retrieval | ||
| 653 | |a Machine translation | ||
| 653 | |a Performance degradation | ||
| 653 | |a Coronaviruses | ||
| 653 | |a Chatbots | ||
| 653 | |a Meteors | ||
| 700 | 1 | |a Pollner, Péter |u Data-Driven Health Division of National Laboratory for Health Security, Health Services Management Training Centre, Semmelweis University, 1085 Budapest, Hungary | |
| 700 | 1 | |a Hajdu, András |u Department of Data Science and Visualization, Faculty of Informatics, University of Debrecen, 4032 Debrecen, Hungary; <email>hajdu.andras@inf.unideb.hu</email> | |
| 700 | 1 | |a Joó, Tamás |u Neumann Technology Platform, Neumann Nonprofit Ltd., 1074 Budapest, Hungary; Data-Driven Health Division of National Laboratory for Health Security, Health Services Management Training Centre, Semmelweis University, 1085 Budapest, Hungary | |
| 773 | 0 | |t Machine Learning and Knowledge Extraction |g vol. 7, no. 1 (2025), p. 15 | |
| 786 | 0 | |d ProQuest |t Advanced Technologies & Aerospace Database | |
| 856 | 4 | 1 | |3 Citation/Abstract |u https://www.proquest.com/docview/3181641098/abstract/embedded/H09TXR3UUZB2ISDL?source=fedsrch |
| 856 | 4 | 0 | |3 Full Text + Graphics |u https://www.proquest.com/docview/3181641098/fulltextwithgraphics/embedded/H09TXR3UUZB2ISDL?source=fedsrch |
| 856 | 4 | 0 | |3 Full Text - PDF |u https://www.proquest.com/docview/3181641098/fulltextPDF/embedded/H09TXR3UUZB2ISDL?source=fedsrch |