{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,6,11]],"date-time":"2026-06-11T16:11:48Z","timestamp":1781194308354,"version":"3.54.1"},"reference-count":34,"publisher":"MDPI AG","issue":"1","license":[{"start":{"date-parts":[[2025,2,10]],"date-time":"2025-02-10T00:00:00Z","timestamp":1739145600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"Ministry of Culture and Innovation of Hungary from the National Research, Development, and Innovation Fund","award":["C1774095"],"award-info":[{"award-number":["C1774095"]}]},{"name":"Ministry of Culture and Innovation of Hungary from the National Research, Development, and Innovation Fund","award":["RRF-2.3.1-21-2022-00006"],"award-info":[{"award-number":["RRF-2.3.1-21-2022-00006"]}]},{"DOI":"10.13039\/501100011019","name":"National Research, Development and Innovation Office of Hungary","doi-asserted-by":"publisher","award":["C1774095"],"award-info":[{"award-number":["C1774095"]}],"id":[{"id":"10.13039\/501100011019","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100011019","name":"National Research, Development and Innovation Office of Hungary","doi-asserted-by":"publisher","award":["RRF-2.3.1-21-2022-00006"],"award-info":[{"award-number":["RRF-2.3.1-21-2022-00006"]}],"id":[{"id":"10.13039\/501100011019","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["MAKE"],"abstract":"<jats:p>Generative large language models (LLMs) have revolutionized the development of knowledge-based systems, enabling new possibilities in applications like ChatGPT, Bing, and Gemini. Two key strategies for domain adaptation in these systems are Domain-Specific Fine-Tuning (DFT) and Retrieval-Augmented Generation (RAG). In this study, we evaluate the performance of RAG and DFT on several LLM architectures, including GPT-J-6B, OPT-6.7B, LLaMA, and LLaMA-2. We use the ROUGE, BLEU, and METEOR scores to evaluate the performance of the models. We also measure the performance of the models with our own designed cosine similarity-based Coverage Score (CS). Our results, based on experiments across multiple datasets, show that RAG-based systems consistently outperform those fine-tuned with DFT. Specifically, RAG models outperform DFT by an average of 17% in ROUGE, 13% in BLEU, and 36% in CS. At the same time, DFT achieves only a modest advantage in METEOR, suggesting slightly better creative capabilities. We also highlight the challenges of integrating RAG with DFT, as such integration can lead to performance degradation. Furthermore, we propose a simplified RAG-based architecture that maximizes efficiency and reduces hallucination, underscoring the advantages of RAG in building reliable, domain-adapted knowledge systems.<\/jats:p>","DOI":"10.3390\/make7010015","type":"journal-article","created":{"date-parts":[[2025,2,10]],"date-time":"2025-02-10T06:43:07Z","timestamp":1739169787000},"page":"15","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":28,"title":["Investigating the Performance of Retrieval-Augmented Generation and Domain-Specific Fine-Tuning for the Development of AI-Driven Knowledge-Based Systems"],"prefix":"10.3390","volume":"7","author":[{"ORCID":"https:\/\/orcid.org\/0009-0003-9349-3077","authenticated-orcid":false,"given":"R\u00f3bert","family":"Lakatos","sequence":"first","affiliation":[{"name":"Department of Data Science and Visualization, Faculty of Informatics, University of Debrecen, 4032 Debrecen, Hungary"},{"name":"Doctoral School of Informatics, University of Debrecen, 4032 Debrecen, Hungary"},{"name":"Neumann Technology Platform, Neumann Nonprofit Ltd., 1074 Budapest, Hungary"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-0464-4893","authenticated-orcid":false,"given":"P\u00e9ter","family":"Pollner","sequence":"additional","affiliation":[{"name":"Data-Driven Health Division of National Laboratory for Health Security, Health Services Management Training Centre, Semmelweis University, 1085 Budapest, Hungary"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-1718-9770","authenticated-orcid":false,"given":"Andr\u00e1s","family":"Hajdu","sequence":"additional","affiliation":[{"name":"Department of Data Science and Visualization, Faculty of Informatics, University of Debrecen, 4032 Debrecen, Hungary"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-3551-6125","authenticated-orcid":false,"given":"Tam\u00e1s","family":"Jo\u00f3","sequence":"additional","affiliation":[{"name":"Neumann Technology Platform, Neumann Nonprofit Ltd., 1074 Budapest, Hungary"},{"name":"Data-Driven Health Division of National Laboratory for Health Security, Health Services Management Training Centre, Semmelweis University, 1085 Budapest, Hungary"}],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"1968","published-online":{"date-parts":[[2025,2,10]]},"reference":[{"key":"ref_1","unstructured":"Devlin, J., Chang, M.W., Lee, K., and Toutanova, K. (2018). Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv."},{"key":"ref_2","unstructured":"Radford, A., Narasimhan, K., Salimans, T., and Sutskever, I. (2024, March 12). Improving Language Understanding by Generative Pre-Training. Available online: https:\/\/hayate-lab.com\/wp-content\/uploads\/2023\/05\/43372bfa750340059ad87ac8e538c53b.pdf."},{"key":"ref_3","doi-asserted-by":"crossref","unstructured":"Dai, Z., Yang, Z., Yang, Y., Carbonell, J., Le, Q.V., and Salakhutdinov, R. (2019). Transformer-xl: Attentive language models beyond a fixed-length context. arXiv.","DOI":"10.18653\/v1\/P19-1285"},{"key":"ref_4","first-page":"9","article-title":"Language models are unsupervised multitask learners","volume":"1","author":"Radford","year":"2019","journal-title":"OpenAI blog"},{"key":"ref_5","unstructured":"OpenAI (2024, March 12). ChatGPT. Available online: https:\/\/openai.com\/index\/chatgpt\/."},{"key":"ref_6","unstructured":"Microsoft (2024, March 12). Microsoft Copilot. Available online: https:\/\/copilot.microsoft.com."},{"key":"ref_7","unstructured":"Chowdhery, A., Narang, S., Devlin, J., Bosma, M., Mishra, G., Roberts, A., Barham, P., Chung, H.W., Sutton, C., and Gehrmann, S. (2022). PaLM: Scaling Language Modeling with Pathways. arXiv."},{"key":"ref_8","unstructured":"AI, G. (2024, March 12). Gemini. Available online: https:\/\/gemini.google.com\/app."},{"key":"ref_9","unstructured":"Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.A., Lacroix, T., Rozi\u00e8re, B., Goyal, N., Hambro, E., and Azhar, F. (2023). LLaMA: Open and Efficient Foundation Language Models. arXiv."},{"key":"ref_10","unstructured":"Hendrycks, D., Burns, C., Basart, S., Zou, A., Mazeika, M., Song, D., and Steinhardt, J. (2020). Measuring massive multitask language understanding. arXiv."},{"key":"ref_11","first-page":"1","article-title":"Mathematical discoveries from program search with large language models","volume":"625","author":"Barekatain","year":"2023","journal-title":"Nature"},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Petroni, F., Rockt\u00e4schel, T., Lewis, P., Bakhtin, A., Wu, Y., Miller, A.H., and Riedel, S. (2019). Language models as knowledge bases?. arXiv.","DOI":"10.18653\/v1\/D19-1250"},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"26839","DOI":"10.1109\/ACCESS.2024.3365742","article-title":"A review on large Language Models: Architectures, applications, taxonomies, open issues and challenges","volume":"12","author":"Raiaan","year":"2024","journal-title":"IEEE Access"},{"key":"ref_14","unstructured":"Zheng, J., Hong, H., Wang, X., Su, J., Liang, Y., and Wu, S. (2024). Fine-tuning Large Language Models for Domain-specific Machine Translation. arXiv."},{"key":"ref_15","unstructured":"Jeong, C. (2024). Fine-tuning and utilization methods of domain-specific llms. arXiv."},{"key":"ref_16","unstructured":"Guu, K., Lee, K., Tung, Z., Pasupat, P., and Chang, M. (, January 13\u201318). Retrieval augmented language model pre-training. Proceedings of the International Conference on Machine Learning, Virtual Event."},{"key":"ref_17","unstructured":"Wang, L.L., Lo, K., Chandrasekhar, Y., Reas, R., Yang, J., Burdick, D., Eide, D., Funk, K., Katsis, Y., and Kinney, R. (2020). CORD-19: The COVID-19 open research dataset. arXiv."},{"key":"ref_18","unstructured":"Ben Abacha, A., and Demner-Fushman, D. (2019). MedQuAD: A Medical Question Answering Dataset Containing Diverse Healthcare Topics. Studies in Health Technology and Informatics, Available online: https:\/\/bmcbioinformatics.biomedcentral.com\/articles\/10.1186\/s12859-019-3119-4."},{"key":"ref_19","first-page":"511:1","article-title":"A Question-Entailment Approach to Question Answering","volume":"20","year":"2019","journal-title":"BMC Bioinform."},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"Wolf, T., Debut, L., Sanh, V., Chaumond, J., Delangue, C., Moi, A., Cistac, P., Rault, T., Louf, R., and Funtowicz, M. (2020). HuggingFace\u2019s Transformers: State-of-the-art Natural Language Processing. arXiv.","DOI":"10.18653\/v1\/2020.emnlp-demos.6"},{"key":"ref_21","unstructured":"Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., and Antiga, L. (2019). PyTorch: An Imperative Style, High-Performance Deep Learning Library. Advances in Neural Information Processing Systems 32, Curran Associates, Inc."},{"key":"ref_22","unstructured":"Wang, B., and Komatsuzaki, A. (2024, March 12). GPT-J-6B: A 6 Billion Parameter Autoregressive Language Model. Available online: https:\/\/github.com\/kingoflolz\/mesh-transformer-jax."},{"key":"ref_23","unstructured":"Zhang, S., Roller, S., Goyal, N., Artetxe, M., Chen, M., Chen, S., Dewan, C., Diab, M., Li, X., and Lin, X.V. (2022). Opt: Open pre-trained transformer language models. arXiv."},{"key":"ref_24","unstructured":"Isabelle, P., Charniak, E., and Lin, D. (2002, January 6\u201312). Bleu: A Method for Automatic Evaluation of Machine Translation. Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics, Philadelphia, PA, USA."},{"key":"ref_25","unstructured":"Lin, C.Y. (2004, January 25\u201326). ROUGE: A Package for Automatic Evaluation of Summaries. Proceedings of the Text Summarization Branches Out, Barcelona, Spain."},{"key":"ref_26","unstructured":"Banerjee, S., and Lavie, A. (2005, January 29). METEOR: An Automatic Metric for MT Evaluation with Improved Correlation with Human Judgments. Proceedings of the ACL Workshop on Intrinsic and Extrinsic Evaluation Measures for Machine Translation and\/or Summarization, Ann Arbor, MI, USA."},{"key":"ref_27","doi-asserted-by":"crossref","unstructured":"Wang, W., Wei, F., Dong, L., Bao, H., Yang, N., and Zhou, M. (2020). MiniLM: Deep Self-Attention Distillation for Task-Agnostic Compression of Pre-Trained Transformers. arXiv.","DOI":"10.18653\/v1\/2021.findings-acl.188"},{"key":"ref_28","unstructured":"Scott, D., Bel, N., and Zong, C. (2020, January 8\u201313). The Devil is in the Details: Evaluating Limitations of Transformer-based Methods for Granular Tasks. Proceedings of the 28th International Conference on Computational Linguistics, Barcelona, Spain."},{"key":"ref_29","unstructured":"Taori, R., Gulrajani, I., Zhang, T., Dubois, Y., Li, X., Guestrin, C., Liang, P., and Hashimoto, T.B. (2024, March 12). Stanford Alpaca: An Instruction-Following Llama Model. Available online: https:\/\/github.com\/tatsu-lab\/stanford_alpaca."},{"key":"ref_30","unstructured":"Voidful (2024, January 01). Context Only Question Generator. Available online: https:\/\/huggingface.co\/voidful\/context-only-question-generator."},{"key":"ref_31","doi-asserted-by":"crossref","unstructured":"Reimers, N., and Gurevych, I. (2019). Sentence-bert: Sentence embeddings using siamese bert-networks. arXiv.","DOI":"10.18653\/v1\/D19-1410"},{"key":"ref_32","first-page":"1609","article-title":"A unifying probabilistic perspective for spectral dimensionality reduction: Insights and new models","volume":"13","author":"Lawrence","year":"2012","journal-title":"J. Mach. Learn. Res."},{"key":"ref_33","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/2733381","article-title":"Hierarchical density estimates for data clustering, visualization, and outlier detection","volume":"10","author":"Campello","year":"2015","journal-title":"ACM Trans. Knowl. Discov. Data (TKDD)"},{"key":"ref_34","unstructured":"Lakatos, R. (2024, January 01). Retrieval-Augmented Generation Versus Domain-Specific Fine-Tuning. Available online: https:\/\/github.com\/robertlakatos\/ragvsfn\/tree\/main."}],"container-title":["Machine Learning and Knowledge Extraction"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2504-4990\/7\/1\/15\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,9]],"date-time":"2025-10-09T16:30:25Z","timestamp":1760027425000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2504-4990\/7\/1\/15"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,2,10]]},"references-count":34,"journal-issue":{"issue":"1","published-online":{"date-parts":[[2025,3]]}},"alternative-id":["make7010015"],"URL":"https:\/\/doi.org\/10.3390\/make7010015","relation":{},"ISSN":["2504-4990"],"issn-type":[{"value":"2504-4990","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,2,10]]}}}