{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,18]],"date-time":"2026-04-18T08:13:27Z","timestamp":1776500007727,"version":"3.51.2"},"reference-count":134,"publisher":"MDPI AG","issue":"10","license":[{"start":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T00:00:00Z","timestamp":1760140800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["BDCC"],"abstract":"<jats:p>Electronic health records (EHRs) are typically stored in relational databases, making them difficult to query for nontechnical users, especially under privacy constraints. We evaluate two practical clinical NLP workflows, natural language to SQL (NL2SQL) for EHR querying and retrieval-augmented generation for clinical question answering (RAG-QA), with a focus on privacy-preserving deployment. We benchmark nine large language models, spanning open-weight options (DeepSeek V3\/V3.1, Llama-3.3-70B, Qwen2.5-32B, Mixtral-8x22B, BioMistral-7B, and GPT-OSS-20B) and proprietary APIs (GPT-4o and GPT-5). The models were chosen to represent a diverse cross-section spanning sparse MoE, dense general-purpose, domain-adapted, and proprietary LLMs. On MIMICSQL (27,000 generations; nine models \u00d7 three runs), the best NL2SQL execution accuracy (EX) is 66.1% (GPT-4o), followed by 64.6% (GPT-5). Among open-weight models, DeepSeek V3.1 reaches 59.8% EX, while DeepSeek V3 reaches 58.8%, with Llama-3.3-70B at 54.5% and BioMistral-7B achieving only 11.8%, underscoring a persistent gap relative to general-domain benchmarks. We introduce SQL-EC, a deterministic SQL error-classification framework with adjudication, revealing string mismatches as the dominant failure (86.3%), followed by query-join misinterpretations (49.7%), while incorrect aggregation-function usage accounts for only 6.7%. 
This highlights lexical\/ontology grounding as the key bottleneck for NL2SQL in the biomedical domain. For RAG-QA, evaluated on 100 synthetic patient records across 20 questions (54,000 reference\u2013generation pairs; three runs), BLEU and ROUGE-L fluctuate more strongly across models, whereas BERTScore remains high for most models, with DeepSeek V3.1 and GPT-4o among the top performers; pairwise t-tests confirm significant differences among the LLMs. Cost\u2013performance analysis based on measured token usage shows per-query costs ranging from USD 0.000285 (GPT-OSS-20B) to USD 0.005918 (GPT-4o); DeepSeek V3.1 offers the best open-weight cost\u2013accuracy trade-off, and GPT-5 provides a balanced API alternative. Overall, privacy-conscious RAG-QA attains strong semantic fidelity, whereas clinical NL2SQL remains brittle under lexical variation. SQL-EC pinpoints actionable failure modes, motivating ontology-aware normalization and schema-linked prompting for robust clinical querying.<\/jats:p>","DOI":"10.3390\/bdcc9100256","type":"journal-article","created":{"date-parts":[[2025,10,14]],"date-time":"2025-10-14T05:48:21Z","timestamp":1760420901000},"page":"256","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":1,"title":["Robust Clinical Querying with Local LLMs: Lexical Challenges in NL2SQL and Retrieval-Augmented QA on EHRs"],"prefix":"10.3390","volume":"9","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-8942-8585","authenticated-orcid":false,"given":"Luka","family":"Bla\u0161kovi\u0107","sequence":"first","affiliation":[{"name":"Faculty of Informatics, Juraj Dobrila University of Pula, 52100 Pula, Croatia"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-3936-1244","authenticated-orcid":false,"given":"Nikola","family":"Tankovi\u0107","sequence":"additional","affiliation":[{"name":"Faculty of Informatics, Juraj Dobrila University of Pula, 52100 Pula, 
Croatia"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-5964-245X","authenticated-orcid":false,"given":"Ivan","family":"Lorencin","sequence":"additional","affiliation":[{"name":"Faculty of Informatics, Juraj Dobrila University of Pula, 52100 Pula, Croatia"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-3015-1024","authenticated-orcid":false,"given":"Sandi","family":"Baressi \u0160egota","sequence":"additional","affiliation":[{"name":"Department of Automation and Electronics, Faculty of Engineering, University of Rijeka, 51000 Rijeka, Croatia"}]}],"member":"1968","published-online":{"date-parts":[[2025,10,11]]},"reference":[{"key":"ref_1","unstructured":"Garets, D., and Davis, M. (2006). Electronic Medical Records vs. Electronic Health Records: Yes, There is a Difference, HIMSS Analytics. Policy White Paper."},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"201","DOI":"10.5009\/gnl230272","article-title":"Challenges in and Opportunities for Electronic Health Record-Based Data Analysis and Interpretation","volume":"18","author":"Kim","year":"2024","journal-title":"Gut Liver"},{"key":"ref_3","doi-asserted-by":"crossref","unstructured":"Shabestari, O., and Roudsari, A. (2013). Challenges in data quality assurance for electronic health records. Studies in Health Technology and Informatics, IOS Press.","DOI":"10.3233\/978-1-61499-203-5-37"},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"1730","DOI":"10.1093\/jamia\/ocad120","article-title":"Electronic health record data quality assessment and tools: A systematic review","volume":"30","author":"Lewis","year":"2023","journal-title":"J. Am. Med. Inform. Assoc."},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"11786329211070722","DOI":"10.1177\/11786329211070722","article-title":"A qualitative analysis of the impact of electronic health records (EHR) on healthcare quality and safety: Clinicians\u2019 lived experiences","volume":"15","author":"Upadhyay","year":"2022","journal-title":"Health Serv. 
Insights"},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"60","DOI":"10.1093\/eurpub\/ckv122","article-title":"The impact of electronic health records on healthcare quality: A systematic review and meta-analysis","volume":"26","author":"Campanella","year":"2016","journal-title":"Eur. J. Public Health"},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"194","DOI":"10.1080\/17538157.2021.1879810","article-title":"Impact of patient access to their electronic health record: Systematic review","volume":"46","author":"Tapuria","year":"2021","journal-title":"Inform. Health Soc. Care"},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"1234","DOI":"10.1093\/bioinformatics\/btz682","article-title":"BioBERT: A pre-trained biomedical language representation model for biomedical text mining","volume":"36","author":"Lee","year":"2020","journal-title":"Bioinformatics"},{"key":"ref_9","unstructured":"Huang, K., Altosaar, J., and Ranganath, R. (2019). ClinicalBERT: Modeling Clinical Notes and Predicting Hospital Readmission. arXiv."},{"key":"ref_10","unstructured":"Bolton, E., Hall, D., Yasunaga, M., Lee, T., Manning, C., and Liang, P. (2025, March 31). Stanford CRFM Introduces PubMedGPT 2.7B. Available online: https:\/\/hai.stanford.edu\/news\/stanford-crfm-introduces-pubmedgpt-27b."},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"172","DOI":"10.1038\/s41586-023-06291-2","article-title":"Large language models encode clinical knowledge","volume":"620","author":"Singhal","year":"2023","journal-title":"Nature"},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"943","DOI":"10.1038\/s41591-024-03423-7","article-title":"Toward expert-level medical question answering with large language models","volume":"31","author":"Singhal","year":"2025","journal-title":"Nat. 
Med."},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"193","DOI":"10.1038\/s41746-024-01181-x","article-title":"Large language models outperform mental and medical health care professionals in identifying obsessive-compulsive disorder","volume":"7","author":"Kim","year":"2024","journal-title":"npj Digit. Med."},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Hou, Y., Bert, C., Gomaa, A., Lahmer, G., H\u00f6fler, D., Weissmann, T., Voigt, R., Schubert, P., Schmitter, C., and Depardon, A. (2025). Fine-tuning a local LLaMA-3 large language model for automated privacy-preserving physician letter generation in radiation oncology. Front. Artif. Intell., 7.","DOI":"10.3389\/frai.2024.1493716"},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"1621","DOI":"10.1007\/s00256-024-04599-2","article-title":"Translating musculoskeletal radiology reports into patient-friendly summaries using ChatGPT-4","volume":"53","author":"Kuckelman","year":"2024","journal-title":"Skelet. Radiol."},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"105800","DOI":"10.1016\/j.ijmedinf.2025.105800","article-title":"Large language models vs human for classifying clinical documents","volume":"195","author":"Mustafa","year":"2025","journal-title":"Int. J. Med. Inform."},{"key":"ref_17","unstructured":"Ong, J.C.L., Jin, L., Elangovan, K., Lim, G.Y.S., Lim, D.Y.Z., Sng, G.G.R., Ke, Y., Tung, J.Y.M., Zhong, R.J., and Koh, C.M.Y. (2024). Development and testing of a novel large language model-based clinical decision support systems for medication safety in 12 clinical specialties. arXiv."},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"605","DOI":"10.1093\/jamia\/ocaf008","article-title":"Improving large language model applications in biomedicine with retrieval-augmented generation: A systematic review, meta-analysis, and clinical development guidelines","volume":"32","author":"Liu","year":"2025","journal-title":"J. Am. Med. Inform. 
Assoc."},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"143","DOI":"10.1038\/s41746-025-01476-7","article-title":"Implementing large language models in healthcare while balancing control, collaboration, costs and security","volume":"8","author":"Hastings","year":"2025","journal-title":"npj Digit. Med."},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"395","DOI":"10.1038\/s41746-025-01802-z","article-title":"Retrieval-augmented generation elevates local LLM quality in radiology contrast media consultation","volume":"8","author":"Wada","year":"2025","journal-title":"npj Digit. Med."},{"key":"ref_21","first-page":"365","article-title":"Early Alzheimer\u2019s Detection Through Voice Analysis: Harnessing Locally Deployable LLMs via ADetectoLocum, a privacy-preserving diagnostic system","volume":"2025","author":"Mortensen","year":"2025","journal-title":"AMIA Summits Transl. Sci. Proc."},{"key":"ref_22","doi-asserted-by":"crossref","unstructured":"Lorencin, I., Tankovic, N., and Etinger, D. (2025). Optimizing Healthcare Efficiency with Local Large Language Models. Intelligent Human Systems Integration (IHSI 2025): Integrating People and Intelligent Systems, AHFE International Open Access.","DOI":"10.54941\/ahfe1005863"},{"key":"ref_23","doi-asserted-by":"crossref","unstructured":"Wiest, I.C., Le\u00dfmann, M.E., Wolf, F., Ferber, D., Van Treeck, M., Zhu, J., Ebert, M.P., Westphalen, C.B., Wermke, M., and Kather, J.N. (2024). Anonymizing medical documents with local, privacy preserving large language models: The LLM-Anonymizer. medRxiv.","DOI":"10.1101\/2024.06.11.24308355"},{"key":"ref_24","doi-asserted-by":"crossref","unstructured":"Hong, Z., Yuan, Z., Zhang, Q., Chen, H., Dong, J., Huang, F., and Huang, X. (2025). Next-Generation Database Interfaces: A Survey of LLM-based Text-to-SQL. arXiv.","DOI":"10.1109\/TKDE.2025.3609486"},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Male\u0161evi\u0107, A., and \u010cartolovni, A. (2025). 
Healthcare digitalization: Insights from Croatia\u2019s experience and lessons from the COVID-19 pandemic. Digital Healthcare, Digital Transformation and Citizen Empowerment in Asia-Pacific and Europe for a Healthier Society, Academic Press (Elsevier).","DOI":"10.1016\/B978-0-443-30168-1.00003-7"},{"key":"ref_26","doi-asserted-by":"crossref","first-page":"223","DOI":"10.1136\/bmj.331.7510.223","article-title":"Croatian healthcare system in transition, from the perspective of users","volume":"331","author":"Mastilica","year":"2005","journal-title":"BMJ"},{"key":"ref_27","unstructured":"Bj\u00f6rnberg, A., and Phang, A.Y. (2025, March 30). Euro Health Consumer Index 2018 Report. Available online: https:\/\/santesecu.public.lu\/dam-assets\/fr\/publications\/e\/euro-health-consumer-index-2018\/euro-health-consumer-index-2018.pdf."},{"key":"ref_28","unstructured":"OECD (2023). Croatia Country Health Profile 2023, OECD Publishing."},{"key":"ref_29","unstructured":"D\u017eakula, A., Vo\u010danec, D., Banadinovi\u0107, M., Vajagi\u0107, M., Lon\u010darek, K., Lovren\u010di\u0107, I.L., Radin, D., and Rechel, B. (2024). Croatia: Health System Summary, 2024, European Observatory on Health Systems and Policies, WHO Regional Office for Europe."},{"key":"ref_30","unstructured":"Zhong, V., Xiong, C., and Socher, R. (2017). Seq2sql: Generating structured queries from natural language using reinforcement learning. arXiv."},{"key":"ref_31","unstructured":"Xu, X., Liu, C., and Song, D. (2017). SQLNet: Generating Structured Queries From Natural Language Without Reinforcement Learning. arXiv."},{"key":"ref_32","first-page":"1478","article-title":"A semantic parsing method for mapping clinical questions to logical forms","volume":"Volume 2017","author":"Roberts","year":"2018","journal-title":"Proceedings of the AMIA 2018 Annual Symposium Proceedings"},{"key":"ref_33","unstructured":"Zhu, X., Li, Q., Cui, L., and Liu, Y. (2024). 
Large language model enhanced text-to-sql generation: A survey. arXiv."},{"key":"ref_34","doi-asserted-by":"crossref","unstructured":"Wang, P., Shi, T., and Reddy, C.K. (2020, January 20\u201324). Text-to-SQL Generation for Question Answering on Electronic Medical Records. Proceedings of the WWW \u201920: The Web Conference 2020, Taipei, Taiwan.","DOI":"10.1145\/3366423.3380120"},{"key":"ref_35","doi-asserted-by":"crossref","first-page":"e32698","DOI":"10.2196\/32698","article-title":"A BERT-based generation model to transform medical texts to SQL queries for electronic medical records: Model development and validation","volume":"9","author":"Pan","year":"2021","journal-title":"JMIR Med. Inform."},{"key":"ref_36","unstructured":"Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., and Askell, A. (2020). Language Models are Few-Shot Learners. arXiv."},{"key":"ref_37","unstructured":"Chen, M., Tworek, J., Jun, H., Yuan, Q., de Oliveira Pinto, H.P., Kaplan, J., Edwards, H., Burda, Y., Joseph, N., and Brockman, G. (2021). Evaluating Large Language Models Trained on Code. arXiv."},{"key":"ref_38","doi-asserted-by":"crossref","unstructured":"Yu, T., Zhang, R., Yang, K., Yasunaga, M., Wang, D., Li, Z., Ma, J., Li, I., Yao, Q., and Roman, S. (2019). Spider: A Large-Scale Human-Labeled Dataset for Complex and Cross-Domain Semantic Parsing and Text-to-SQL Task. arXiv.","DOI":"10.18653\/v1\/D18-1425"},{"key":"ref_39","first-page":"42330","article-title":"Can llm already serve as a database interface? a big bench for large-scale database grounded text-to-sqls","volume":"36","author":"Li","year":"2023","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"ref_40","unstructured":"Liu, X., Shen, S., Li, B., Ma, P., Jiang, R., Zhang, Y., Fan, J., Li, G., Tang, N., and Luo, Y. (2024). A Survey of NL2SQL with Large Language Models: Where are we, and where are we going?. 
arXiv."},{"key":"ref_41","doi-asserted-by":"crossref","unstructured":"Guo, J., Zhan, Z., Gao, Y., Xiao, Y., Lou, J.G., Liu, T., and Zhang, D. (2019). Towards Complex Text-to-SQL in Cross-Domain Database with Intermediate Representation. arXiv.","DOI":"10.18653\/v1\/P19-1444"},{"key":"ref_42","unstructured":"Hwang, W., Yim, J., Park, S., and Seo, M. (2019). A comprehensive exploration on wikisql with table-aware word contextualization. arXiv."},{"key":"ref_43","unstructured":"OpenAI (2025, September 06). GPT-3.5. Available online: https:\/\/platform.openai.com\/docs\/models\/gpt-3.5-turbo."},{"key":"ref_44","unstructured":"OpenAI (2023). GPT-4 Technical Report. arXiv."},{"key":"ref_45","unstructured":"Anthropic (2025, September 06). Claude 2. Available online: https:\/\/www.anthropic.com\/index\/claude-2."},{"key":"ref_46","unstructured":"Google DeepMind (2025, September 06). Gemini 1.5: Unlocking Multimodal Long Context Understanding. Available online: https:\/\/blog.google\/technology\/ai\/google-gemini-next-generation-model-february-2024\/."},{"key":"ref_47","unstructured":"OpenAI (2025, September 06). Introducing GPT-5 for Developers. Available online: https:\/\/openai.com\/index\/introducing-gpt-5-for-developers\/."},{"key":"ref_48","unstructured":"Attrach, R.A., Moreira, P., Fani, R., Umeton, R., and Celi, L.A. (2025). Conversational LLMs Simplify Secure Clinical Data Access, Understanding, and Analysis. arXiv."},{"key":"ref_49","unstructured":"Anthropic (2025, September 10). Introducing the Model Context Protocol. Available online: https:\/\/www.anthropic.com\/news\/model-context-protocol."},{"key":"ref_50","doi-asserted-by":"crossref","unstructured":"Chadha, I.K., Gupta, A., Sarkar, S., Tomer, M., and Rathee, T. (2025, January 8\u20139). Performance Evaluation of Open-Source LLMs for Text-to-SQL Conversion in Healthcare Data. 
Proceedings of the 2025 International Conference on Pervasive Computational Technologies (ICPCT), Greater Noida, India.","DOI":"10.1109\/ICPCT64145.2025.10941238"},{"key":"ref_51","unstructured":"OpenAI (2025). gpt-oss-120b & gpt-oss-20b Model Card. arXiv."},{"key":"ref_52","unstructured":"Liu, A., Feng, B., Xue, B., Wang, B., Wu, B., Lu, C., Zhao, C., Deng, C., Zhang, C., and Ruan, C. (2025). DeepSeek-V3 Technical Report. arXiv."},{"key":"ref_53","first-page":"669","article-title":"Towards understanding the generalization of medical text-to-sql models and datasets","volume":"Volume 2023","author":"Tarbell","year":"2024","journal-title":"Proceedings of the AMIA Annual Symposium Proceedings"},{"key":"ref_54","unstructured":"Rahman, S., Jiang, L.Y., Gabriel, S., Aphinyanaphongs, Y., Oermann, E.K., and Chunara, R. (2024). Generalization in healthcare ai: Evaluation of a clinical large language model. arXiv."},{"key":"ref_55","unstructured":"Chen, C., Yu, J., Chen, S., Liu, C., Wan, Z., Bitterman, D., Wang, F., and Shu, K. (2024). ClinicalBench: Can LLMs Beat Traditional ML Models in Clinical Prediction?. arXiv."},{"key":"ref_56","doi-asserted-by":"crossref","first-page":"239","DOI":"10.1038\/s41746-025-01651-w","article-title":"Leveraging long context in retrieval augmented language models for medical question answering","volume":"8","author":"Zhang","year":"2025","journal-title":"npj Digit. Med."},{"key":"ref_57","first-page":"9459","article-title":"Retrieval-augmented generation for knowledge-intensive nlp tasks","volume":"33","author":"Lewis","year":"2020","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"ref_58","doi-asserted-by":"crossref","unstructured":"Amugongo, L.M., Mascheroni, P., Brooks, S., Doering, S., and Seidel, J. (2025). Retrieval augmented generation for large language models in healthcare: A systematic review. PLoS Digit. 
Health, 4.","DOI":"10.1371\/journal.pdig.0000877"},{"key":"ref_59","doi-asserted-by":"crossref","unstructured":"Alkhalaf, M., Yu, P., Yin, M., and Deng, C. (2024). Applying generative AI with retrieval augmented generation to summarize and extract key clinical information from electronic health records. J. Biomed. Inform., 156.","DOI":"10.1016\/j.jbi.2024.104662"},{"key":"ref_60","unstructured":"Myers, S., Dligach, D., Miller, T.A., Barr, S., Gao, Y., Churpek, M., Mayampurath, A., and Afshar, M. (2025). Evaluating Retrieval-Augmented Generation vs. Long-Context Input for Clinical Reasoning over EHRs. arXiv."},{"key":"ref_61","doi-asserted-by":"crossref","first-page":"1042","DOI":"10.1002\/ohn.864","article-title":"ChatENT: Augmented large language model for expert knowledge retrieval in otolaryngology\u2013head and neck surgery","volume":"171","author":"Long","year":"2024","journal-title":"Otolaryngol.\u2013Head Neck Surg."},{"key":"ref_62","doi-asserted-by":"crossref","first-page":"1158","DOI":"10.1097\/HEP.0000000000000834","article-title":"Development of a liver disease\u2013specific large language model chat interface using retrieval-augmented generation","volume":"80","author":"Ge","year":"2024","journal-title":"Hepatology"},{"key":"ref_63","doi-asserted-by":"crossref","unstructured":"Rouhollahi, A., Homaei, A., Sahu, A., Harari, R.E., and Nezami, F.R. (2025). RAGnosis: Retrieval-Augmented Generation for Enhanced Medical Decision Making. medRxiv.","DOI":"10.1101\/2025.06.11.25329438"},{"key":"ref_64","doi-asserted-by":"crossref","unstructured":"Zhao, X., Liu, S., Yang, S.Y., and Miao, C. (May, January 28). Medrag: Enhancing retrieval-augmented generation with knowledge graph-elicited reasoning for healthcare copilot. Proceedings of the ACM on Web Conference 2025, Sydney, Australia.","DOI":"10.1145\/3696410.3714782"},{"key":"ref_65","doi-asserted-by":"crossref","unstructured":"Soni, S., Gayen, S., and Demner-Fushman, D. (2025, January 1). 
Overview of the archehr-qa 2025 shared task on grounded question answering from electronic health records. Proceedings of the 24th Workshop on Biomedical Language Processing, Vienna, Austria.","DOI":"10.18653\/v1\/2025.bionlp-1.34"},{"key":"ref_66","doi-asserted-by":"crossref","unstructured":"Yu, H., Gan, A., Zhang, K., Tong, S., Liu, Q., and Liu, Z. (2025, January 28\u201330). Evaluation of retrieval-augmented generation: A survey. Proceedings of the CCF Conference on Big Data, Shenyang, China.","DOI":"10.1007\/978-981-96-1024-2_8"},{"key":"ref_67","doi-asserted-by":"crossref","first-page":"20552076251337177","DOI":"10.1177\/20552076251337177","article-title":"Enhancing medical AI with retrieval-augmented generation: A mini narrative review","volume":"11","author":"Gargari","year":"2025","journal-title":"Digit. Health"},{"key":"ref_68","doi-asserted-by":"crossref","unstructured":"Xiong, G., Jin, Q., Lu, Z., and Zhang, A. (2024, January 11\u201316). Benchmarking retrieval-augmented generation for medicine. Proceedings of the Findings of the Association for Computational Linguistics, Bangkok, Thailand.","DOI":"10.18653\/v1\/2024.findings-acl.372"},{"key":"ref_69","doi-asserted-by":"crossref","unstructured":"Wu, J., Zhu, J., and Qi, Y. (2024). Medical Graph RAG: Towards Safe Medical Large Language Model via Graph Retrieval-Augmented Generation. arXiv.","DOI":"10.18653\/v1\/2025.acl-long.1381"},{"key":"ref_70","doi-asserted-by":"crossref","unstructured":"Jadhav, S., Shanbhag, A.G., Joshi, S., Date, A., and Sonawane, S. (2024, January 24). Maven at MEDIQA-CORR 2024: Leveraging RAG and Medical LLM for Error Detection and Correction in Medical Notes. 
Proceedings of the Clinical Natural Language Processing Workshop, Mexico City, Mexico.","DOI":"10.18653\/v1\/2024.clinicalnlp-1.36"},{"key":"ref_71","first-page":"e44735","article-title":"Interpretation and misinterpretation of medical abbreviations found in patient medical records: A cross-sectional survey","volume":"15","author":"Jayatilake","year":"2023","journal-title":"Cureus"},{"key":"ref_72","unstructured":"(2025, January 16). SNOMED CT: Myocardial infarction (Concept ID 22298006). Available online: https:\/\/bioportal.bioontology.org\/ontologies\/SNOMEDCT?p=classes&conceptid=22298006."},{"key":"ref_73","doi-asserted-by":"crossref","first-page":"332","DOI":"10.1177\/1060028017740140","article-title":"Audit on the use of dangerous abbreviations, symbols, and dose designations in paper compared to electronic medication orders: A multicenter study","volume":"52","author":"Cheung","year":"2018","journal-title":"Ann. Pharmacother."},{"key":"ref_74","unstructured":"(2025, September 06). National Institutes of Health (NIH) Exhibit in NIH Grants Policy Statement (NIH GPS), Available online: https:\/\/grants.nih.gov\/grants\/policy\/nihgps\/html5\/section_1\/1.1_abbreviations.htm."},{"key":"ref_75","unstructured":"(2025, September 06). Do Not Use: Dangerous Abbreviations, Symbols, and Dose Designations\u20142025 Update. Available online: https:\/\/ismpcanada.ca\/resource\/do-not-use-list\/."},{"key":"ref_76","doi-asserted-by":"crossref","first-page":"581","DOI":"10.4414\/smw.2002.10027","article-title":"Latin as the language of medical terminology: Some remarks on its role and prospects","volume":"132","year":"2002","journal-title":"Swiss Med. Wkly."},{"key":"ref_77","doi-asserted-by":"crossref","first-page":"45","DOI":"10.1186\/s13256-018-1562-x","article-title":"The use of Latin terminology in medical case reports: Quantitative, structural, and thematic analysis","volume":"12","author":"Lysanets","year":"2018","journal-title":"J. Med. 
Case Rep."},{"key":"ref_78","doi-asserted-by":"crossref","first-page":"D267","DOI":"10.1093\/nar\/gkh061","article-title":"The unified medical language system (UMLS): Integrating biomedical terminology","volume":"32","author":"Bodenreider","year":"2004","journal-title":"Nucleic Acids Res."},{"key":"ref_79","unstructured":"Liu, H., Lussier, Y.A., and Friedman, C. (2001, January 6\u201310). A study of abbreviations in the UMLS. Proceedings of the AMIA Symposium, Portland, OR, USA."},{"key":"ref_80","doi-asserted-by":"crossref","unstructured":"Xu, R., Jiang, P., Luo, L., Xiao, C., Cross, A., Pan, S., Sun, J., and Yang, C. (2025, January 3\u20137). A Survey on Unifying Large Language Models and Knowledge Graphs for Biomedicine and Healthcare. Proceedings of the 31st ACM SIGKDD Conference on Knowledge Discovery and Data Mining V.2, Toronto, ON, Canada.","DOI":"10.1145\/3711896.3736556"},{"key":"ref_81","unstructured":"Nazi, Z.A., Hristidis, V., McLean, A.L., Meem, J.A., and Chowdhury, M.T.A. (2025). Ontology-Guided Query Expansion for Biomedical Document Retrieval using Large Language Models. arXiv."},{"key":"ref_82","unstructured":"Fan, Y., Xue, K., Li, Z., Zhang, X., and Ruan, T. (2025, January 19\u201324). An LLM-based Framework for Biomedical Terminology Normalization in Social Media via Multi-Agent Collaboration. Proceedings of the 31st International Conference on Computational Linguistics, Abu Dhabi, United Arab Emirates."},{"key":"ref_83","doi-asserted-by":"crossref","first-page":"baae067","DOI":"10.1093\/database\/baae067","article-title":"Improving biomedical entity linking for complex entity mentions with LLM-based text simplification","volume":"2024","author":"Borchert","year":"2024","journal-title":"Database"},{"key":"ref_84","doi-asserted-by":"crossref","unstructured":"Remy, F., Demuynck, K., and Demeester, T. (2023). BioLORD-2023: Semantic textual representations fusing llm and clinical knowledge graph insights. 
arXiv.","DOI":"10.1093\/jamia\/ocae029"},{"key":"ref_85","doi-asserted-by":"crossref","first-page":"435","DOI":"10.1007\/s10389-022-01795-z","article-title":"Privacy in electronic health records: A systematic mapping study","volume":"32","author":"Tertulino","year":"2024","journal-title":"J. Public Health"},{"key":"ref_86","doi-asserted-by":"crossref","unstructured":"Mirzaei, T., Amini, L., and Esmaeilzadeh, P. (2024). Clinician voices on ethics of LLM integration in healthcare: A thematic analysis of ethical concerns and implications. BMC Med. Inform. Decis. Mak., 24.","DOI":"10.1186\/s12911-024-02656-3"},{"key":"ref_87","doi-asserted-by":"crossref","unstructured":"Rathod, V., Nabavirazavi, S., Zad, S., and Iyengar, S.S. (2025, January 6\u20138). Privacy and security challenges in large language models. Proceedings of the 2025 IEEE 15th Annual Computing and Communication Workshop and Conference (CCWC), Las Vegas, NV, USA.","DOI":"10.1109\/CCWC62904.2025.10903912"},{"key":"ref_88","doi-asserted-by":"crossref","first-page":"100211","DOI":"10.1016\/j.hcc.2024.100211","article-title":"A survey on large language model (llm) security and privacy: The good, the bad, and the ugly","volume":"4","author":"Yao","year":"2024","journal-title":"High-Confid. Comput."},{"key":"ref_89","unstructured":"Gemma Team (2024). Gemma: Open Models Based on Gemini Research and Technology. arXiv."},{"key":"ref_90","unstructured":"Jiang, A.Q., Sablayrolles, A., Mensch, A., Bamford, C., Chaplot, D.S., de las Casas, D., Bressand, F., Lengyel, G., Lample, G., and Saulnier, L. (2023). Mistral 7B. arXiv."},{"key":"ref_91","unstructured":"Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.A., Lacroix, T., Rozi\u00e8re, B., Goyal, N., Hambro, E., and Azhar, F. (2023). LLaMA: Open and Efficient Foundation Language Models. arXiv."},{"key":"ref_92","doi-asserted-by":"crossref","unstructured":"Kukreja, S., Kumar, T., Purohit, A., Dasgupta, A., and Guha, D. (2024, January 12\u201314). A literature survey on open source large language models. 
Proceedings of the 2024 7th International Conference on Computers in Management and Business, Singapore.","DOI":"10.1145\/3647782.3647803"},{"key":"ref_93","unstructured":"Plaat, A., Wong, A., Verberne, S., Broekens, J., van Stein, N., and B\u00e4ck, T. (2024). Reasoning with large language models, a survey. CoRR."},{"key":"ref_94","unstructured":"Floratou, A., Psallidas, F., Zhao, F., Deep, S., Hagleither, G., Tan, W., Cahoon, J., Alotaibi, R., Henkel, J., and Singla, A. (2024, January 14\u201317). Nl2sql is a solved problem\u2026 not!. Proceedings of the Conference of Innovative Data Systems Research (CIDR), Chaminade, HI, USA."},{"key":"ref_95","doi-asserted-by":"crossref","unstructured":"Tai, C.Y., Chen, Z., Zhang, T., Deng, X., and Sun, H. (2023). Exploring chain-of-thought style prompting for text-to-sql. arXiv.","DOI":"10.18653\/v1\/2023.emnlp-main.327"},{"key":"ref_96","unstructured":"Ali Alkamel, S. (2025, April 01). DeepSeek and the Power of Mixture of Experts (MoE). Available online: https:\/\/dev.to\/sayed_ali_alkamel\/deepseek-and-the-power-of-mixture-of-experts-moe-ham."},{"key":"ref_97","unstructured":"(2025, September 10). DeepSeek-V3.1 Release. Available online: https:\/\/api-docs.deepseek.com\/news\/news250821."},{"key":"ref_98","unstructured":"Open Source Initiative (2025, September 10). MIT License. Available online: https:\/\/opensource.org\/licenses\/MIT."},{"key":"ref_99","unstructured":"Meta Platforms, Inc (2025, September 10). Meta LLaMA 3 Community License. Available online: https:\/\/www.llama.com\/llama3\/license\/."},{"key":"ref_100","unstructured":"Team Qwen (2025). Qwen2.5 Technical Report. arXiv."},{"key":"ref_101","unstructured":"Apache Software Foundation (2025, September 10). Apache License, Version 2.0. Available online: http:\/\/www.apache.org\/licenses\/LICENSE-2.0."},{"key":"ref_102","unstructured":"Mistral AI (2025, September 10). Cheaper, Better, Faster, Stronger: Mixtral 8x22B. Release Blog Post. 
Available online: https:\/\/mistral.ai\/news\/mixtral-8x22b."},{"key":"ref_103","doi-asserted-by":"crossref","unstructured":"Labrak, Y., Bazoge, A., Morin, E., Gourraud, P.A., Rouvier, M., and Dufour, R. (2024). Biomistral: A collection of open-source pretrained large language models for medical domains. arXiv.","DOI":"10.18653\/v1\/2024.findings-acl.348"},{"key":"ref_104","unstructured":"OpenAI (2025, April 22). GPT-4o. Available online: https:\/\/openai.com\/index\/hello-gpt-4o\/."},{"key":"ref_105","unstructured":"OpenAI (2025, August 07). Introducing GPT-5. Available online: https:\/\/openai.com\/index\/introducing-gpt-5\/."},{"key":"ref_106","doi-asserted-by":"crossref","unstructured":"Tankovi\u0107, N., \u0160ajina, R., and Lorencin, I. (2025). Transforming Medical Data Access: The Role and Challenges of Recent Language Models in SQL Query Automation. Algorithms, 18.","DOI":"10.3390\/a18030124"},{"key":"ref_107","unstructured":"Xiao, G., Lin, J., Seznec, M., Wu, H., Demouth, J., and Han, S. (2024). SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models. arXiv."},{"key":"ref_108","unstructured":"Frantar, E., Ashkboos, S., Hoefler, T., and Alistarh, D. (2023). GPTQ: Accurate Post-Training Quantization for Generative Pre-trained Transformers. arXiv."},{"key":"ref_109","unstructured":"Fireworks AI (2025, September 11). How Fireworks Evaluates Quantization Precisely and Interpretably. Blog Post. Available online: https:\/\/fireworks.ai\/blog\/fireworks-quantization."},{"key":"ref_110","unstructured":"Fireworks AI (2025, September 11). LLM Inference Performance Benchmarking (Part 1). Blog Post. Available online: https:\/\/fireworks.ai\/blog\/llm-inference-performance-benchmarking-part-1."},{"key":"ref_111","unstructured":"Sandrini, P. (2025). Beyond the Cloud: Assessing the Benefits and Drawbacks of Local LLM Deployment for Translators. 
arXiv."},{"key":"ref_112","doi-asserted-by":"crossref","first-page":"163","DOI":"10.1007\/s44443-025-00177-1","article-title":"A survey on privacy risks and protection in large language models","volume":"37","author":"Chen","year":"2025","journal-title":"J. King Saud Univ. Comput. Inf. Sci."},{"key":"ref_113","unstructured":"Hamburg Commissioner for Data Protection and Freedom of Information (2025, September 11). Discussion Paper: Large Language Models and Personal Data. Discussion paper, Hamburg Commissioner for Data Protection and Freedom of Information (HmbBfDI). Available online: https:\/\/datenschutz-hamburg.de\/fileadmin\/user_upload\/HmbBfDI\/Datenschutz\/Informationen\/240715_Discussion_Paper_Hamburg_DPA_KI_Models.pdf."},{"key":"ref_114","unstructured":"Holtzman, A., Buys, J., Du, L., Forbes, M., and Choi, Y. (2020). The Curious Case of Neural Text Degeneration. arXiv."},{"key":"ref_115","unstructured":"Hipp, R.D. (2025, September 11). SQLite. Available online: https:\/\/sqlite.org\/."},{"key":"ref_116","unstructured":"Qin, B., Hui, B., Wang, L., Yang, M., Li, J., Li, B., Geng, R., Cao, R., Sun, J., and Si, L. (2022). A Survey on Text-to-SQL Parsing: Concepts, Methods, and Future Directions. arXiv."},{"key":"ref_117","unstructured":"Noor, H. (2025, September 09). What Do You Mean? Using Large Language Models for Semantic Evaluation of NL2SQL Queries. Available online: https:\/\/uwspace.uwaterloo.ca\/bitstreams\/73520bc6-13dd-4586-a9c4-5b8ced8ddfc1\/download."},{"key":"ref_118","unstructured":"OpenAI (2025, September 12). API Pricing. Available online: https:\/\/openai.com\/api\/pricing\/."},{"key":"ref_119","unstructured":"Fireworks AI (2025, September 12). Pricing. Available online: https:\/\/fireworks.ai\/pricing."},{"key":"ref_120","unstructured":"Qdrant Team (2025, September 11). Qdrant\u2014Vector Database. 
Available online: https:\/\/qdrant.tech\/."},{"key":"ref_121","doi-asserted-by":"crossref","unstructured":"Papineni, K., Roukos, S., Ward, T., and Zhu, W.J. (2002, January 6\u201312). Bleu: A method for automatic evaluation of machine translation. Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics, Philadelphia, PA, USA.","DOI":"10.3115\/1073083.1073135"},{"key":"ref_122","unstructured":"Lin, C.Y. (2004, January 25\u201326). Rouge: A package for automatic evaluation of summaries. Proceedings of the Text Summarization Branches Out, Barcelona, Spain."},{"key":"ref_123","unstructured":"Zhang, T., Kishore, V., Wu, F., Weinberger, K.Q., and Artzi, Y. (2020, January 30). BERTScore: Evaluating Text Generation with BERT. Proceedings of the International Conference on Learning Representations (ICLR), Addis Ababa, Ethiopia."},{"key":"ref_124","unstructured":"OpenAI (2025, September 12). Text Embedding 3 Small. Available online: https:\/\/platform.openai.com\/docs\/models\/text-embedding-3-small."},{"key":"ref_125","unstructured":"Malkov, Y.A., and Yashunin, D.A. (2018). Efficient and robust approximate nearest neighbor search using Hierarchical Navigable Small World graphs. arXiv."},{"key":"ref_126","unstructured":"Bird, S., Klein, E., and Loper, E. (2009). Natural Language Processing with Python, O\u2019Reilly Media, Inc."},{"key":"ref_127","doi-asserted-by":"crossref","unstructured":"Liu, X., Shen, S., Li, B., Tang, N., and Luo, Y. (2025). NL2SQL-BUGs: A Benchmark for Detecting Semantic Errors in NL2SQL Translation. arXiv.","DOI":"10.1145\/3711896.3737427"},{"key":"ref_128","unstructured":"Shen, J., Wan, C., Qiao, R., Zou, J., Xu, H., Shao, Y., Zhang, Y., Miao, W., and Pu, G. (2025). A Study of In-Context-Learning-Based Text-to-SQL Errors. arXiv."},{"key":"ref_129","doi-asserted-by":"crossref","unstructured":"Ning, Z., Tian, Y., Zhang, Z., Zhang, T., and Li, T.J.J. (2024). 
Insights into natural language database query errors: From attention misalignment to user handling strategies. arXiv.","DOI":"10.1145\/3650114"},{"key":"ref_130","doi-asserted-by":"crossref","unstructured":"Ren, T., Fan, Y., He, Z., Huang, R., Dai, J., Huang, C., Jing, Y., Zhang, K., Yang, Y., and Wang, X.S. (2024, January 13\u201316). Purple: Making a large language model a better sql writer. Proceedings of the 2024 IEEE 40th International Conference on Data Engineering (ICDE), Utrecht, The Netherlands.","DOI":"10.1109\/ICDE60146.2024.00009"},{"key":"ref_131","doi-asserted-by":"crossref","first-page":"101865","DOI":"10.1016\/j.csl.2025.101865","article-title":"SCoT2S: Self-Correcting Text-to-SQL Parsing by Leveraging LLMs","volume":"95","author":"Zhu","year":"2026","journal-title":"Comput. Speech Lang."},{"key":"ref_132","unstructured":"Mitsopoulou, A.V. (2025, April 10). Towards More Robust Text-to-SQL Translation. Available online: https:\/\/pergamos.lib.uoa.gr\/uoa\/dl\/object\/3401060\/file.pdf."},{"key":"ref_133","doi-asserted-by":"crossref","unstructured":"Kroll, H., Kreutz, C.K., Sackhoff, P., and Balke, W.T. (2023, January 26\u201330). Enriching Simple Keyword Queries for Domain-Aware Narrative Retrieval. Proceedings of the 2023 ACM\/IEEE Joint Conference on Digital Libraries (JCDL), Santa Fe, NM, USA.","DOI":"10.1109\/JCDL57899.2023.00029"},{"key":"ref_134","unstructured":"Banerjee, S., and Lavie, A. (2005, January 1). METEOR: An automatic metric for MT evaluation with improved correlation with human judgments. 
Proceedings of the ACL Workshop on Intrinsic and Extrinsic Evaluation Measures for Machine Translation and\/or Summarization, Ann Arbor, MI, USA."}],"container-title":["Big Data and Cognitive Computing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2504-2289\/9\/10\/256\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,14]],"date-time":"2025-10-14T06:33:19Z","timestamp":1760423599000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2504-2289\/9\/10\/256"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,10,11]]},"references-count":134,"journal-issue":{"issue":"10","published-online":{"date-parts":[[2025,10]]}},"alternative-id":["bdcc9100256"],"URL":"https:\/\/doi.org\/10.3390\/bdcc9100256","relation":{},"ISSN":["2504-2289"],"issn-type":[{"value":"2504-2289","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,10,11]]}}}