{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,26]],"date-time":"2026-03-26T04:49:16Z","timestamp":1774500556687,"version":"3.50.1"},"reference-count":137,"publisher":"Springer Science and Business Media LLC","issue":"9","license":[{"start":{"date-parts":[[2024,8,6]],"date-time":"2024-08-06T00:00:00Z","timestamp":1722902400000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2024,8,6]],"date-time":"2024-08-06T00:00:00Z","timestamp":1722902400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/501100006690","name":"Politecnico di Milano","doi-asserted-by":"crossref","id":[{"id":"10.13039\/501100006690","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Artif Intell Rev"],"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Large language models (LLMs) have the intrinsic potential to acquire medical knowledge. Several studies assessing LLMs on medical examinations have been published. However, there is no reported evidence on tests related to robot-assisted surgery. The aims of this study were to perform the first systematic review of LLMs on medical examinations and to establish whether ChatGPT, GPT-4, and Bard can pass the Fundamentals of Robotic Surgery (FRS) didactic test. A literature search was performed on PubMed, Web of Science, Scopus, and arXiv following the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) approach. A total of 45 studies were analyzed. GPT-4 passed several national qualifying examinations with questions in English, Chinese, and Japanese using zero-shot and few-shot learning. 
Med-PaLM 2 obtained similar scores on the United States Medical Licensing Examination with more refined prompt engineering techniques. Five different 2023 releases of ChatGPT, one of GPT-4, and one of Bard were tested on FRS. Seven attempts were performed with each release. The pass score was 79.5%. ChatGPT achieved a mean score of 64.6%, 65.6%, 75.0%, 78.9%, and 72.7% respectively from the first to the fifth tested release on FRS vs 91.5% of GPT-4 and 79.5% of Bard. GPT-4 outperformed ChatGPT and Bard in all corresponding attempts with a statistically significant difference for ChatGPT (p\u2009&lt;\u20090.001), but not Bard (p\u2009=\u20090.002). Our findings agree with other studies included in this systematic review. We highlighted the potential and challenges of LLMs to transform the education of healthcare professionals in the different stages of learning, by assisting teachers in the preparation of teaching contents, and trainees in the acquisition of knowledge, up to becoming an assessment framework of learners.<\/jats:p>","DOI":"10.1007\/s10462-024-10849-5","type":"journal-article","created":{"date-parts":[[2024,8,6]],"date-time":"2024-08-06T05:02:19Z","timestamp":1722920539000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":20,"title":["Large language models in healthcare: from a systematic review on medical examinations to a comparative analysis on fundamentals of robotic surgery online test"],"prefix":"10.1007","volume":"57","author":[{"given":"Andrea","family":"Moglia","sequence":"first","affiliation":[]},{"given":"Konstantinos","family":"Georgiou","sequence":"additional","affiliation":[]},{"given":"Pietro","family":"Cerveri","sequence":"additional","affiliation":[]},{"given":"Luca","family":"Mainardi","sequence":"additional","affiliation":[]},{"given":"Richard 
M.","family":"Satava","sequence":"additional","affiliation":[]},{"given":"Alfred","family":"Cuschieri","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2024,8,6]]},"reference":[{"key":"10849_CR1","doi-asserted-by":"publisher","DOI":"10.2196\/48291","volume":"9","author":"A Abd-Alrazaq","year":"2023","unstructured":"Abd-Alrazaq A, AlSaad R, Alhuwail D, Ahmed A, Heal PM, Latifi S, Aziz S, Damseh R, Alabed Alrazak S, Sheikh J (2023) Large language models in medical education: opportunities, challenges, and future directions. JMIR Med Educ 9:e48291","journal-title":"JMIR Med Educ"},{"issue":"6","key":"10849_CR2","doi-asserted-by":"publisher","first-page":"2407","DOI":"10.1007\/s00266-023-03538-1","volume":"47","author":"J Abi-Rafeh","year":"2023","unstructured":"Abi-Rafeh J et al (2023) Complications following facelift and neck lift: implementation and assessment of large language model and artificial intelligence (ChatGPT) performance across 16 simulated patient presentations. Aesthetic Plast Surg 47(6):2407\u20132414. https:\/\/doi.org\/10.1007\/s00266-023-03538-1","journal-title":"Aesthetic Plast Surg"},{"issue":"6","key":"10849_CR3","doi-asserted-by":"publisher","first-page":"e40977","DOI":"10.7759\/cureus.40977","volume":"15","author":"M Agarwal","year":"2023","unstructured":"Agarwal M et al (2023) Analysing the applicability of ChatGPT, bard, and Bing to generate reasoning-based multiple-choice questions in medical physiology. Cureus 15(6):e40977\u2013e40977. https:\/\/doi.org\/10.7759\/cureus.40977","journal-title":"Cureus"},{"key":"10849_CR4","doi-asserted-by":"publisher","first-page":"2309","DOI":"10.2147\/JMDH.S419847","volume":"16","author":"TM Alanzi","year":"2023","unstructured":"Alanzi TM (2023) Impact of ChatGPT on teleconsultants in healthcare: perceptions of healthcare experts in Saudi Arabia. J Multidiscip Healthc 16:2309\u20132321. 
https:\/\/doi.org\/10.2147\/JMDH.S419847","journal-title":"J Multidiscip Healthc"},{"key":"10849_CR5","doi-asserted-by":"publisher","unstructured":"Alayrac, J, Donahue J, Luc P (2022) Flamingo: a visual language model for fewshot learning. arXiv:2204.14198. https:\/\/doi.org\/10.48550\/arXiv.2204.14198","DOI":"10.48550\/arXiv.2204.14198"},{"issue":"6","key":"10849_CR6","doi-asserted-by":"publisher","first-page":"1353","DOI":"10.1227\/neu.0000000000002632","volume":"93","author":"R Ali","year":"2023","unstructured":"Ali R et al (2023a) Performance of ChatGPT and GPT-4 on neurosurgery written board examinations. Neurosurgery 93(6):1353\u20131365. https:\/\/doi.org\/10.1227\/neu.0000000000002632","journal-title":"Neurosurgery"},{"issue":"5","key":"10849_CR7","doi-asserted-by":"publisher","first-page":"1090","DOI":"10.1227\/neu.0000000000002551","volume":"93","author":"R Ali","year":"2023","unstructured":"Ali R et al (2023b) Performance of ChatGPT, GPT-4, and google bard on a neurosurgery oral boards preparation question bank. Neurosurgery 93(5):1090\u20131098. https:\/\/doi.org\/10.1227\/neu.0000000000002551","journal-title":"Neurosurgery"},{"issue":"1","key":"10849_CR8","doi-asserted-by":"publisher","first-page":"206","DOI":"10.1111\/eje.12937","volume":"28","author":"K Ali","year":"2024","unstructured":"Ali K et al (2024) ChatGPT-A double-edged sword for healthcare education? Implications for assessments of dental students. Eur J Dent Educ 28(1):206\u2013211. https:\/\/doi.org\/10.1111\/eje.12937","journal-title":"Eur J Dent Educ"},{"issue":"4","key":"10849_CR9","doi-asserted-by":"publisher","DOI":"10.7759\/cureus.38249","volume":"15","author":"M Almazyad","year":"2023","unstructured":"Almazyad M et al (2023) Enhancing expert panel discussions in pediatric palliative care: innovative scenario development and summarization with ChatGPT-4. Cureus 15(4):e38249. 
https:\/\/doi.org\/10.7759\/cureus.38249","journal-title":"Cureus"},{"issue":"7","key":"10849_CR10","doi-asserted-by":"publisher","first-page":"351","DOI":"10.3390\/systems11070351","volume":"11","author":"A Alshami","year":"2023","unstructured":"Alshami A et al (2023) Harnessing the power of ChatGPT for automating systematic review process: methodology, case study, limitations, and future directions. Systems 11(7):351. https:\/\/doi.org\/10.3390\/systems11070351","journal-title":"Systems"},{"issue":"6","key":"10849_CR11","doi-asserted-by":"publisher","DOI":"10.7759\/cureus.40351","volume":"15","author":"I Altamimi","year":"2023","unstructured":"Altamimi I et al (2023) Snakebite advice and counseling from artificial intelligence: an acute venomous snakebite consultation with ChatGPT. Cureus 15(6):e40351. https:\/\/doi.org\/10.7759\/cureus.40351","journal-title":"Cureus"},{"key":"10849_CR12","doi-asserted-by":"publisher","DOI":"10.1213\/ANE.0000000000006892","author":"MC Angel","year":"2024","unstructured":"Angel MC, Rinehart JB, Canneson MP, Baldi P (2024) Clinical knowledge and reasoning abilities of AI large language models in anesthesiology: a comparative study on the ABA exam. Anesth Analg. https:\/\/doi.org\/10.1213\/ANE.0000000000006892","journal-title":"Anesth Analg"},{"issue":"6","key":"10849_CR13","doi-asserted-by":"publisher","first-page":"1623","DOI":"10.3390\/biomedicines11061623","volume":"11","author":"A Anghelescu","year":"2023","unstructured":"Anghelescu A et al (2023) PRISMA systematic literature review, including with meta-analysis vs. Chatbot\/GPT (AI) regarding current scientific data on the main effects of the calf blood deproteinized hemoderivative medicine (Actovegin) in ischemic stroke. Biomedicines 11(6):1623. 
https:\/\/doi.org\/10.3390\/biomedicines11061623","journal-title":"Biomedicines"},{"issue":"4","key":"10849_CR14","doi-asserted-by":"publisher","DOI":"10.1016\/j.xops.2023.100324","volume":"3","author":"F Antaki","year":"2023","unstructured":"Antaki F, Touma S, Milad D, El-Khoury J, Duval R (2023) Evaluating the performance of ChatGPT in ophthalmology: an analysis of its successes and shortcomings. Ophthalmol Sci 3(4):100324. https:\/\/doi.org\/10.1016\/j.xops.2023.100324","journal-title":"Ophthalmol Sci"},{"issue":"8","key":"10849_CR15","doi-asserted-by":"publisher","DOI":"10.7759\/cureus.43690","volume":"15","author":"M Ayoub","year":"2023","unstructured":"Ayoub M et al (2023) Mind + Machine: ChatGPT as a basic clinical decisions support tool. Cureus 15(8):e43690. https:\/\/doi.org\/10.7759\/cureus.43690","journal-title":"Cureus"},{"issue":"6","key":"10849_CR16","doi-asserted-by":"publisher","first-page":"1484","DOI":"10.1002\/ohn.465","volume":"170","author":"NF Ayoub","year":"2024","unstructured":"Ayoub NF et al (2024) Head-to-head comparison of ChatGPT versus google search for medical knowledge acquisition. Otolaryngol Head Neck Surg 170(6):1484\u20131491. https:\/\/doi.org\/10.1002\/ohn.465","journal-title":"Otolaryngol Head Neck Surg"},{"issue":"5","key":"10849_CR17","doi-asserted-by":"publisher","first-page":"809","DOI":"10.1111\/1742-6723.14233","volume":"35","author":"FE Babl","year":"2023","unstructured":"Babl FE, Babl MP (2023) Generative artificial intelligence: Can ChatGPT write a quality abstract? Emerg Med Australas 35(5):809\u2013811. https:\/\/doi.org\/10.1111\/1742-6723.14233","journal-title":"Emerg Med Australas"},{"key":"10849_CR18","doi-asserted-by":"publisher","unstructured":"Bai Y, Kadavath S, Kundu S, et al (2022) Constitutional AI: Harmlessness from AI Feedback. arXiv:2212.08073v1. 
https:\/\/doi.org\/10.48550\/arXiv.2212.08073","DOI":"10.48550\/arXiv.2212.08073"},{"issue":"4","key":"10849_CR19","doi-asserted-by":"publisher","first-page":"936","DOI":"10.1016\/j.surg.2023.12.014","volume":"175","author":"BR Beaulieu-Jones","year":"2024","unstructured":"Beaulieu-Jones BR, Shah S, Berrigan MT, Marwaha JS, Lai SL, Brat GA (2024) Evaluating capabilities of large language models: performance of GPT4 on surgical knowledge assessments. Surgery 175(4):936\u2013942. https:\/\/doi.org\/10.1016\/j.surg.2023.12.014","journal-title":"Surgery"},{"issue":"6","key":"10849_CR20","doi-asserted-by":"publisher","first-page":"1504","DOI":"10.1002\/ohn.506","volume":"170","author":"JR Bellinger","year":"2024","unstructured":"Bellinger JR et al (2024) BPPV information on google versus AI (ChatGPT). Otolaryngol Head Neck Surg 170(6):1504\u20131511. https:\/\/doi.org\/10.1002\/ohn.506","journal-title":"Otolaryngol Head Neck Surg"},{"issue":"5","key":"10849_CR21","doi-asserted-by":"publisher","DOI":"10.1148\/radiol.230582","volume":"307","author":"R Bhayana","year":"2023","unstructured":"Bhayana R, Krishna S, Bleakney RR (2023) Performance of ChatGPT on a radiology board-style examination: insights into current strengths and limitations. Radiology 307(5):e230582. https:\/\/doi.org\/10.1148\/radiol.230582","journal-title":"Radiology"},{"issue":"3","key":"10849_CR22","doi-asserted-by":"publisher","first-page":"415","DOI":"10.59249\/SKDH9286","volume":"96","author":"S Biswas","year":"2023","unstructured":"Biswas S et al (2023) ChatGPT and the future of journal reviews: a feasibility study. Yale J Biol Med 96(3):415\u2013420. 
https:\/\/doi.org\/10.59249\/SKDH9286","journal-title":"Yale J Biol Med"},{"issue":"1","key":"10849_CR23","doi-asserted-by":"publisher","first-page":"102","DOI":"10.1067\/j.cpradiol.2023.04.001","volume":"53","author":"WA Bosbach","year":"2023","unstructured":"Bosbach WA et al (2023) Ability of ChatGPT to generate competent radiology reports for distal radius fracture by use of RSNA template items and integrated AO classifier. Curr Problems Diagnostic Radiol 53(1):102\u2013110. https:\/\/doi.org\/10.1067\/j.cpradiol.2023.04.001","journal-title":"Curr Problems Diagnostic Radiol"},{"key":"10849_CR24","doi-asserted-by":"publisher","unstructured":"Brown T, Mann B, Ryder N, et al (2020) Language models are few-shot learners. arXiv:2005.14165. https:\/\/doi.org\/10.48550\/arXiv.2005.14165","DOI":"10.48550\/arXiv.2005.14165"},{"issue":"4","key":"10849_CR25","doi-asserted-by":"publisher","first-page":"2081","DOI":"10.1007\/s00405-023-08104-8","volume":"281","author":"CM Chiesa-Estomba","year":"2024","unstructured":"Chiesa-Estomba CM, Lechien JR, Vaira LA, Brunet A, Cammaroto G, Mayo-Yanez M, Sanchez-Barrueco A, Saga-Gutierrez C (2024) Exploring the potential of Chat-GPT as a supportive tool for sialendoscopy clinical decision making and patient information support. Eur Arch Otorhinolaryngol 281(4):2081\u20132086. https:\/\/doi.org\/10.1007\/s00405-023-08104-8","journal-title":"Eur Arch Otorhinolaryngol"},{"key":"10849_CR26","doi-asserted-by":"publisher","first-page":"102635","DOI":"10.1016\/j.artmed.2023.102635","volume":"144","author":"P Chung","year":"2023","unstructured":"Chung P et al (2023) Case scenario generators for trauma surgery simulation utilizing autoregressive language models. Artif Intell Med 144:102635. https:\/\/doi.org\/10.1016\/j.artmed.2023.102635","journal-title":"Artif Intell Med"},{"key":"10849_CR27","doi-asserted-by":"publisher","unstructured":"Cobbe K, Kosaraju V, Bavarian M, et al (2021) Training verifiers to solve math word problems. 
arXiv:2110.14168. https:\/\/doi.org\/10.48550\/arXiv.2110.14168","DOI":"10.48550\/arXiv.2110.14168"},{"issue":"1","key":"10849_CR28","doi-asserted-by":"publisher","first-page":"103","DOI":"10.1038\/s41391-023-00705-y","volume":"27","author":"A Cocci","year":"2023","unstructured":"Cocci A, Pezzoli M, Lo Re M, Russo GI, Asmundo MG, Fode M, Cacciamani G, Cimino S, Minervini A, Durukan E (2023) Quality of information and appropriateness of ChatGPT outputs for urology patients. Prostate Cancer Prostatic Dis 27(1):103\u2013108. https:\/\/doi.org\/10.1038\/s41391-023-00705-y","journal-title":"Prostate Cancer Prostatic Dis"},{"issue":"10","key":"10849_CR29","doi-asserted-by":"publisher","first-page":"1435","DOI":"10.1177\/1049732312452938","volume":"22","author":"A Cooke","year":"2012","unstructured":"Cooke A, Smith D, Booth A (2012) Beyond PICO: the SPIDER tool for qualitative evidence synthesis. Qual Health Res 22(10):1435\u20131443. https:\/\/doi.org\/10.1177\/1049732312452938","journal-title":"Qual Health Res"},{"key":"10849_CR30","doi-asserted-by":"publisher","DOI":"10.1093\/postmj\/qgad053","author":"R Cuthbert","year":"2023","unstructured":"Cuthbert R, Simpson AI (2023) Artificial intelligence in orthopaedics: can Chat Generative Pre-trained Transformer (ChatGPT) pass Section 1 of the Fellowship of the Royal College of Surgeons (Trauma & Orthopaedics) examination? Postgrad Med J. https:\/\/doi.org\/10.1093\/postmj\/qgad053","journal-title":"Postgrad Med J"},{"issue":"8","key":"10849_CR31","doi-asserted-by":"publisher","DOI":"10.7759\/cureus.42972","volume":"15","author":"AKD Dhanvijay","year":"2023","unstructured":"Dhanvijay AKD et al (2023) Performance of large language models (ChatGPT, Bing search, and google bard) in solving case vignettes in physiology. Cureus 15(8):e42972. 
https:\/\/doi.org\/10.7759\/cureus.42972","journal-title":"Cureus"},{"key":"10849_CR32","doi-asserted-by":"publisher","unstructured":"Driess D et al (2023) PaLM-E: an embodied multimodal language model. arXiv:2303.03378. https:\/\/doi.org\/10.48550\/arXiv.2303.03378","DOI":"10.48550\/arXiv.2303.03378"},{"issue":"12","key":"10849_CR33","doi-asserted-by":"publisher","DOI":"10.1371\/journal.pdig.0000397","volume":"2","author":"C Fang","year":"2023","unstructured":"Fang C et al (2023) How does ChatGPT4 preform on non-English national medical licensing examination? An evaluation in Chinese language. PLOS Digit Health 2(12):e0000397. https:\/\/doi.org\/10.1371\/journal.pdig.0000397","journal-title":"PLOS Digit Health"},{"issue":"11","key":"10849_CR34","doi-asserted-by":"publisher","first-page":"2717","DOI":"10.1007\/s11255-023-03729-4","volume":"55","author":"J Gabriel","year":"2023","unstructured":"Gabriel J et al (2023) The utility of the ChatGPT artificial intelligence tool for patient education and enquiry in robotic radical prostatectomy. Int Urol Nephrol 55(11):2717\u20132732. https:\/\/doi.org\/10.1007\/s11255-023-03729-4","journal-title":"Int Urol Nephrol"},{"key":"10849_CR35","doi-asserted-by":"publisher","DOI":"10.1007\/978-0-85729-763-1","volume-title":"Fundamentals of surgical simulation","author":"AG Gallagher","year":"2012","unstructured":"Gallagher AG, O\u2019Sullivan GC (2012) Fundamentals of surgical simulation. Springer, Cham"},{"issue":"1","key":"10849_CR36","doi-asserted-by":"publisher","first-page":"75","DOI":"10.1038\/s41746-023-00819-6","volume":"6","author":"CA Gao","year":"2023","unstructured":"Gao CA et al (2023) Comparing scientific abstracts generated by ChatGPT to real abstracts with detectors and blinded human reviewers. NPJ Digit Med 6(1):75. 
https:\/\/doi.org\/10.1038\/s41746-023-00819-6","journal-title":"NPJ Digit Med"},{"issue":"14","key":"10849_CR37","doi-asserted-by":"publisher","first-page":"3717","DOI":"10.3390\/cancers15143717","volume":"15","author":"G Gebrael","year":"2023","unstructured":"Gebrael G, Sahu KK, Chigarira B, Tripathi N, Mathew Thomas V, Sayegh N, Maughan BL, Agarwal N, Swami U, Li H (2023) Enhancing triage efficiency and accuracy in emergency rooms for patients with metastatic prostate cancer: a retrospective analysis of artificial intelligence-assisted triage using ChatGPT 4.0. Cancers (basel). 15(14):3717. https:\/\/doi.org\/10.3390\/cancers15143717","journal-title":"Cancers (basel)."},{"issue":"1","key":"10849_CR38","doi-asserted-by":"publisher","DOI":"10.1136\/bmjno-2023-000451","volume":"5","author":"P Giannos","year":"2023","unstructured":"Giannos P (2023a) Evaluating the limits of AI in medical specialisation: ChatGPT\u2019s performance on the UK neurology specialty certificate examination. BMJ Neurol Open 5(1):e000451. https:\/\/doi.org\/10.1136\/bmjno-2023-000451","journal-title":"BMJ Neurol Open"},{"key":"10849_CR39","doi-asserted-by":"publisher","DOI":"10.2196\/47737","volume":"9","author":"P Giannos","year":"2023","unstructured":"Giannos P, Delardas O (2023b) Performance of ChatGPT on UK standardized admission tests: insights from the BMAT, TMUA, LNAT, and TSA examinations. JMIR Med Educ 9:e47737. https:\/\/doi.org\/10.2196\/47737","journal-title":"JMIR Med Educ"},{"key":"10849_CR40","doi-asserted-by":"publisher","DOI":"10.2196\/45312","volume":"9","author":"A Gilson","year":"2023","unstructured":"Gilson A, Safranek CW, Huang T et al (2023) How does ChatGPT perform on the United States medical licensing examination? The implications of large language models for medical education and knowledge assessment. JMIR Med Educ 9:e45312. 
https:\/\/doi.org\/10.2196\/45312","journal-title":"JMIR Med Educ"},{"issue":"1","key":"10849_CR41","doi-asserted-by":"publisher","DOI":"10.1136\/bmjhci-2023-100775","volume":"30","author":"J Haemmerli","year":"2023","unstructured":"Haemmerli J et al (2023) ChatGPT in glioma adjuvant therapy decision making: ready to assume the role of a doctor in the tumour board? BMJ Health Care Inform 30(1):e100775. https:\/\/doi.org\/10.1136\/bmjhci-2023-100775","journal-title":"BMJ Health Care Inform"},{"key":"10849_CR42","doi-asserted-by":"publisher","unstructured":"Han T, et al (2023) MedAlpaca -- An Open-Source Collection of Medical Conversational AI Models and Training Data. arXiv: 2304.08247. https:\/\/doi.org\/10.48550\/arXiv.2304.08247","DOI":"10.48550\/arXiv.2304.08247"},{"key":"10849_CR43","doi-asserted-by":"publisher","unstructured":"Hatamizadeh A,\u00a0Tang Y,\u00a0Nath V,\u00a0et al (2021) UNETR: Transformers for 3D Medical Image Segmentation. arXiv:2103.10504v3. https:\/\/doi.org\/10.48550\/arXiv.2103.10504","DOI":"10.48550\/arXiv.2103.10504"},{"issue":"9","key":"10849_CR44","doi-asserted-by":"publisher","first-page":"4271","DOI":"10.1007\/s00405-023-08051-4","volume":"280","author":"CC Hoch","year":"2023","unstructured":"Hoch CC, Wollenberg B, L\u00fcers JC, Knoedler S, Knoedler L, Frank K, Cotofana S, Alfertshofer M (2023) ChatGPT\u2019s quiz skills in different otolaryngology subspecialties: an analysis of 2576 single-choice and multiple-choice board certification preparation questions. Eur Arch Otorhinolaryngol 280(9):4271\u20134278. 
https:\/\/doi.org\/10.1007\/s00405-023-08051-4","journal-title":"Eur Arch Otorhinolaryngol"},{"key":"10849_CR45","doi-asserted-by":"publisher","first-page":"1219326","DOI":"10.3389\/fonc.2023.1219326","volume":"13","author":"J Holmes","year":"2023","unstructured":"Holmes J, Liu Z, Zhang L, Ding Y, Sio TT, McGee LA, Ashman JB, Li X, Liu T, Shen J, Liu W (2023) Evaluating large language models on a highly-specialized topic, radiation oncology physics. Front Oncol 13:1219326. https:\/\/doi.org\/10.3389\/fonc.2023.1219326","journal-title":"Front Oncol"},{"key":"10849_CR46","doi-asserted-by":"publisher","DOI":"10.3171\/2023.2.JNS23419","author":"BS Hopkins","year":"2023","unstructured":"Hopkins BS, Nguyen VN, Dallas J, Texakalidis P, Yang M, Renn A, Guerra G, Kashif Z, Cheok S, Zada G, Mack WJ (2023) ChatGPT versus the neurosurgical written boards: a comparative analysis of artificial intelligence\/machine learning performance on neurosurgical board-style questions. J Neurosurg. https:\/\/doi.org\/10.3171\/2023.2.JNS23419","journal-title":"J Neurosurg"},{"key":"10849_CR47","doi-asserted-by":"publisher","DOI":"10.2196\/48433","volume":"9","author":"HY Hsu","year":"2023","unstructured":"Hsu HY et al (2023) Examining real-world medication consultations and drug-herb interactions: ChatGPT performance evaluation. JMIR Med Educ 9:e48433. https:\/\/doi.org\/10.2196\/48433","journal-title":"JMIR Med Educ"},{"key":"10849_CR48","doi-asserted-by":"publisher","first-page":"1265024","DOI":"10.3389\/fonc.2023.1265024","volume":"13","author":"Y Huang","year":"2023","unstructured":"Huang Y et al (2023) Benchmarking ChatGPT-4 on ACR radiation oncology in-training (TXIT) exam and red journal gray zone cases: potentials and challenges for AI-assisted medical education and decision making in radiation oncology. Front Oncol 13:1265024. 
https:\/\/doi.org\/10.3389\/fonc.2023.1265024","journal-title":"Front Oncol"},{"key":"10849_CR49","doi-asserted-by":"publisher","first-page":"1","DOI":"10.3352\/jeehp.2023.20.1","volume":"20","author":"S Huh","year":"2023","unstructured":"Huh S (2023) Are ChatGPT\u2019s knowledge and interpretation ability comparable to those of medical students in Korea for taking a parasitology examination?: a descriptive study. J Educ Eval Health Prof 20:1. https:\/\/doi.org\/10.3352\/jeehp.2023.20.1","journal-title":"J Educ Eval Health Prof"},{"issue":"4","key":"10849_CR50","doi-asserted-by":"publisher","first-page":"409","DOI":"10.1097\/UPJ.0000000000000406","volume":"10","author":"LM Huynh","year":"2023","unstructured":"Huynh LM, Bonebrake BT, Schultis K, Quach A, Deibert CM (2023) New artificial intelligence ChatGPT performs poorly on the 2022 self-assessment study program for urology. Urol Pract 10(4):409\u2013415. https:\/\/doi.org\/10.1097\/UPJ.0000000000000406","journal-title":"Urol Pract"},{"issue":"8","key":"10849_CR51","doi-asserted-by":"publisher","first-page":"563","DOI":"10.5005\/jp-journals-10071-24498","volume":"27","author":"J Jacob","year":"2023","unstructured":"Jacob J (2023) ChatGPT: friend or foe?-Utility in trauma triage. Indian J Crit Care Med 27(8):563\u2013566. https:\/\/doi.org\/10.5005\/jp-journals-10071-24498","journal-title":"Indian J Crit Care Med"},{"key":"10849_CR52","doi-asserted-by":"publisher","unstructured":"Jang D et al (2023) Exploring the Potential of Large Language models in Traditional Korean Medicine: A Foundation Model Approach to Culturally-Adapted Healthcare. arXiv:2303.17807. https:\/\/doi.org\/10.48550\/arXiv.2303.17807","DOI":"10.48550\/arXiv.2303.17807"},{"key":"10849_CR54","doi-asserted-by":"publisher","unstructured":"Kaarre J, et al (2023) Exploring the potential of ChatGPT as a supplementary tool for providing orthopaedic information. Knee Surg Sports Traumatol Arthrosc. 31(11):5190\u20135198. 
https:\/\/doi.org\/10.1007\/s00167-023-07529-2","DOI":"10.1007\/s00167-023-07529-2"},{"issue":"25","key":"10849_CR55","doi-asserted-by":"publisher","DOI":"10.1097\/MD.0000000000034068","volume":"102","author":"HJ Kao","year":"2023","unstructured":"Kao HJ et al (2023) Assessing ChatGPT\u2019s capacity for clinical decision support in pediatrics: a comparative study with pediatricians using KIDMAP of Rasch analysis. Medicine (baltimore) 102(25):e34068. https:\/\/doi.org\/10.1097\/MD.0000000000034068","journal-title":"Medicine (baltimore)"},{"key":"10849_CR56","doi-asserted-by":"publisher","first-page":"157","DOI":"10.1016\/j.pediatrneurol.2023.08.035","volume":"148","author":"C Karakas","year":"2023","unstructured":"Karakas C et al (2023) Leveraging ChatGPT in the pediatric neurology clinic: practical considerations for use to improve efficiency and outcomes. Pediatr Neurol 148:157\u2013163. https:\/\/doi.org\/10.1016\/j.pediatrneurol.2023.08.035","journal-title":"Pediatr Neurol"},{"key":"10849_CR57","doi-asserted-by":"publisher","unstructured":"Kasai J et al (2023) Evaluating GPT-4 and ChatGPT on Japanese Medical Licensing Examinations. arXiv:2303.18027. https:\/\/doi.org\/10.48550\/arXiv.2303.18027","DOI":"10.48550\/arXiv.2303.18027"},{"key":"10849_CR58","doi-asserted-by":"publisher","DOI":"10.31478\/202107a","author":"RS Kington","year":"2021","unstructured":"Kington RS, Arnesen S, Chou WYS, Curry SJ, Lazer D, Villarruel A (2021) Identifying credible sources of health information in social media: Principles and attributes. NAM Perspect. 
https:\/\/doi.org\/10.31478\/202107a","journal-title":"NAM Perspect"},{"issue":"3","key":"10849_CR59","doi-asserted-by":"publisher","first-page":"e13207","DOI":"10.1111\/bpa.13207","volume":"34","author":"S Koga","year":"2023","unstructured":"Koga S, Martin NB, Dickson DW (2023) Evaluating the performance of large language models: ChatGPT and Google Bard in generating differential diagnoses in clinicopathological conferences of neurodegenerative disorders. Brain Pathol 34(3):e13207. https:\/\/doi.org\/10.1111\/bpa.13207","journal-title":"Brain Pathol"},{"issue":"7","key":"10849_CR60","doi-asserted-by":"publisher","first-page":"374","DOI":"10.47102\/annals-acadmedsg.2023138","volume":"52","author":"SJQ Koh","year":"2023","unstructured":"Koh SJQ et al (2023) Leveraging ChatGPT to aid patient education on coronary angiogram. Ann Acad Med Singap 52(7):374\u2013377. https:\/\/doi.org\/10.47102\/annals-acadmedsg.2023138","journal-title":"Ann Acad Med Singap"},{"issue":"9","key":"10849_CR61","doi-asserted-by":"publisher","first-page":"1558","DOI":"10.1093\/jamia\/ocad104","volume":"30","author":"Y Kumah-Crystal","year":"2023","unstructured":"Kumah-Crystal Y, Mankowitz S, Embi P, Lehmann CU (2023) ChatGPT and the clinical informatics board examination: the end of unproctored maintenance of certification? J Am Med Inform Assoc. https:\/\/doi.org\/10.1093\/jamia\/ocad104","journal-title":"J Am Med Inform Assoc"},{"issue":"2","key":"10849_CR62","doi-asserted-by":"publisher","DOI":"10.1371\/journal.pdig.0000198","volume":"2","author":"TH Kung","year":"2023","unstructured":"Kung TH, Cheatham M, Medenilla A et al (2023) Performance of ChatGPT on USMLE: potential for AI-assisted medical education using large language models. PLOS Digit Health 2(2):e0000198. 
https:\/\/doi.org\/10.1371\/journal.pdig.0000198","journal-title":"PLOS Digit Health"},{"key":"10849_CR63","doi-asserted-by":"publisher","DOI":"10.1177\/10556656231193966","author":"MS Lebhar","year":"2023","unstructured":"Lebhar MS et al (2023) Dr. ChatGPT: utilizing artificial intelligence in surgical education. Cleft Palate Craniofac J. https:\/\/doi.org\/10.1177\/10556656231193966","journal-title":"Cleft Palate Craniofac J"},{"key":"10849_CR64","doi-asserted-by":"publisher","DOI":"10.1002\/ase.2270","author":"H Lee","year":"2023","unstructured":"Lee H (2023a) The rise of ChatGPT: exploring its potential in medical education. Anat Sci Educ. https:\/\/doi.org\/10.1002\/ase.2270","journal-title":"Anat Sci Educ"},{"key":"10849_CR65","doi-asserted-by":"publisher","DOI":"10.2196\/47427","volume":"9","author":"H Lee","year":"2023","unstructured":"Lee H (2023b) Using ChatGPT as a learning tool in acupuncture education: comparative study. JMIR Med Educ 9:e47427. https:\/\/doi.org\/10.2196\/47427","journal-title":"JMIR Med Educ"},{"issue":"13","key":"10849_CR66","doi-asserted-by":"publisher","first-page":"1233","DOI":"10.1056\/NEJMsr2214184","volume":"388","author":"P Lee","year":"2023","unstructured":"Lee P, Bubeck S, Petro J (2023) Benefits, limits, and risks of GPT-4 as an AI chatbot for medicine. N Engl J Med 388(13):1233\u20131239. 
https:\/\/doi.org\/10.1056\/NEJMsr2214184","journal-title":"N Engl J Med"},{"issue":"2","key":"10849_CR67","doi-asserted-by":"publisher","first-page":"172.e1","DOI":"10.1016\/j.ajog.2023.04.020","volume":"229","author":"SW Li","year":"2023","unstructured":"Li SW, Kemp MW, Logan SJS, Dimri PS, Singh N, Mattar CNZ, Dashraath P, Ramlal H, Mahyuddin AP, Kanayan S, Carter SWD, Thain SP, Fee EL, Illanes SE, Choolani MA, National University of Singapore Obstetrics and Gynecology Artificial Intelligence (NUS OBGYN-AI) Collaborative Group (2023c) ChatGPT outscored human candidates in a virtual objective structured clinical examination in obstetrics and gynecology. Am J Obstet Gynecol 229(2):172.e1-172.e12. https:\/\/doi.org\/10.1016\/j.ajog.2023.04.020","journal-title":"Am J Obstet Gynecol"},{"key":"10849_CR68","doi-asserted-by":"publisher","unstructured":"Li XL, Liang P (2021) Prefix-tuning: Optimizing continuous prompts for generation. arXiv:2101.00190. https:\/\/doi.org\/10.48550\/arXiv.2101.00190","DOI":"10.48550\/arXiv.2101.00190"},{"key":"10849_CR69","doi-asserted-by":"publisher","unstructured":"Li J, Li S, Savarese S, Hoi S (2023b) BLIP-2: bootstrapping language-image pre-training with frozen image encoders and large language models. arXiv:2301.12597. https:\/\/doi.org\/10.48550\/arXiv.2301.12597","DOI":"10.48550\/arXiv.2301.12597"},{"key":"10849_CR70","doi-asserted-by":"publisher","unstructured":"Li C (2023a) LLaVA-Med: training a large language-and-vision assistant for biomedicine in one day. arXiv:2306.00890. https:\/\/doi.org\/10.48550\/arXiv.2306.00890","DOI":"10.48550\/arXiv.2306.00890"},{"key":"10849_CR71","doi-asserted-by":"publisher","unstructured":"Li Y et al (2023d) ChatDoctor: a medical chat model fine-tuned on a large language model meta-AI (LLaMA) using medical domain knowledge. arXiv: 2303.14070v5. 
https:\/\/doi.org\/10.48550\/arXiv.2303.14070","DOI":"10.48550\/arXiv.2303.14070"},{"key":"10849_CR72","doi-asserted-by":"publisher","unstructured":"Li\u00e9vin V, Egeberg Hother C, Winther O (2022) Can large language models reason about medical questions? arXiv: 2207.08143. https:\/\/doi.org\/10.48550\/arXiv.2207.08143","DOI":"10.48550\/arXiv.2207.08143"},{"issue":"7","key":"10849_CR73","doi-asserted-by":"publisher","first-page":"1237","DOI":"10.1093\/jamia\/ocad072","volume":"30","author":"S Liu","year":"2023","unstructured":"Liu S et al (2023) Using AI-generated suggestions from ChatGPT to optimize clinical decision support. J Am Med Inform Assoc 30(7):1237\u20131245. https:\/\/doi.org\/10.1093\/jamia\/ocad072","journal-title":"J Am Med Inform Assoc"},{"issue":"9","key":"10849_CR74","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1145\/3560815","volume":"55","author":"P Liu","year":"2023","unstructured":"Liu P, Yuan W, Fu J, Jiang Z, Hayashi H, Neubig G (2023b) Pre-train, prompt, and predict: a systematic survey of prompting methods in natural language processing. ACM Comput Surveys 55(9):1\u201335. https:\/\/doi.org\/10.1145\/3560815","journal-title":"ACM Comput Surveys"},{"key":"10849_CR75","doi-asserted-by":"publisher","DOI":"10.1101\/2023.04.12.23288452","author":"X Liu","year":"2023","unstructured":"Liu X, Fang C, Wang J (2023c) Performance of ChatGPT on clinical medicine entrance examination for Chinese postgraduate in Chinese. medRxiv. https:\/\/doi.org\/10.1101\/2023.04.12.23288452","journal-title":"medRxiv"},{"issue":"1","key":"10849_CR76","doi-asserted-by":"publisher","first-page":"447","DOI":"10.1186\/s12967-023-04314-0","volume":"21","author":"X Liu","year":"2023","unstructured":"Liu X, Wu C, Lai R, Lin H, Xu Y, Lin Y, Zhang W (2023d) ChatGPT: when the artificial intelligence meets standardized patients in clinical training. J Transl Med 21(1):447. 
https:\/\/doi.org\/10.1186\/s12967-023-04314-0","journal-title":"J Transl Med"},{"key":"10849_CR77","doi-asserted-by":"publisher","first-page":"1062","DOI":"10.3233\/SHTI230347","volume":"302","author":"H Liu","year":"2023","unstructured":"Liu H et al (2023e) How good is ChatGPT for medication evidence synthesis? Stud Health Technol Inform 302:1062\u20131066. https:\/\/doi.org\/10.3233\/SHTI230347","journal-title":"Stud Health Technol Inform"},{"key":"10849_CR78","doi-asserted-by":"publisher","unstructured":"Liu H (2023a) Visual Instruction Tuning. arXiv:2304.08485. https:\/\/doi.org\/10.48550\/arXiv.2304.08485","DOI":"10.48550\/arXiv.2304.08485"},{"issue":"9","key":"10849_CR79","doi-asserted-by":"publisher","first-page":"1527","DOI":"10.1007\/s43465-023-00967-7","volume":"57","author":"K Lower","year":"2023","unstructured":"Lower K et al (2023) ChatGPT-4: Transforming Medical Education and Addressing Clinical Exposure Challenges in the Post-pandemic Era. Indian J Orthop 57(9):1527\u20131544. https:\/\/doi.org\/10.1007\/s43465-023-00967-7","journal-title":"Indian J Orthop"},{"issue":"8","key":"10849_CR80","doi-asserted-by":"publisher","first-page":"1623","DOI":"10.1097\/CORR.0000000000002704","volume":"481","author":"ZC Lum","year":"2023","unstructured":"Lum ZC (2023) Can Artificial Intelligence Pass the American Board of Orthopaedic Surgery Examination? Orthopaedic Residents Versus ChatGPT. Clin Orthop Relat Res 481(8):1623\u20131630","journal-title":"Clin Orthop Relat Res"},{"key":"10849_CR81","doi-asserted-by":"publisher","DOI":"10.1016\/j.jcjo.2023.07.016","author":"RJ Lyons","year":"2023","unstructured":"Lyons RJ et al (2023) Artificial intelligence chatbot performance in triage of ophthalmic conditions. Can J Ophthalmol. 
https:\/\/doi.org\/10.1016\/j.jcjo.2023.07.016","journal-title":"Can J Ophthalmol"},{"issue":"1","key":"10849_CR82","doi-asserted-by":"publisher","first-page":"9","DOI":"10.1186\/s42492-023-00136-5","volume":"6","author":"Q Lyu","year":"2023","unstructured":"Lyu Q et al (2023) Translating radiology reports into plain language using ChatGPT and GPT-4 with prompt learning: results, limitations, and potential. Vis Comput Ind Biomed Art 6(1):9. https:\/\/doi.org\/10.1186\/s42492-023-00136-5","journal-title":"Vis Comput Ind Biomed Art"},{"key":"10849_CR83","doi-asserted-by":"publisher","first-page":"01003","DOI":"10.7189\/jogh.13.01003","volume":"13","author":"C Macdonald","year":"2023","unstructured":"Macdonald C et al (2023) Can ChatGPT draft a research article? An example of population-level vaccine effectiveness analysis. J Glob Health 13:01003. https:\/\/doi.org\/10.7189\/jogh.13.01003","journal-title":"J Glob Health"},{"issue":"6","key":"10849_CR84","doi-asserted-by":"publisher","first-page":"589","DOI":"10.1001\/jamaophthalmol.2023.1144","volume":"141","author":"A Mihalache","year":"2023","unstructured":"Mihalache A, Popovic MM, Muni RH (2023) Performance of an artificial intelligence Chatbot in ophthalmic knowledge assessment. JAMA Ophthalmol 141(6):589\u2013597. https:\/\/doi.org\/10.1001\/jamaophthalmol.2023.1144","journal-title":"JAMA Ophthalmol"},{"issue":"5","key":"10849_CR85","doi-asserted-by":"publisher","first-page":"413","DOI":"10.1055\/s-0043-1772704","volume":"56","author":"DP Mohapatra","year":"2023","unstructured":"Mohapatra DP et al (2023) Leveraging large language models (LLM) for the plastic surgery resident training: do they have a role? Indian J Plast Surg 56(5):413\u2013420. 
https:\/\/doi.org\/10.1055\/s-0043-1772704","journal-title":"Indian J Plast Surg"},{"issue":"4","key":"10849_CR86","doi-asserted-by":"publisher","first-page":"482","DOI":"10.4103\/idoj.idoj_72_23","volume":"14","author":"H Mondal","year":"2023","unstructured":"Mondal H et al (2023) Using ChatGPT for writing articles for patients\u2019 education for dermatological diseases: a pilot study. Indian Dermatol Online J 14(4):482\u2013486. https:\/\/doi.org\/10.4103\/idoj.idoj_72_23","journal-title":"Indian Dermatol Online J"},{"issue":"7","key":"10849_CR87","doi-asserted-by":"publisher","first-page":"889","DOI":"10.1136\/bjophthalmol-2022-321141","volume":"106","author":"S Nath","year":"2022","unstructured":"Nath S, Marie A, Ellershaw S, Korot E, Keane PA (2022) New meaning for NLP: the trials and tribulations of natural language processing with GPT-3 in ophthalmology. Br J Ophthalmol 106(7):889\u2013892. https:\/\/doi.org\/10.1136\/bjophthalmol-2022-321141","journal-title":"Br J Ophthalmol"},{"issue":"10","key":"10849_CR88","doi-asserted-by":"publisher","first-page":"1004","DOI":"10.1016\/j.jacr.2023.06.008","volume":"20","author":"L Nazario-Johnson","year":"2023","unstructured":"Nazario-Johnson L, Zaki HA, Tung GA (2023) Use of large language models to predict neuroimaging. J Am Coll Radiol 20(10):1004\u20131009. https:\/\/doi.org\/10.1016\/j.jacr.2023.06.008","journal-title":"J Am Coll Radiol"},{"key":"10849_CR89","doi-asserted-by":"publisher","unstructured":"Nori H, King N, McKinney SM, Carignan D, Horvitz E (2023) Capabilities of gpt-4 on medical challenge problems. arXiv:2303.13375. 
https:\/\/doi.org\/10.48550\/arXiv.2303.13375","DOI":"10.48550\/arXiv.2303.13375"},{"issue":"5","key":"10849_CR90","doi-asserted-by":"publisher","first-page":"269","DOI":"10.4174\/astr.2023.104.5.269","volume":"104","author":"N Oh","year":"2023","unstructured":"Oh N, Choi GS, Lee WY (2023) ChatGPT goes to the operating room: evaluating GPT-4 performance and its potential in surgical education and training in the era of large language models. Ann Surg Treat Res 104(5):269\u2013273. https:\/\/doi.org\/10.4174\/astr.2023.104.5.269","journal-title":"Ann Surg Treat Res"},{"key":"10849_CR91","doi-asserted-by":"publisher","unstructured":"OpenAI. GPT-4 Technical report (2023). arXiv:2303.08774. https:\/\/doi.org\/10.48550\/arXiv.2303.08774","DOI":"10.48550\/arXiv.2303.08774"},{"key":"10849_CR92","doi-asserted-by":"publisher","unstructured":"Ouyang L, Wu J, Jiang X, et al (2022) Training language models to follow instructions with human feedback. arXiv:2203.02155. https:\/\/doi.org\/10.48550\/arXiv.2203.02155","DOI":"10.48550\/arXiv.2203.02155"},{"key":"10849_CR93","doi-asserted-by":"publisher","DOI":"10.1016\/j.ijsu.2021.105906","volume":"88","author":"MJ Page","year":"2021","unstructured":"Page MJ, McKenzie JE, Bossuyt PM, Boutron I, Hoffmann TC, Mulrow CD et al (2021) The PRISMA 2020 statement: an updated guideline for reporting systematic reviews. Int J Surg 88:105906","journal-title":"Int J Surg"},{"key":"10849_CR94","doi-asserted-by":"publisher","first-page":"llad197","DOI":"10.1093\/ced\/llad197","volume":"2","author":"L Passby","year":"2023","unstructured":"Passby L, Jenko N, Wernham A (2023) Performance of ChatGPT on dermatology specialty certificate examination multiple choice questions. Clin Exp Dermatol 2:llad197. 
https:\/\/doi.org\/10.1093\/ced\/llad197","journal-title":"Clin Exp Dermatol"},{"key":"10849_CR95","doi-asserted-by":"publisher","first-page":"67","DOI":"10.1016\/j.clinimag.2021.09.018","volume":"81","author":"K Reeder","year":"2022","unstructured":"Reeder K, Lee H (2022) Impact of artificial intelligence on US medical students\u2019 choice of radiology. Clin Imaging 81:67\u201371. https:\/\/doi.org\/10.1016\/j.clinimag.2021.09.018","journal-title":"Clin Imaging"},{"issue":"8","key":"10849_CR96","doi-asserted-by":"publisher","DOI":"10.7759\/cureus.43106","volume":"15","author":"A Rizwan","year":"2023","unstructured":"Rizwan A, Sadiq T (2023) The use of AI in diagnosing diseases and providing management plans: a consultation on cardiovascular disorders with ChatGPT. Cureus 15(8):e43106. https:\/\/doi.org\/10.7759\/cureus.43106","journal-title":"Cureus"},{"key":"10849_CR97","doi-asserted-by":"publisher","DOI":"10.1016\/j.surge.2023.07.001","author":"A Saad","year":"2023","unstructured":"Saad A, Iyengar KP, Kurisunkal V, Botchu R (2023) Assessing ChatGPT\u2019s ability to pass the FRCS orthopaedic part a exam: a critical analysis. Surgeon. https:\/\/doi.org\/10.1016\/j.surge.2023.07.001","journal-title":"Surgeon"},{"issue":"1","key":"10849_CR98","doi-asserted-by":"publisher","first-page":"e103","DOI":"10.52225\/narra.v3i1.103","volume":"3","author":"M Sallam","year":"2023","unstructured":"Sallam M et al (2023) ChatGPT applications in medical, dental, pharmacy, and public health education: a descriptive study highlighting the advantages and limitations. Narra J 3(1):e103. 
https:\/\/doi.org\/10.52225\/narra.v3i1.103","journal-title":"Narra J"},{"issue":"3","key":"10849_CR100","doi-asserted-by":"publisher","first-page":"156","DOI":"10.4103\/tjem.tjem_79_23","volume":"23","author":"\u0130 Sarbay","year":"2023","unstructured":"Sarbay \u0130 et al (2023) Performance of emergency triage prediction of an open access natural language processing based chatbot application (ChatGPT): a preliminary, scenario-based cross-sectional study. Turk J Emerg Med 23(3):156\u2013161. https:\/\/doi.org\/10.4103\/tjem.tjem_79_23","journal-title":"Turk J Emerg Med"},{"issue":"2","key":"10849_CR101","doi-asserted-by":"publisher","first-page":"384","DOI":"10.1097\/SLA.0000000000003220","volume":"272","author":"RM Satava","year":"2020","unstructured":"Satava RM, Stefanidis D, Levy JS et al (2020) Proving the effectiveness of the fundamentals of robotic surgery (FRS) skills curriculum: a single-blinded, multispecialty, multi-institutional randomized control trial. Ann Surg 272(2):384\u2013392. https:\/\/doi.org\/10.1097\/SLA.0000000000003220","journal-title":"Ann Surg"},{"issue":"1","key":"10849_CR102","doi-asserted-by":"publisher","first-page":"86","DOI":"10.1007\/s10143-023-01998-2","volume":"46","author":"UT Sevgi","year":"2023","unstructured":"Sevgi UT et al (2023) The role of an open artificial intelligence platform in modern neurosurgical education: a preliminary study. Neurosurg Rev 46(1):86. https:\/\/doi.org\/10.1007\/s10143-023-01998-2","journal-title":"Neurosurg Rev"},{"issue":"337","key":"10849_CR103","doi-asserted-by":"publisher","first-page":"337ra64","DOI":"10.1126\/scitranslmed.aad9398","volume":"8","author":"A Shademan","year":"2016","unstructured":"Shademan A, Decker RS, Opfermann JD, Leonard S, Krieger A, Kim PC (2016) Supervised autonomous robotic soft tissue surgery. Sci Transl Med 8(337):337ra64. 
https:\/\/doi.org\/10.1126\/scitranslmed.aad9398","journal-title":"Sci Transl Med"},{"key":"10849_CR104","doi-asserted-by":"publisher","unstructured":"Sharma P (2023) Performance of ChatGPT\u00a0on USMLE: unlocking the potential of\u00a0large\u00a0language\u00a0models\u00a0for AI-assisted\u00a0medical\u00a0education. arXiv: 2307.00112. https:\/\/doi.org\/10.48550\/arXiv.2307.00112","DOI":"10.48550\/arXiv.2307.00112"},{"issue":"2","key":"10849_CR105","doi-asserted-by":"publisher","first-page":"e31","DOI":"10.1016\/j.bja.2023.04.017","volume":"131","author":"D Shay","year":"2023","unstructured":"Shay D, Kumar B, Bellamy D, Palepu A, Dershwitz M, Walz JM, Schaefer MS, Beam A (2023) Assessment of ChatGPT success with specialty medical knowledge using anaesthesiology board examination practice questions. Br J Anaesth 131(2):e31\u2013e34. https:\/\/doi.org\/10.1016\/j.bja.2023.04.017","journal-title":"Br J Anaesth"},{"key":"10849_CR106","doi-asserted-by":"publisher","first-page":"j4008","DOI":"10.1136\/bmj.j4008","volume":"358","author":"BJ Shea","year":"2017","unstructured":"Shea BJ, Reeves BC, Wells G, Thuku M, Hamel C, Moran J, Moher D, Tugwell P, Welch V, Kristjansson E, Henry DA (2017) AMSTAR 2: a critical appraisal tool for systematic reviews that include randomised or non-randomised studies of healthcare interventions, or both. BMJ 358:j4008. https:\/\/doi.org\/10.1136\/bmj.j4008","journal-title":"BMJ"},{"key":"10849_CR107","doi-asserted-by":"publisher","unstructured":"Shihadeh J, Ackerman M, Troske A, Lawson N, Gonzalez E (2022) Brilliance bias in GPT-3. In\u00a02022 IEEE Global Humanitarian Technology Conference (GHTC)\u00a0(pp. 62\u201369). https:\/\/doi.org\/10.1109\/GHTC55712.2022.9910995","DOI":"10.1109\/GHTC55712.2022.9910995"},{"key":"10849_CR108","doi-asserted-by":"publisher","DOI":"10.1038\/s41586-023-06291-2","author":"K Singhal","year":"2023","unstructured":"Singhal K, Azizi S, Tu T et al (2023) Large language models encode clinical knowledge. Nature. 
https:\/\/doi.org\/10.1038\/s41586-023-06291-2","journal-title":"Nature"},{"key":"10849_CR109","doi-asserted-by":"publisher","unstructured":"Singhal K, et al (2023b) Towards Expert-Level Medical Question Answering with Large Language Models. arXiv:2305.09617. https:\/\/doi.org\/10.48550\/arXiv.2305.09617","DOI":"10.48550\/arXiv.2305.09617"},{"issue":"3","key":"10849_CR110","doi-asserted-by":"publisher","first-page":"279","DOI":"10.1093\/ehjdh\/ztad029","volume":"4","author":"I Skalidis","year":"2023","unstructured":"Skalidis I, Cagnina A, Luangphiphat W, Mahendiran T, Muller O, Abbe E, Fournier S (2023) ChatGPT takes on the European Exam in Core Cardiology: an artificial intelligence success story? Eur Heart J Digit Health 4(3):279\u2013281. https:\/\/doi.org\/10.1093\/ehjdh\/ztad029","journal-title":"Eur Heart J Digit Health"},{"key":"10849_CR111","doi-asserted-by":"publisher","DOI":"10.1111\/1742-6723.14280","author":"J Smith","year":"2023","unstructured":"Smith J, Choi PM, Buntine P (2023a) Will code one day run a code? Performance of language models on ACEM primary examinations and implications. Emerg Med Australas. https:\/\/doi.org\/10.1111\/1742-6723.14280","journal-title":"Emerg Med Australas"},{"issue":"8","key":"10849_CR112","doi-asserted-by":"publisher","first-page":"1882","DOI":"10.1177\/00207640231178451","volume":"69","author":"A Smith","year":"2023","unstructured":"Smith A et al (2023b) Old dog, new tricks? Exploring the potential functionalities of ChatGPT in supporting educational methods in social psychiatry. Int J Soc Psychiatry 69(8):1882\u20131889. https:\/\/doi.org\/10.1177\/00207640231178451","journal-title":"Int J Soc Psychiatry"},{"key":"10849_CR113","doi-asserted-by":"publisher","unstructured":"Stiennon N, Ouyang L, Wu J, Ziegler DM, Lowe R, Voss C, Radford A, Amodei D, and Christiano P (2022) Learning to summarize from human feedback. arXiv:2009.01325. 
https:\/\/doi.org\/10.48550\/arXiv.2009.01325","DOI":"10.48550\/arXiv.2009.01325"},{"issue":"7947","key":"10849_CR114","doi-asserted-by":"publisher","first-page":"214","DOI":"10.1038\/d41586-023-00340-6","volume":"614","author":"C Stokel-Walker","year":"2023","unstructured":"Stokel-Walker C, Van Noorden R (2023) What ChatGPT and generative AI mean for science. Nature 614(7947):214\u2013216. https:\/\/doi.org\/10.1038\/d41586-023-00340-6","journal-title":"Nature"},{"key":"10849_CR115","doi-asserted-by":"publisher","DOI":"10.1101\/2023.03.24.23287731","author":"E Strong","year":"2023","unstructured":"Strong E, DiGiammarino A, Weng Y, Basaviah P, Hosamani P, Kumar A, Nevins A, Kugler J, Hom J, Chen JH (2023) Performance of ChatGPT on free-response, clinical reasoning exams. medRxiv. https:\/\/doi.org\/10.1101\/2023.03.24.23287731","journal-title":"medRxiv"},{"key":"10849_CR116","doi-asserted-by":"publisher","DOI":"10.14309\/ajg.0000000000002320","author":"K Suchman","year":"2023","unstructured":"Suchman K, Garg S, Trindade AJ (2023) Chat generative pretrained transformer fails the multiple-choice American college of gastroenterology self-assessment test. Am J Gastroenterol. https:\/\/doi.org\/10.14309\/ajg.0000000000002320","journal-title":"Am J Gastroenterol"},{"key":"10849_CR117","doi-asserted-by":"publisher","DOI":"10.2196\/47305","volume":"6","author":"K Taira","year":"2023","unstructured":"Taira K, Itaya T, Hanada A (2023) Performance of the Large language model ChatGPT on the national nurse examinations in japan: evaluation study. JMIR Nurs 6:e47305. https:\/\/doi.org\/10.2196\/47305","journal-title":"JMIR Nurs"},{"key":"10849_CR118","doi-asserted-by":"publisher","first-page":"e48002","DOI":"10.2196\/48002","volume":"9","author":"S Takagi","year":"2023","unstructured":"Takagi S, Watari T, Erabi A, Sakaguchi K (2023) Performance of GPT-3.5 and GPT-4 on the Japanese medical licensing examination: comparison study. JMIR Med Educ. 9:e48002. 
https:\/\/doi.org\/10.2196\/48002","journal-title":"JMIR Med Educ"},{"key":"10849_CR119","doi-asserted-by":"publisher","unstructured":"Taylor R, Kardas M, Cucurull G, et al (2022) Galactica: A Large Language Model for Science. arXiv:2211.09085. https:\/\/doi.org\/10.48550\/arXiv.2211.09085","DOI":"10.48550\/arXiv.2211.09085"},{"key":"10849_CR120","doi-asserted-by":"publisher","DOI":"10.2196\/46599","volume":"9","author":"AJ Thirunavukarasu","year":"2023","unstructured":"Thirunavukarasu AJ, Hassan R, Mahmood S, Sanghera R, Barzangi K, El Mukashfi M, Shah S (2023) Trialling a large language model (ChatGPT) in general practice with the applied knowledge test: observational study demonstrating opportunities and limitations in primary care. JMIR Med Educ 9:e46599. https:\/\/doi.org\/10.2196\/46599","journal-title":"JMIR Med Educ"},{"key":"10849_CR121","doi-asserted-by":"publisher","unstructured":"Toma A, et al (2023) Clinical Camel: An Open-Source Expert-Level Medical Language Model with Dialogue-Based Knowledge Encoding. arXiv:2305.12031. https:\/\/doi.org\/10.48550\/arXiv.2305.12031","DOI":"10.48550\/arXiv.2305.12031"},{"issue":"10","key":"10849_CR122","doi-asserted-by":"publisher","first-page":"1321","DOI":"10.1007\/s00276-023-03229-1","volume":"45","author":"T Totlis","year":"2023","unstructured":"Totlis T et al (2023) The potential role of ChatGPT and artificial intelligence in anatomy education: a conversation with ChatGPT. Surg Radiol Anat 45(10):1321\u20131329. https:\/\/doi.org\/10.1007\/s00276-023-03229-1","journal-title":"Surg Radiol Anat"},{"key":"10849_CR123","doi-asserted-by":"publisher","unstructured":"Touvron H, Lavril T, Izacard G, et al (2023) LLaMA: Open and Efficient Foundation Language Models. arXiv:2302.13971. https:\/\/doi.org\/10.48550\/arXiv.2302.13971","DOI":"10.48550\/arXiv.2302.13971"},{"key":"10849_CR124","doi-asserted-by":"publisher","unstructured":"Tu T (2023) Towards Generalist Biomedical AI. arXiv:2307.14334. 
https:\/\/doi.org\/10.48550\/arXiv.2307.14334","DOI":"10.48550\/arXiv.2307.14334"},{"issue":"5","key":"10849_CR125","doi-asserted-by":"publisher","first-page":"298","DOI":"10.1016\/j.oftale.2023.04.011","volume":"98","author":"FJ Valent\u00edn-Bravo","year":"2023","unstructured":"Valent\u00edn-Bravo FJ et al (2023) Artificial Intelligence and new language models in Ophthalmology: complications of the use of silicone oil in vitreoretinal surgery. Arch Soc Esp Oftalmol (engl Ed) 98(5):298\u2013303. https:\/\/doi.org\/10.1016\/j.oftale.2023.04.011","journal-title":"Arch Soc Esp Oftalmol (engl Ed)"},{"key":"10849_CR126","doi-asserted-by":"publisher","unstructured":"Vaswani A, Shazeer N, Parmar N, Uszkoreit J, et al (2017) Attention is all you need. arXiv:1706.03762v5.\u00a0https:\/\/doi.org\/10.48550\/arXiv.1706.03762","DOI":"10.48550\/arXiv.1706.03762"},{"issue":"7","key":"10849_CR127","doi-asserted-by":"publisher","first-page":"653","DOI":"10.1097\/JCMA.0000000000000942","volume":"86","author":"YM Wang","year":"2023","unstructured":"Wang YM, Shen HW, Chen TJ (2023) Performance of ChatGPT on the pharmacist licensing examination in Taiwan. J Chin Med Assoc 86(7):653\u2013658. https:\/\/doi.org\/10.1097\/JCMA.0000000000000942","journal-title":"J Chin Med Assoc"},{"key":"10849_CR128","doi-asserted-by":"publisher","unstructured":"Wang X, et al (2022) Self-consistency improves chain of thought reasoning in language models. arXiv:2203.11171. https:\/\/doi.org\/10.48550\/arXiv.2203.11171","DOI":"10.48550\/arXiv.2203.11171"},{"key":"10849_CR129","doi-asserted-by":"publisher","unstructured":"Wang H, et al (2023a) HuaTuo: Tuning LLaMA Model with Chinese Medical Knowledge. arXiv:2304.06975. https:\/\/doi.org\/10.48550\/arXiv.2304.06975","DOI":"10.48550\/arXiv.2304.06975"},{"key":"10849_CR130","doi-asserted-by":"publisher","unstructured":"Wang S, et al (2023b) ChatCAD: Interactive Computer-Aided Diagnosis on Medical Image using Large Language Models. arXiv: 2302.07257. 
https:\/\/doi.org\/10.48550\/arXiv.2302.07257","DOI":"10.48550\/arXiv.2302.07257"},{"key":"10849_CR131","doi-asserted-by":"publisher","unstructured":"Wei J, et al (2022) Chain of thought prompting elicits reasoning in large language models. arXiv:2201.11903. https:\/\/doi.org\/10.48550\/arXiv.2201.11903","DOI":"10.48550\/arXiv.2201.11903"},{"key":"10849_CR132","doi-asserted-by":"publisher","DOI":"10.1097\/JCMA.0000000000000946","author":"TL Weng","year":"2023","unstructured":"Weng TL, Wang YM, Chang S, Chen TJ, Hwang SJ (2023) ChatGPT failed Taiwan\u2019s family medicine board exam. J Chin Med Assoc. https:\/\/doi.org\/10.1097\/JCMA.0000000000000946","journal-title":"J Chin Med Assoc"},{"key":"10849_CR133","doi-asserted-by":"publisher","first-page":"1681","DOI":"10.2147\/JMDH.S463128","volume":"17","author":"J Wu","year":"2024","unstructured":"Wu J et al (2024) The application of ChatGPT in medicine: a scoping review and bibliometric analysis. J Multidiscip Healthc 17:1681\u20131692. https:\/\/doi.org\/10.2147\/JMDH.S463128","journal-title":"J Multidiscip Healthc"},{"key":"10849_CR134","doi-asserted-by":"publisher","unstructured":"Wu C, et al (2023) PMC-LLaMA: Further Finetuning LLaMA on Medical Papers. arXiv: 2304.14454. https:\/\/doi.org\/10.48550\/arXiv.2304.14454","DOI":"10.48550\/arXiv.2304.14454"},{"issue":"6","key":"10849_CR135","doi-asserted-by":"publisher","first-page":"2360","DOI":"10.1007\/s00266-023-03443-7","volume":"47","author":"Y Xie","year":"2023","unstructured":"Xie Y et al (2023) Evaluation of the artificial intelligence Chatbot on breast reconstruction and its efficacy in surgical research: a case study. Aesthetic Plast Surg 47(6):2360\u20132369. 
https:\/\/doi.org\/10.1007\/s00266-023-03443-7","journal-title":"Aesthetic Plast Surg"},{"issue":"1\u20132","key":"10849_CR136","doi-asserted-by":"publisher","first-page":"68","DOI":"10.1111\/ans.18666","volume":"94","author":"Y Xie","year":"2024","unstructured":"Xie Y et al (2024) Investigating the impact of innovative AI chatbot on post-pandemic medical education and clinical assistance: a comprehensive analysis. ANZ J Surg 94(1\u20132):68\u201377. https:\/\/doi.org\/10.1111\/ans.18666","journal-title":"ANZ J Surg"},{"key":"10849_CR137","doi-asserted-by":"publisher","unstructured":"Xiong H, et al (2023) DoctorGLM: Fine-tuning your Chinese Doctor is not a Herculean Task. arXiv:2304.01097v2. https:\/\/doi.org\/10.48550\/arXiv.2304.01097","DOI":"10.48550\/arXiv.2304.01097"},{"issue":"4","key":"10849_CR138","doi-asserted-by":"publisher","first-page":"580","DOI":"10.1016\/j.jamcollsurg.2012.05.035","volume":"215","author":"B Zevin","year":"2012","unstructured":"Zevin B, Levy JS, Satava RM, Grantcharov TP (2012) A consensus-based framework for design, validation, and implementation of simulation-based training curricula in surgery. J Am Coll Surg 215(4):580-586.e3. https:\/\/doi.org\/10.1016\/j.jamcollsurg.2012.05.035","journal-title":"J Am Coll Surg"},{"issue":"4","key":"10849_CR139","doi-asserted-by":"publisher","first-page":"e37589","DOI":"10.7759\/cureus.37589","volume":"15","author":"Z Zhou","year":"2023","unstructured":"Zhou Z (2023) Evaluation of ChatGPT\u2019s capabilities in medical report generation. Cureus. 15(4):e37589. 
https:\/\/doi.org\/10.7759\/cureus.37589","journal-title":"Cureus."}],"container-title":["Artificial Intelligence Review"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s10462-024-10849-5.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s10462-024-10849-5\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s10462-024-10849-5.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,9,5]],"date-time":"2024-09-05T05:13:27Z","timestamp":1725513207000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s10462-024-10849-5"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,8,6]]},"references-count":137,"journal-issue":{"issue":"9","published-online":{"date-parts":[[2024,9]]}},"alternative-id":["10849"],"URL":"https:\/\/doi.org\/10.1007\/s10462-024-10849-5","relation":{},"ISSN":["1573-7462"],"issn-type":[{"value":"1573-7462","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,8,6]]},"assertion":[{"value":"3 July 2024","order":1,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"6 August 2024","order":2,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"The other authors have no competing interests to declare that are relevant to the content of this article. The authors have no relevant financial interests to disclose.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Competing interests"}}],"article-number":"231"}}