{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,1]],"date-time":"2026-04-01T19:32:20Z","timestamp":1775071940149,"version":"3.50.1"},"reference-count":15,"publisher":"Springer Science and Business Media LLC","issue":"4","license":[{"start":{"date-parts":[[2024,11,11]],"date-time":"2024-11-11T00:00:00Z","timestamp":1731283200000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2024,11,11]],"date-time":"2024-11-11T00:00:00Z","timestamp":1731283200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/100006942","name":"Medical University of South Carolina","doi-asserted-by":"crossref","id":[{"id":"10.13039\/100006942","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["J Digit Imaging. Inform. med."],"abstract":"<jats:title>Abstract<\/jats:title>\n          <jats:p>This study aimed to evaluate the accuracy and efficiency of ChatGPT-3.5, ChatGPT-4o, Google Gemini, and Google Gemini Advanced in generating CAD-RADS scores based on radiology reports. This retrospective study analyzed 100 consecutive coronary computed tomography angiography reports performed between March 15, 2024, and April 1, 2024, at a single tertiary center. Each report containing a radiologist-assigned CAD-RADS score was processed using four large language models (LLMs) without fine-tuning. The findings section of each report was input into the LLMs, and the models were tasked with generating CAD-RADS scores. The accuracy of LLM-generated scores was compared to the radiologist\u2019s score. Additionally, the time taken by each model to complete the task was recorded. 
Statistical analyses included Mann\u2013Whitney <jats:italic>U<\/jats:italic> test and interobserver agreement using unweighted Cohen\u2019s Kappa and Krippendorff\u2019s Alpha. ChatGPT-4o demonstrated the highest accuracy, correctly assigning CAD-RADS scores in 87% of cases (<jats:italic>\u03ba<\/jats:italic>\u2009=\u20090.838, <jats:italic>\u03b1<\/jats:italic>\u2009=\u20090.886), followed by Gemini Advanced with 82.6% accuracy (<jats:italic>\u03ba<\/jats:italic>\u2009=\u20090.784, <jats:italic>\u03b1<\/jats:italic>\u2009=\u20090.897). ChatGPT-3.5, although the fastest (median time\u2009=\u20095\u00a0s), was the least accurate (50.5% accuracy, <jats:italic>\u03ba<\/jats:italic>\u2009=\u20090.401, <jats:italic>\u03b1<\/jats:italic>\u2009=\u20090.787). Gemini exhibited a higher failure rate (12%) compared to the other models, with Gemini Advanced slightly improving upon its predecessor. ChatGPT-4o outperformed other LLMs in both accuracy and agreement with radiologist-assigned CAD-RADS scores, though ChatGPT-3.5 was significantly faster. 
Despite their potential, current publicly available LLMs require further refinement before being deployed for clinical decision-making in CAD-RADS scoring.<\/jats:p>","DOI":"10.1007\/s10278-024-01328-y","type":"journal-article","created":{"date-parts":[[2024,11,11]],"date-time":"2024-11-11T13:50:55Z","timestamp":1731333055000},"page":"2303-2311","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":14,"title":["ChatGPT vs Gemini: Comparative Accuracy and Efficiency in CAD-RADS Score Assignment from Radiology Reports"],"prefix":"10.1007","volume":"38","author":[{"given":"Matthew","family":"Silbergleit","sequence":"first","affiliation":[]},{"given":"Adrienn","family":"T\u00f3th","sequence":"additional","affiliation":[]},{"given":"Jordan H.","family":"Chamberlin","sequence":"additional","affiliation":[]},{"given":"Mohamed","family":"Hamouda","sequence":"additional","affiliation":[]},{"given":"Dhiraj","family":"Baruah","sequence":"additional","affiliation":[]},{"given":"Sydney","family":"Derrick","sequence":"additional","affiliation":[]},{"given":"U. Joseph","family":"Schoepf","sequence":"additional","affiliation":[]},{"given":"Jeremy R.","family":"Burt","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0002-8745-339X","authenticated-orcid":false,"given":"Ismail M.","family":"Kabakus","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2024,11,11]]},"reference":[{"key":"1328_CR1","doi-asserted-by":"publisher","first-page":"56","DOI":"10.1145\/3490443","volume":"65","author":"H Li","year":"2022","unstructured":"Li H: Language models: past, present, and future. 
Communications of the ACM 65:56-63, 2022","journal-title":"Communications of the ACM"},{"key":"1328_CR2","doi-asserted-by":"publisher","first-page":"80","DOI":"10.4274\/dir.2023.232417","volume":"30","author":"T AkinciD'Antonoli","year":"2024","unstructured":"Akinci D'Antonoli T, et al.: Large language models in radiology: fundamentals, applications, ethical considerations, risks, and future directions. Diagn Interv Radiol 30:80-90, 2024","journal-title":"Diagn Interv Radiol"},{"key":"1328_CR3","doi-asserted-by":"publisher","first-page":"1930","DOI":"10.1038\/s41591-023-02448-8","volume":"29","author":"AJ Thirunavukarasu","year":"2023","unstructured":"Thirunavukarasu AJ, Ting DSJ, Elangovan K, Gutierrez L, Tan TF, Ting DSW: Large language models in medicine. Nat Med 29:1930-1940, 2023","journal-title":"Nat Med"},{"key":"1328_CR4","doi-asserted-by":"publisher","first-page":"1225","DOI":"10.3348\/kjr.2020.1210","volume":"22","author":"JH Yoon","year":"2021","unstructured":"Yoon JH, Kim EK: Deep Learning-Based Artificial Intelligence for Mammography. Korean J Radiol 22:1225-1239, 2021","journal-title":"Korean J Radiol"},{"key":"1328_CR5","doi-asserted-by":"publisher","DOI":"10.1016\/j.compbiomed.2024.108290","volume":"172","author":"Y Tan","year":"2024","unstructured":"Tan Y, et al.: MedChatZH: A tuning LLM for traditional Chinese medicine consultations. Comput Biol Med 172:108290, 2024","journal-title":"Comput Biol Med"},{"key":"1328_CR6","doi-asserted-by":"crossref","unstructured":"Liu Z, et al.: Tailoring Large Language Models to Radiology: A Preliminary Approach to LLM Adaptation for a Highly Specialized Domain. 
Machine Learning in Medical Imaging 14348:464\u2013473, 2023","DOI":"10.1007\/978-3-031-45673-2_46"},{"key":"1328_CR7","doi-asserted-by":"publisher","first-page":"135","DOI":"10.1038\/s41746-023-00879-8","volume":"6","author":"M Wornow","year":"2023","unstructured":"Wornow M, et al.: The shaky foundations of large language models and foundation models for electronic health records. NPJ Digit Med 6:135, 2023","journal-title":"NPJ Digit Med"},{"key":"1328_CR8","doi-asserted-by":"publisher","first-page":"e232133","DOI":"10.1148\/radiol.232133","volume":"311","author":"A Cozzi","year":"2024","unstructured":"Cozzi A, et al.: BI-RADS Category Assignments by GPT-3.5, GPT-4, and Google Bard: A Multilanguage Study. Radiology 311:e232133, 2024","journal-title":"Radiology"},{"key":"1328_CR9","doi-asserted-by":"publisher","first-page":"1390774","DOI":"10.3389\/fradi.2024.1390774","volume":"4","author":"P Fervers","year":"2024","unstructured":"Fervers P, et al.: ChatGPT yields low accuracy in determining LI-RADS scores based on free-text and structured radiology reports in German language. Front Radiol 4:1390774, 2024","journal-title":"Front Radiol"},{"key":"1328_CR10","doi-asserted-by":"publisher","first-page":"e220183","DOI":"10.1148\/ryct.220183","volume":"4","author":"RC Cury","year":"2022","unstructured":"Cury RC, et al.: CAD-RADS\u2122 2.0 - 2022 Coronary Artery Disease - Reporting and Data System An Expert Consensus Document of the Society of Cardiovascular Computed Tomography (SCCT), the American College of Cardiology (ACC), the American College of Radiology (ACR) and the North America Society of Cardiovascular Imaging (NASCI). 
Radiol Cardiothorac Imaging 4:e220183, 2022","journal-title":"Radiol Cardiothorac Imaging"},{"key":"1328_CR11","doi-asserted-by":"publisher","first-page":"125","DOI":"10.1016\/j.jcct.2017.11.014","volume":"12","author":"CD Maroules","year":"2018","unstructured":"Maroules CD, et al.: Coronary artery disease reporting and data system (CAD-RADS\u2122): inter-observer agreement for assessment categories and modifiers. Journal of cardiovascular computed tomography 12:125-130, 2018","journal-title":"Journal of cardiovascular computed tomography"},{"key":"1328_CR12","doi-asserted-by":"publisher","first-page":"176","DOI":"10.1186\/s13244-022-01286-5","volume":"13","author":"D Ippolito","year":"2022","unstructured":"Ippolito D, et al.: Inter-observer agreement and image quality of model-based algorithm applied to the Coronary Artery Disease-Reporting and Data System score. Insights Imaging 13:176, 2022","journal-title":"Insights Imaging"},{"key":"1328_CR13","doi-asserted-by":"publisher","first-page":"1010","DOI":"10.1016\/j.jacr.2023.07.010","volume":"20","author":"NS Patil","year":"2023","unstructured":"Patil NS, Huang RS, van der Pol CB, Larocque N: Using artificial intelligence chatbots as a radiologic decision-making tool for liver imaging: Do ChatGPT and Bard communicate information consistent with the ACR Appropriateness Criteria? Journal of the American College of Radiology: JACR 20:1010-1013, 2023","journal-title":"Journal of the American College of Radiology: JACR"},{"key":"1328_CR14","doi-asserted-by":"publisher","DOI":"10.1148\/radiol.230922","volume":"307","author":"AA Rahsepar","year":"2023","unstructured":"Rahsepar AA, Tavakoli N, Kim GHJ, Hassani C, Abtin F, Bedayat A: How AI Responds to Common Lung Cancer Questions: ChatGPT vs Google Bard. 
Radiology 307:e230922, 2023","journal-title":"Radiology"},{"key":"1328_CR15","doi-asserted-by":"publisher","first-page":"2525","DOI":"10.1007\/s11277-024-10886-x","volume":"133","author":"FS Alotaibi","year":"2023","unstructured":"Alotaibi FS, Kaur N: Radiological Report Generation from Chest X-ray Images Using Pre-trained Word Embeddings. Wireless Personal Communications 133:2525-2540, 2023","journal-title":"Wireless Personal Communications"}],"container-title":["Journal of Imaging Informatics in Medicine"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s10278-024-01328-y.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s10278-024-01328-y\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s10278-024-01328-y.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,9,6]],"date-time":"2025-09-06T00:39:26Z","timestamp":1757119166000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s10278-024-01328-y"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,11,11]]},"references-count":15,"journal-issue":{"issue":"4","published-online":{"date-parts":[[2025,8]]}},"alternative-id":["1328"],"URL":"https:\/\/doi.org\/10.1007\/s10278-024-01328-y","relation":{},"ISSN":["2948-2933"],"issn-type":[{"value":"2948-2933","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,11,11]]},"assertion":[{"value":"11 September 2024","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"29 October 2024","order":2,"name":"revised","label":"Revised","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"30 
October 2024","order":3,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"11 November 2024","order":4,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"Uwe Joseph Schoepf MD received institutional research support and\/or personal fees from Bayer, Bracco, Elucid Bioimaging, Guerbet, HeartFlow, Keya Medical, and Siemens. Jeremy R. Burt is the owner of YellowDot Innovations. The rest of the authors have no conflicts of interest to declare. All co-authors have seen and agree with the contents of the manuscript and there is no financial interest to report.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Conflict of Interest"}}]}}