{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,26]],"date-time":"2026-02-26T15:31:04Z","timestamp":1772119864596,"version":"3.50.1"},"reference-count":39,"publisher":"Association for Computing Machinery (ACM)","issue":"2","license":[{"start":{"date-parts":[[2025,4,24]],"date-time":"2025-04-24T00:00:00Z","timestamp":1745452800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"name":"Mayo Clinic Arizona Center for Digital Health (CHD) Dalio AIM\/ML Enablement Award"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Comput. Healthcare"],"published-print":{"date-parts":[[2025,4,30]]},"abstract":"<jats:p>Education-level or socioeconomic background of patients may dictate their ability to understand medical jargon. Inability to understand primary findings from a radiology report may lead to unnecessary anxiety among patients or missed follow-up. We aim to meet this challenge by developing a patient-sensitive summarization model for radiology reports. We selected computed tomography (CT) exams of the chest as a use-case and collected 7,000 studies from Mayo Clinic. The summarization model was built on top of the T5 large language model (LLM) as our experiments indicated that its text-to-text transfer architecture was suited for abstractive text summarization, resulting in a model with 0.77B trainable parameters. Noisy ground truth for model training was collected by prompting the LLaMA-13B model. We recruited experts (board-certified radiologists) and laymen to manually evaluate model-generated summaries. Our model rarely missed information as marked by majority opinion of radiologists. Laymen indicated 63% improvement in their understanding by reading model-generated layman summaries. 
Comparison with zero-shot performance of ChatGPT indicated that the proposed model reduced the rate of hallucination by half and rate of missing important information by fivefold. The proposed model can generate reliable summaries for radiology reports understandable by patients with vastly different levels of medical knowledge.<\/jats:p>","DOI":"10.1145\/3709154","type":"journal-article","created":{"date-parts":[[2024,12,21]],"date-time":"2024-12-21T08:50:17Z","timestamp":1734771017000},"page":"1-15","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":5,"title":["Patient-centric Summarization of Radiology Findings Using Two-step Training of Large Language Models"],"prefix":"10.1145","volume":"6","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-5932-2491","authenticated-orcid":false,"given":"Amara","family":"Tariq","sequence":"first","affiliation":[{"name":"Mayo Clinic Arizona, Scottsdale, Arizona, United States"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-0526-7970","authenticated-orcid":false,"given":"Shubham","family":"Trivedi","sequence":"additional","affiliation":[{"name":"Mayo Clinic Arizona, Scottsdale, Arizona, United States"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-6521-2512","authenticated-orcid":false,"given":"Aisha","family":"Urooj","sequence":"additional","affiliation":[{"name":"Mayo Clinic Arizona, Scottsdale, Arizona, United States"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-8401-8061","authenticated-orcid":false,"given":"Gokul","family":"Ramasamy","sequence":"additional","affiliation":[{"name":"Mayo Clinic Arizona, Scottsdale, Arizona, United States"}]},{"ORCID":"https:\/\/orcid.org\/0009-0002-7138-9386","authenticated-orcid":false,"given":"Sam","family":"Fathizadeh","sequence":"additional","affiliation":[{"name":"University of Illinois College of Medicine, Chicago, IL, United 
States"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-4693-2239","authenticated-orcid":false,"given":"Matthew","family":"Stib","sequence":"additional","affiliation":[{"name":"Mayo Clinic Arizona, Scottsdale, Arizona, United States"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-9636-6944","authenticated-orcid":false,"given":"Nelly","family":"Tan","sequence":"additional","affiliation":[{"name":"Mayo Clinic Arizona, Scottsdale, Arizona, United States"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-5157-9903","authenticated-orcid":false,"given":"Bhavik","family":"Patel","sequence":"additional","affiliation":[{"name":"Mayo Clinic Arizona, Scottsdale, Arizona, United States"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-3327-8004","authenticated-orcid":false,"given":"Imon","family":"Banerjee","sequence":"additional","affiliation":[{"name":"Mayo Clinic Arizona, Scottsdale, Arizona, United States"}]}],"member":"320","published-online":{"date-parts":[[2025,4,24]]},"reference":[{"key":"e_1_3_1_2_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.imu.2021.100557"},{"issue":"2","key":"e_1_3_1_3_2","first-page":"e35179","article-title":"Artificial hallucinations in ChatGPT: Implications in scientific writing","volume":"15","author":"Alkaissi Hussam","year":"2023","unstructured":"Hussam Alkaissi, and Samy I. McFarlane. 2023. Artificial hallucinations in ChatGPT: Implications in scientific writing. Cureus 15, 2 (2023), e35179.","journal-title":"Cureus"},{"issue":"5","key":"e_1_3_1_4_2","doi-asserted-by":"crossref","first-page":"522","DOI":"10.1177\/1077558714541480","article-title":"Examining the role of patient experience surveys in measuring health care quality","volume":"71","author":"Price Rebecca Anhang","year":"2014","unstructured":"Rebecca Anhang Price, Marc N. Elliott, Alan M. Zaslavsky, Ron D. Hays, William G. Lehrman, Lise Rybowski, Susan Edgman-Levitan, and Paul D. Cleary. 2014. Examining the role of patient experience surveys in measuring health care quality. 
Medical Care Research Review 71, 5 (2014), 522\u2013554.","journal-title":"Medical Care Research Review"},{"issue":"1","key":"e_1_3_1_5_2","doi-asserted-by":"crossref","first-page":"105","DOI":"10.1007\/s10278-022-00712-w","article-title":"Natural language processing model for identifying critical findings\u2014A multi-institutional study","volume":"36","author":"Banerjee Imon","year":"2023","unstructured":"Imon Banerjee, Melissa A. Davis, Brianna L. Vey, Sina Mazaheri, Fiza Khan, Vaz Zavaletta, Roger Gerard, Judy Wawira Gichoya, and Bhavik Patel. 2023. Natural language processing model for identifying critical findings\u2014A multi-institutional study. Journal of Digital Imaging. 36, 1, 105\u2013113.","journal-title":"Journal of Digital Imaging"},{"key":"e_1_3_1_6_2","doi-asserted-by":"crossref","first-page":"1898","DOI":"10.1016\/j.jacr.2024.06.018","article-title":"The impact of large language model-generated radiology report summaries on patient comprehension: A randomized controlled trial","volume":"21","author":"Berigan Kayla","year":"2024","unstructured":"Kayla Berigan, Ryan Short, David Reisman, Laura McCray, Joan Skelly, Kimberly Jones, Nicholas T. Befera, and Naiim Ali. 2024. The impact of large language model-generated radiology report summaries on patient comprehension: A randomized controlled trial. Journal American. College of Radiology 21 (2024), 1898\u20131903.","journal-title":"Journal American. College of Radiology"},{"key":"e_1_3_1_7_2","first-page":"845","article-title":"Chestxraybert: A pretrained language model for chest radiology report summarization","volume":"25","author":"Cai Xiaoyan","year":"2021","unstructured":"Xiaoyan Cai, Sen Liu, Junwei Han, Libin Yang, Zhenguo Liu, and Tianming Liu. 2021. Chestxraybert: A pretrained language model for chest radiology report summarization. 
IEEE Transactions on Multimedia 25 (2021), 845\u2013855.","journal-title":"IEEE Transactions on Multimedia"},{"key":"e_1_3_1_8_2","doi-asserted-by":"crossref","first-page":"5","DOI":"10.1016\/j.clinimag.2022.07.006","article-title":"Patient-level factors influencing adherence to follow-up imaging recommendations","volume":"90","author":"Calvillo Andr\u00e9s \u00c1ngel-Gonz\u00e1lez","year":"2022","unstructured":"Andr\u00e9s \u00c1ngel-Gonz\u00e1lez Calvillo, Laura Caroline Kodaverdian, Roxana Garcia, Daphne Y. Lichtensztajn, and Matthew D. Bucknor. 2022. Patient-level factors influencing adherence to follow-up imaging recommendations. Clinical Imaging 90 (2022), 5\u201310.","journal-title":"Clinical Imaging"},{"issue":"10","key":"e_1_3_1_9_2","doi-asserted-by":"crossref","first-page":"1459","DOI":"10.1001\/jamaoncol.2023.2954","article-title":"Use of artificial intelligence chatbots for cancer treatment information","volume":"9","author":"Chen Shan","year":"2023","unstructured":"Shan Chen, Benjamin H. Kann, Michael B. Foote, Hugo J. W. L. Aerts, Guergana K. Savova, Raymond H. Mak, and Danielle S. Bitterman. 2023. Use of artificial intelligence chatbots for cancer treatment information. JAMA Oncology. 9, 10 (2023), 1459\u20131462.","journal-title":"JAMA Oncology"},{"key":"e_1_3_1_10_2","doi-asserted-by":"publisher","unstructured":"Chieh-Ju Chao Imon Banerjee Andrew Tseng Garvan C. Kane and Chieh-Ju Chao. [n. d.]. EchoGPT: A large language model for echocardiography report summarization. medRxiv. DOI: 10.1101\/2024.01.18.24301503","DOI":"10.1101\/2024.01.18.24301503"},{"key":"e_1_3_1_11_2","first-page":"31","article-title":"A robust two-step adversarial debiasing with partial learning: medical image case-studies","volume":"12469","author":"Correa Ramon","year":"2023","unstructured":"Ramon Correa, Jiwoong Jason Jeong, Bhavik Patel, Hari Trivedi, Judy W. Gichoya, and Imon Banerjee. 2023. 
A robust two-step adversarial debiasing with partial learning: medical image case-studies. In Medical Imaging 2023: Imaging Informatics for Healthcare, Research, and Applications, Vol. 12469. SPIE, 31\u201338.","journal-title":"Medical Imaging 2023: Imaging Informatics for Healthcare, Research, and Applications"},{"key":"e_1_3_1_12_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.jbi.2023.104548"},{"key":"e_1_3_1_13_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2021.bionlp-1.11"},{"issue":"4","key":"e_1_3_1_14_2","first-page":"CAT\u201321","article-title":"Preventing delayed and missed care by applying artificial intelligence to trigger radiology imaging follow-up","volume":"3","author":"Domingo Jane","year":"2022","unstructured":"Jane Domingo, Galal Galal, Jonathan Huang, Priyanka Soni, Vladislav Mukhin, Camila Altman, Tom Bayer, Thomas Byrd, Stacey Caron, Patrick Creamer. 2022. Preventing delayed and missed care by applying artificial intelligence to trigger radiology imaging follow-up. NEJM Catalyst Innovations in Care Delivery 3, 4 (2022), CAT\u201321.","journal-title":"NEJM Catalyst Innovations in Care Delivery"},{"issue":"4","key":"e_1_3_1_15_2","doi-asserted-by":"crossref","first-page":"744","DOI":"10.1109\/TAI.2021.3086435","article-title":"Customized impression prediction from radiology reports using BERT and LSTMS","volume":"4","author":"Gundogdu Batuhan","year":"2021","unstructured":"Batuhan Gundogdu, Utku Pamuksuz, Jonathan H. Chung, Jessica M. Telleria, Peng Liu, Farrukh Khan, and Paul J. Chang. 2021. Customized impression prediction from radiology reports using BERT and LSTMS. IEEE Transactions on Artificial Intelligence 4, 4 (2021), 744\u2013753.","journal-title":"IEEE Transactions on Artificial Intelligence"},{"key":"e_1_3_1_16_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR52729.2023.01046"},{"key":"e_1_3_1_17_2","unstructured":"Tianyu Han Lisa C. 
Adams Jens-Michalis Papaioannou Paul Grundmann Tom Oberhauser Alexander L\u00f6ser Daniel Truhn and Keno K. Bressem. 2023. MedAlpaca\u2014An open-source collection of medical conversational AI models and training data. arXiv:2304.08247. Retrieved from https:\/\/arxiv.org\/abs\/2304.08247"},{"key":"e_1_3_1_18_2","unstructured":"Edward J. Hu Yelong Shen Phillip Wallis Zeyuan Allen-Zhu Yuanzhi Li Shean Wang Lu Wang and Weizhu Chen. 2021. Lora: Low-rank adaptation of large language models. arXiv:2106.09685. Retrieved from https:\/\/arxiv.org\/abs\/2106.09685"},{"key":"e_1_3_1_19_2","first-page":"9118","volume-title":"Proceedings of International Conference on Machine Learning","author":"Huang Wenlong","year":"2022","unstructured":"Wenlong Huang, Pieter Abbeel, Deepak Pathak, and Igor Mordatch. 2022. Language models as zero-shot planners: Extracting actionable knowledge for embodied agents. In Proceedings of International Conference on Machine Learning. PMLR, 9118\u20139147."},{"key":"e_1_3_1_20_2","first-page":"22199","article-title":"Large language models are zero-shot reasoners","volume":"35","author":"Kojima Takeshi","year":"2022","unstructured":"Takeshi Kojima, Shixiang, Shane Gu, Machel Reid, Yutaka Matsuo, and Yusuke Iwasawa. 2022. Large language models are zero-shot reasoners. Advances in Neural Information Processing Systems 35 (2022), 22199\u201322213.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_1_21_2","volume-title":"Exploring ChatGPT Capabilities and Limitations: A Critical Review of the NLP Game Changer","author":"Koubaa Anis","year":"2023","unstructured":"Anis Koubaa, Wadii Boulila, Lahouari Ghouti, Ayyub Alzahem, and Shahid Latif. 2023. Exploring ChatGPT Capabilities and Limitations: A Critical Review of the NLP Game Changer. IEEE Access."},{"key":"e_1_3_1_22_2","unstructured":"Xize Liang Chao Chen Jie Wang Yue Wu Zhihang Fu Zhihao Shi Feng Wu and Jieping Ye. 2024. 
Robust preference optimization with provable noise tolerance for LLMS. arXiv:2404.04102. Retrieved from https:\/\/arxiv.org\/abs\/2404.04102"},{"key":"e_1_3_1_23_2","doi-asserted-by":"publisher","DOI":"10.1093\/bib\/bbac409"},{"key":"e_1_3_1_24_2","unstructured":"Chong Ma Zihao Wu Jiaqi Wang Shaochen Xu Yaonai Wei Zhengliang Liu Lei Guo Xiaoyan Cai Shu Zhang Tuo Zhang. 2023. ImpressionGPT: An iterative optimizing framework for radiology report summarization with ChatGPT. arXiv:2304.08448. Retrieved from https:\/\/arxiv.org\/abs\/2304.08448"},{"issue":"6","key":"e_1_3_1_25_2","doi-asserted-by":"crossref","first-page":"1287","DOI":"10.2214\/AJR.18.20586","article-title":"Automated tracking of follow-up imaging recommendations","volume":"212","author":"Mabotuwana Thusitha","year":"2019","unstructured":"Thusitha Mabotuwana, Christopher S. Hall, Vadiraj Hombal, Prashanth Pai, Usha Nandini Raghavan, Shawn Regis, Brady McKee, Sandeep Dalal, Christoph Wald, and Martin L. Gunn. 2019. Automated tracking of follow-up imaging recommendations. American Journal of Roentgenology 212, 6 (2019), 1287\u20131294.","journal-title":"American Journal of Roentgenology"},{"key":"e_1_3_1_26_2","first-page":"1196","volume-title":"AMIA Annual Symposium Proceedings","volume":"2017","author":"Mabotuwana Thusitha","year":"2017","unstructured":"Thusitha Mabotuwana, Christopher S. Hall, Joel Tieder, and Martin L. Gunn. 2017. Improving quality of follow-up imaging recommendations in radiology. In AMIA Annual Symposium Proceedings, Vol. 2017. American Medical Informatics Association, 1196."},{"key":"e_1_3_1_27_2","doi-asserted-by":"crossref","first-page":"116","DOI":"10.1016\/j.clinimag.2018.12.006","article-title":"Readability of radiology reports: implications for patient-centered care","volume":"54","author":"Martin-Carreras Teresa","year":"2019","unstructured":"Teresa Martin-Carreras, Tessa S. Cook, and Charles E. Kahn. Jr. 2019. 
Readability of radiology reports: implications for patient-centered care. Clinical Imaging 54, 2019, 116\u2013120.","journal-title":"Clinical Imaging"},{"key":"e_1_3_1_28_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR52729.2023.01456"},{"key":"e_1_3_1_29_2","first-page":"1","volume-title":"JALT 2013 Conference Proceedings","author":"Nemoto Tomoko","year":"2014","unstructured":"Tomoko Nemoto and David Beglar. 2014. Likert-scale questionnaires. In JALT 2013 Conference Proceedings, 1\u20138."},{"issue":"8","key":"e_1_3_1_30_2","first-page":"9","article-title":"Language models are unsupervised multitask learners","volume":"1","author":"Radford Alec","year":"2019","unstructured":"Alec Radford, Jeffrey Wu, Rewon Child, David Luan, Dario Amodei, Ilya Sutskever. 2019. Language models are unsupervised multitask learners. OpenAI Blog 1, 8 (2019), 9.","journal-title":"OpenAI Blog"},{"key":"e_1_3_1_31_2","doi-asserted-by":"publisher","DOI":"10.5555\/3455716.3455856"},{"key":"e_1_3_1_32_2","doi-asserted-by":"publisher","DOI":"10.3390\/healthcare11060887"},{"key":"e_1_3_1_33_2","unstructured":"Noam Shazeer. 2020. Glu variants improve transformer. arXiv:2002.05202. Retrieved from https:\/\/arxiv.org\/abs\/2002.05202"},{"issue":"5","key":"e_1_3_1_34_2","doi-asserted-by":"crossref","first-page":"e231259","DOI":"10.1148\/radiol.231259","article-title":"Evaluating GPT-4 on impressions generation in radiology reports","volume":"307","author":"Sun Zhaoyi","year":"2023","unstructured":"Zhaoyi Sun, Hanley Ong, Patrick Kennedy, Liyan Tang, Shirley Chen, Jonathan Elias, Eugene Lucas, George Shih, and Yifan Peng. 2023. Evaluating GPT-4 on impressions generation in radiology reports. Radiology. 307, 5 (2023), e231259.","journal-title":"Radiology"},{"key":"e_1_3_1_35_2","doi-asserted-by":"crossref","unstructured":"Liyan Tang Igor Shalyminov Amy Wing-Mei Wong Jon Burnsky Jake W. Vincent Yu\u2019an Yang Siffi Singh Song Feng Hwanjun Song Hang Su. 2024. 
Tofueval: Evaluating hallucinations of LLMS on topic-focused dialogue summarization. arXiv:2402.13249. Retrieved from https:\/\/arxiv.org\/abs\/2402.13249","DOI":"10.18653\/v1\/2024.naacl-long.251"},{"key":"e_1_3_1_36_2","doi-asserted-by":"publisher","DOI":"10.1038\/s41591-023-02448-8"},{"key":"e_1_3_1_37_2","unstructured":"Hugo Touvron Thibaut Lavril Gautier Izacard Xavier Martinet Marie-Anne Lachaux Timoth\u00e9e Lacroix Baptiste Rozi\u00e8re Naman Goyal Eric Hambro Faisal Azhar. 2023. Llama: Open and efficient foundation language models. arXiv:2302.13971. Retrieved from https:\/\/arxiv.org\/pdf\/2302.13971"},{"key":"e_1_3_1_38_2","volume-title":"Proceedings of International Conference on Learning Representations","author":"Wei Jason","year":"2021","unstructured":"Jason Wei, Maarten Bosma, Vincent Zhao, Kelvin Guu, Adams Wei Yu, Brian Lester, Nan Du, Andrew M. Dai, and Quoc V. Le. 2021. Finetuned language models are zero-shot learners. In Proceedings of International Conference on Learning Representations."},{"key":"e_1_3_1_39_2","doi-asserted-by":"crossref","first-page":"204","DOI":"10.18653\/v1\/W18-5623","volume-title":"Proceedings of the Ninth International Workshop on Health Text Mining and Information Analysis","author":"Zhang Yuhao","year":"2018","unstructured":"Yuhao Zhang, Daisy Yi Ding, Tianpei Qian, Christopher D. Manning, and Curtis P. Langlotz. 2018. Learning to summarize radiology findings. 
In Proceedings of the Ninth International Workshop on Health Text Mining and Information Analysis, 204\u2013213."},{"key":"e_1_3_1_40_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2020.acl-main.458"}],"container-title":["ACM Transactions on Computing for Healthcare"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3709154","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3709154","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,19]],"date-time":"2025-06-19T01:17:31Z","timestamp":1750295851000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3709154"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,4,24]]},"references-count":39,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2025,4,30]]}},"alternative-id":["10.1145\/3709154"],"URL":"https:\/\/doi.org\/10.1145\/3709154","relation":{},"ISSN":["2637-8051"],"issn-type":[{"value":"2637-8051","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,4,24]]},"assertion":[{"value":"2024-02-15","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2024-12-09","order":2,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2025-04-24","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}