{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,24]],"date-time":"2026-03-24T15:34:27Z","timestamp":1774366467568,"version":"3.50.1"},"reference-count":87,"publisher":"Frontiers Media SA","license":[{"start":{"date-parts":[[2025,9,11]],"date-time":"2025-09-11T00:00:00Z","timestamp":1757548800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":["frontiersin.org"],"crossmark-restriction":true},"short-container-title":["Front. Digit. Health"],"abstract":"<jats:p>The rapid integration of large language models (LLMs) into healthcare offers significant potential for improving diagnosis, treatment planning, and patient engagement. However, it also presents serious ethical challenges that remain incompletely addressed. In this review, we analyzed 27 peer-reviewed studies published between 2017 and 2025 across four major open-access databases using strict eligibility criteria, robust synthesis methods, and established guidelines to explicitly examine the ethical aspects of deploying LLMs in clinical settings. We explore four key aspects, including the main ethical issues arising from the use of LLMs in healthcare, the prevalent model architectures employed in ethical analyses, the healthcare application domains that are most frequently scrutinized, and the publication and bibliographic patterns characterizing this literature. Our synthesis reveals that bias and fairness (<jats:inline-formula><mml:math xmlns:mml=\"http:\/\/www.w3.org\/1998\/Math\/MathML\" id=\"IM1\"><mml:mi>n<\/mml:mi><mml:mo>=<\/mml:mo><mml:mn>7<\/mml:mn><\/mml:math><\/jats:inline-formula>, 25.9%) are the most frequently discussed concerns, followed by safety, reliability, transparency, accountability, and privacy, and that the GPT family predominates (<jats:inline-formula><mml:math xmlns:mml=\"http:\/\/www.w3.org\/1998\/Math\/MathML\" id=\"IM2\"><mml:mi>n<\/mml:mi><mml:mo>=<\/mml:mo><mml:mn>14<\/mml:mn><\/mml:math><\/jats:inline-formula>, 51.8%) among examined models. While privacy protection and bias mitigation received notable attention in the literature, no existing review has systematically addressed the comprehensive ethical issues surrounding LLMs. Most previous studies focus narrowly on specific clinical subdomains and lack a comprehensive methodology. As a systematic mapping of open-access literature, this synthesis identifies dominant ethical patterns, but it is not exhaustive of all ethical work on LLMs in healthcare. We also synthesize identified challenges, outline future research directions and include a provisional ethical integration framework to guide clinicians, developers, and policymakers in the responsible integration of LLMs into clinical workflows.<\/jats:p>","DOI":"10.3389\/fdgth.2025.1653631","type":"journal-article","created":{"date-parts":[[2025,9,11]],"date-time":"2025-09-11T05:24:58Z","timestamp":1757568298000},"update-policy":"https:\/\/doi.org\/10.3389\/crossmark-policy","source":"Crossref","is-referenced-by-count":9,"title":["A systematic review of ethical considerations of large language models in healthcare and medicine"],"prefix":"10.3389","volume":"7","author":[{"given":"Muhammad","family":"Fareed","sequence":"first","affiliation":[]},{"given":"Madeeha","family":"Fatima","sequence":"additional","affiliation":[]},{"given":"Jamal","family":"Uddin","sequence":"additional","affiliation":[]},{"given":"Adeel","family":"Ahmed","sequence":"additional","affiliation":[]},{"given":"Muhammad Awais","family":"Sattar","sequence":"additional","affiliation":[]}],"member":"1965","published-online":{"date-parts":[[2025,9,11]]},"reference":[{"key":"B1","doi-asserted-by":"publisher","first-page":"S36","DOI":"10.1016\/j.metabol.2017.01.011","article-title":"Artificial intelligence in medicine","volume":"69","author":"Hamet","year":"2017","journal-title":"Metabolism"},{"key":"B2","doi-asserted-by":"publisher","first-page":"13","DOI":"10.1609\/aimag.v38i4.2744","article-title":"A standard model of the mind: toward a common computational framework across artificial intelligence, cognitive science, neuroscience, and robotics","volume":"38","author":"Laird","year":"2017","journal-title":"AI Mag"},{"key":"B3","doi-asserted-by":"publisher","first-page":"100017","DOI":"10.1016\/j.jacadv.2022.100017","article-title":"Deep learning in medicine","volume":"1","author":"Toma","year":"2022","journal-title":"JACC Adv"},{"key":"B4","doi-asserted-by":"publisher","first-page":"214","DOI":"10.1186\/s12911-024-02600-5","article-title":"Transformer models in biomedicine","volume":"24","author":"Madan","year":"2024","journal-title":"BMC Med Inform Decis Mak"},{"key":"B5","doi-asserted-by":"publisher","first-page":"101081","DOI":"10.1016\/j.jpha.2024.101081","article-title":"A review of transformers in drug discovery and beyond","volume":"15","author":"Jiang","year":"2024","journal-title":"J Pharm Anal"},{"key":"B6","doi-asserted-by":"publisher","first-page":"103003","DOI":"10.1016\/j.artmed.2024.103003","article-title":"From pre-training to fine-tuning: an in-depth analysis of large language models in the biomedical domain","volume":"157","author":"Bonfigli","year":"2024","journal-title":"Artif Intell Med"},{"key":"B7","doi-asserted-by":"crossref","DOI":"10.1145\/3706598.3714287","article-title":"Reimagining support: exploring autistic individuals\u2019 visions for AI in coping with negative self-talk","author":"Carik","year":""},{"key":"B8","doi-asserted-by":"publisher","first-page":"e38056","DOI":"10.1016\/j.heliyon.2024.e38056","article-title":"Embedded values-like shape ethical reasoning of large language models on primary care ethical dilemmas","volume":"10","author":"Hadar-Shoval","year":"2024","journal-title":"Heliyon"},{"key":"B9","doi-asserted-by":"publisher","DOI":"10.2196\/67891","article-title":"Competency of large language models in evaluating appropriate responses to suicidal ideation: comparative study","volume":"27","author":"McBain","year":"2025","journal-title":"J Med Internet Res"},{"key":"B10","doi-asserted-by":"publisher","DOI":"10.1038\/s41746-023-00939-z","article-title":"Large language models propagate race-based medicine","volume":"6","author":"Omiye","year":"2023","journal-title":"npj Digit Med"},{"key":"B11","doi-asserted-by":"publisher","first-page":"e60063","DOI":"10.2196\/60063","article-title":"EyeGPT for patient inquiries and medical education: development and validation of an ophthalmology large language model","volume":"26","author":"Chen","year":"2024","journal-title":"J Med Internet Res"},{"key":"B12","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1145\/3682068","article-title":"Differentially private low-rank adaptation of large language model using federated learning","volume":"16","author":"Liu","year":"2024","journal-title":"ACM Trans Manag Inf Syst"},{"key":"B13","doi-asserted-by":"crossref","DOI":"10.1145\/3589334.3648137","article-title":"MentaLLaMA: interpretable mental health analysis on social media with large language models","author":"Yang","year":""},{"key":"B14","doi-asserted-by":"crossref","DOI":"10.1145\/3675888.3676147","article-title":"Responsible software systems for disease diagnostics using symptom text","author":"Marvin","year":""},{"key":"B15","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1145\/3524887","article-title":"Interpretable bias mitigation for textual data: reducing genderization in patient notes while maintaining classification performance","volume":"3","author":"Minot","year":"2022","journal-title":"ACM Trans Comput Healthc"},{"key":"B16","doi-asserted-by":"crossref","DOI":"10.1145\/3599696.3612902","article-title":"Leveraging socio-contextual information in bert for fake health news detection in social media","author":"Upadhyay","year":""},{"key":"B17","doi-asserted-by":"crossref","DOI":"10.1145\/3368555.3384448","article-title":"Hurtful words: quantifying biases in clinical contextual word embeddings","author":"Zhang","year":""},{"key":"B18","doi-asserted-by":"crossref","DOI":"10.1145\/3630106.3658982","article-title":"NLP for maternal healthcare: perspectives and guiding principles in the age of LLMs","author":"Antoniak","year":""},{"key":"B19","article-title":"Selective trust: understanding human-AI partnerships in personal health decision-making process","author":"Arum","year":""},{"key":"B20","doi-asserted-by":"crossref","DOI":"10.1145\/3706598.3713485","article-title":"Private yet social: how LLM chatbots support and challenge eating disorder recovery","author":"Choi","year":""},{"key":"B21","doi-asserted-by":"crossref","DOI":"10.1145\/3675094.3679000","article-title":"Using large language models to compare explainable models for smart home human activity recognition","author":"Fiori","year":""},{"key":"B22","doi-asserted-by":"crossref","DOI":"10.1145\/3706598.3714113","article-title":"Veriplan: integrating formal verification and LLMs into end-user planning","author":"Lee","year":""},{"key":"B23","doi-asserted-by":"crossref","DOI":"10.1145\/3582515.3609536","article-title":"Data decentralisation of LLM-based chatbot systems in chronic disease self-management","author":"Montagna","year":""},{"key":"B24","doi-asserted-by":"publisher","DOI":"10.1145\/3718095","article-title":"Comparative evaluation of GPT models in FHIR proficiency","author":"Pope","year":"2025","journal-title":"ACM Trans Intell Syst Technol"},{"key":"B25","article-title":"Data from: Ethical considerations of using ChatGPT in health care","author":"Wang","year":"2023"},{"key":"B26","doi-asserted-by":"publisher","DOI":"10.1186\/s12873-024-01159-8","article-title":"AI-assisted decision-making in mild traumatic brain injury","volume":"25","author":"Yigit","year":"2025","journal-title":"BMC Emerg Med"},{"key":"B27","doi-asserted-by":"publisher","first-page":"e12","DOI":"10.1016\/S2589-7500(23)00225-X","article-title":"Assessing the potential of GPT-4 to perpetuate racial and gender biases in health care: a model evaluation study","volume":"6","author":"Zack","year":"2024","journal-title":"Lancet Digit Health"},{"key":"B28","doi-asserted-by":"publisher","first-page":"102178","DOI":"10.1016\/j.jksuci.2024.102178","article-title":"Deepextract: semantic-driven extractive text summarization framework using LLMs and hierarchical positional encoding","volume":"36","author":"Onan","year":"2024","journal-title":"J King Saud Univ Comput Inf Sci"},{"key":"B29","doi-asserted-by":"publisher","first-page":"1391","DOI":"10.1186\/s12909-024-06399-7","article-title":"Large language models improve clinical decision making of medical students through patient simulation and structured feedback: a randomized controlled trial","volume":"24","author":"Br\u00fcgge","year":"2024","journal-title":"BMC Med Educ"},{"key":"B30","doi-asserted-by":"publisher","first-page":"314","DOI":"10.1053\/j.semvascsurg.2024.06.001","article-title":"Large language models and artificial intelligence chatbots in vascular surgery","volume":"37","author":"Lareyre","year":"2024","journal-title":"Semin Vasc Surg"},{"key":"B31","article-title":"Data from: Using ChatGPT in a clinical setting: a case report","author":"Ye","year":""},{"key":"B32","doi-asserted-by":"crossref","DOI":"10.1145\/3633624.3633629","article-title":"Xai for medicine by ChatGPT code interpreter","author":"Kitamura","year":""},{"key":"B33","article-title":"Data from: A systematic review of large language model (LLM) evaluations in clinical medicine","author":"Shool","year":""},{"key":"B34","article-title":"Data from: Evaluating and addressing demographic disparities in medical large language models: a systematic review","author":"Omar","year":""},{"key":"B35","article-title":"Data from: The metric-framework for assessing data quality for trustworthy AI in medicine: a systematic review","author":"Schwabe","year":""},{"key":"B36","article-title":"Data from: A systematic review of ChatGPT and other conversational large language models in healthcare","author":"Wang","year":""},{"key":"B37","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1145\/3641289","article-title":"A survey on evaluation of large language models","volume":"15","author":"Chang","year":"2024","journal-title":"ACM Trans Intell Syst Technol"},{"key":"B38","article-title":"Data from: The ethics of ChatGPT in medicine and healthcare: a systematic review on large language models (LLMs) (2024)","author":"Haltaufderheide","year":""},{"key":"B39","article-title":"Data from: AI and ethics: a systematic review of the ethical considerations of large language model use in surgery research","author":"Pressman","year":""},{"key":"B40","article-title":"Data from: Opportunities and challenges for large language models in primary health care","author":"Qin","year":""},{"key":"B41","article-title":"Data from: Challenges and barriers of using large language models (LLM) such as ChatGPT for diagnostic medicine with a focus on digital pathology\u2014a recent scoping review","author":"Ullah","year":""},{"key":"B42","doi-asserted-by":"publisher","first-page":"n71","DOI":"10.1136\/bmj.n71","article-title":"The PRISMA 2020 statement: an updated guideline for reporting systematic reviews","volume":"372","author":"Page","year":"2021","journal-title":"BMJ"},{"key":"B43","article-title":"Guidelines for performing systematic literature reviews in software engineering Keele University and Durham University Joint Report","author":"Kitchenham","year":""},{"key":"B44","doi-asserted-by":"publisher","first-page":"100868","DOI":"10.1016\/j.bj.2025.100868","article-title":"Roles and potential of large language models in healthcare: a comprehensive review","author":"Lin","year":"2025","journal-title":"Biomed J"},{"key":"B45","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.1706.03762","article-title":"Attention is all you need","author":"Vaswani","year":"2017"},{"key":"B46","article-title":"Bert: pre-training of deep bidirectional transformers for language understanding","author":"Devlin","year":""},{"key":"B47","doi-asserted-by":"publisher","first-page":"1234","DOI":"10.1093\/bioinformatics\/btz682","article-title":"BioBERT: a pre-trained biomedical language representation model for biomedical text mining","volume":"36","author":"Lee","year":"2020","journal-title":"Bioinformatics"},{"key":"B48","article-title":"Clinicalbert: modeling clinical notes and predicting hospital readmission. arXiv [Preprint]","author":"Huang","year":""},{"key":"B49","volume-title":"GPT-3","author":"Kublik","year":"2022"},{"key":"B50","article-title":"LLaMA: open and efficient foundation language models","author":"Touvron","year":""},{"key":"B51","article-title":"GPT-4 technical report","author":"Achiam","year":""},{"key":"B52","article-title":"SkinGPT-4: an interactive dermatology diagnostic system with visual large language model","author":"Zhou","year":""},{"key":"B53","doi-asserted-by":"publisher","first-page":"AIoa2300138","DOI":"10.1056\/AIoa2300138","article-title":"Towards generalist biomedical AI","volume":"1","author":"Tu","year":"2024","journal-title":"Nejm AI"},{"key":"B54","article-title":"Meditron-70b: Scaling medical pretraining for large language models","author":"Chen","year":""},{"key":"B55","doi-asserted-by":"crossref","DOI":"10.1109\/CSCWD61410.2024.10580641","article-title":"Psychat: a client-centric dialogue system for mental health support","author":"Qiu","year":""},{"key":"B56","doi-asserted-by":"publisher","first-page":"17","DOI":"10.58496\/BJML\/2023\/003","article-title":"ChatGPT4, DALL\u00b7E, bard, claude, BERT: open possibilities","volume":"2023","author":"Ali","year":"2023","journal-title":"Babylon J Mach Learn"},{"key":"B57","doi-asserted-by":"publisher","first-page":"642","DOI":"10.1038\/s41433-023-02760-0","article-title":"Google\u2019s AI chatbot \u201cbard\u201d: a side-by-side comparison with ChatGPT and its utilization in ophthalmology","volume":"38","author":"Waisberg","year":"2024","journal-title":"Eye"},{"key":"B58","doi-asserted-by":"publisher","first-page":"733","DOI":"10.26502\/fjhs.244","article-title":"A Chatbot that learns one\u2019s preferences as the next step in human digital twins: A pilot study using HyperCLOVA X, a large language model","volume":"7","author":"Yun","year":"2024","journal-title":"Fortune J Health Sci"},{"key":"B59","article-title":"Intervening anchor token: decoding strategy in alleviating hallucinations for MLLMs","author":"Tang","year":""},{"key":"B60","article-title":"GPT-4o system card","author":"Hurst","year":""},{"key":"B61","article-title":"Capabilities of gemini models in medicine","author":"Saab","year":""},{"key":"B62","article-title":"The LLaMA 3 herd of models","author":"Grattafiori","year":""},{"key":"B63","doi-asserted-by":"publisher","first-page":"e59617","DOI":"10.2196\/59617","article-title":"Viability of open large language models for clinical documentation in German health care: Real-world model evaluation study","volume":"12","author":"Heilmeyer","year":"2024","journal-title":"JMIR Med Inform"},{"key":"B64","doi-asserted-by":"publisher","first-page":"189","DOI":"10.1016\/j.aopr.2025.05.001","article-title":"Deepseek-R1 outperforms Gemini 2.0 pro, OpenAI o1, and o3-mini in bilingual complex ophthalmology reasoning","volume":"5","author":"Xu","year":"2025","journal-title":"Adv Ophthalmol Pract Res"},{"key":"B65","article-title":"Code-driven planning in grid worlds with large language models","author":"Aravindan","year":""},{"key":"B66","article-title":"Deepseek-r1: Incentivizing reasoning capability in LLMs via reinforcement learning","author":"Guo","year":""},{"key":"B67","doi-asserted-by":"crossref","DOI":"10.1145\/3613904.3642420","article-title":"Understanding the impact of long-term memory on self-disclosure with large language model-driven chatbots for public health intervention","author":"Jo","year":""},{"key":"B68","article-title":"Data from: Personal information protection act","year":""},{"key":"B69","article-title":"Medical Devices Act, Republic of Korea (Statute). Data from: Medical devices act. Statutes of the Republic of Korea","year":""},{"key":"B70","doi-asserted-by":"publisher","first-page":"1433","DOI":"10.1007\/s43681-024-00451-4","article-title":"Safeguarding human values: rethinking US law for generative AI\u2019s societal impacts","volume":"5","author":"Cheong","year":"2024","journal-title":"AI Ethics"},{"key":"B71","article-title":"Data from: Harvesting the power of artificial intelligence for surgery: uses, implications, and ethical considerations (2023)","author":"Kavian","year":""},{"key":"B72","doi-asserted-by":"publisher","first-page":"1166120","DOI":"10.3389\/fpubh.2023.1166120","article-title":"ChatGPT and the rise of large language models: the new AI-driven infodemic threat in public health","volume":"11","author":"Angelis","year":"2023","journal-title":"Front Public Health"},{"key":"B73","article-title":"Health Insurance Portability and Accountability Act of 1996, U.S. Congress (Public Law No. 104-191)","year":""},{"key":"B74","article-title":"Regulation (EU) 2016\/679, European Parliament and Council (General Data Protection Regulation)","year":""},{"key":"B75","doi-asserted-by":"publisher","first-page":"AIra2400038","DOI":"10.1056\/AIra2400038","article-title":"Medical ethics of large language models in medicine","volume":"1","author":"Ong","year":"2024","journal-title":"NEJM AI"},{"key":"B76","article-title":"Ethics and Governance of Artificial Intelligence for Health (2021)","year":""},{"key":"B77","article-title":"Recommendation of the Council on Artificial Intelligence (2019)","year":""},{"key":"B78","doi-asserted-by":"crossref","DOI":"10.1093\/acprof:oso\/9780199641321.001.0001","volume-title":"The Ethics of Information","author":"Floridi","year":"2013"},{"key":"B79","volume-title":"Principles of Biomedical Ethics","author":"Beauchamp","year":"1979"},{"key":"B80","doi-asserted-by":"publisher","first-page":"856","DOI":"10.1093\/clinchem\/hvac048","article-title":"Partial postponement of the application of the in vitro diagnostic medical devices regulation in the European Union","volume":"68","author":"Vogeser","year":"2022","journal-title":"Clin Chem"},{"key":"B81","doi-asserted-by":"publisher","first-page":"55","DOI":"10.1007\/s44200-022-00014-0","article-title":"Medical device regulation: should we care about it?","volume":"28","author":"Bianchini","year":"2022","journal-title":"Artery Res"},{"key":"B82","article-title":"Data from: Evaluating of bert-based and large language mod for suicide detection, prevention, and risk assessment: a systematic review","author":"Levkovich","year":""},{"key":"B83","doi-asserted-by":"crossref","DOI":"10.1145\/3712001","article-title":"Security and privacy challenges of large language models: a survey","author":"Das","year":""},{"key":"B84","article-title":"Data from: Ethical and regulatory challenges of large language models in medicine (2024)","author":"Ong","year":""},{"key":"B85","article-title":"Data from: Language model and its interpretability in biomedicine: a scoping review","author":"Lyu","year":""},{"key":"B86","article-title":"Data from: Large language models and generative AI in telehealth: a responsible use lens (2024)","author":"Pool","year":""},{"key":"B87","doi-asserted-by":"crossref","DOI":"10.1109\/BIBM62325.2024.10822376","article-title":"Exploring the ethical challenges of large language models in emergency medicine: a comparative international review","author":"Elbattah","year":""}],"container-title":["Frontiers in Digital Health"],"original-title":[],"link":[{"URL":"https:\/\/www.frontiersin.org\/articles\/10.3389\/fdgth.2025.1653631\/full","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,9,11]],"date-time":"2025-09-11T05:25:14Z","timestamp":1757568314000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.frontiersin.org\/articles\/10.3389\/fdgth.2025.1653631\/full"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,9,11]]},"references-count":87,"alternative-id":["10.3389\/fdgth.2025.1653631"],"URL":"https:\/\/doi.org\/10.3389\/fdgth.2025.1653631","relation":{},"ISSN":["2673-253X"],"issn-type":[{"value":"2673-253X","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,9,11]]},"article-number":"1653631"}}