{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,23]],"date-time":"2026-03-23T05:06:33Z","timestamp":1774242393398,"version":"3.50.1"},"reference-count":33,"publisher":"Frontiers Media SA","license":[{"start":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T00:00:00Z","timestamp":1750204800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":["frontiersin.org"],"crossmark-restriction":true},"short-container-title":["Front. Digit. Health"],"abstract":"<jats:p>Developments in Machine Learning based Conversational and Generative Artificial Intelligence (GenAI) have created opportunities for sophisticated Conversational Agents to augment elements of healthcare. While not a replacement for professional care, AI offers opportunities for scalability, cost effectiveness, and automation of many aspects of patient care. However, to realize these opportunities and deliver AI-enabled support safely, interactions between patients and AI must be continuously monitored and evaluated against an agreed upon set of performance criteria. This paper presents one such set of criteria which was developed to evaluate interactions with an AI Health Coach designed to support patients receiving obesity treatment and deployed with an active patient user base. The evaluation framework evolved through an iterative process of development, testing, refining, training, reviewing and supervision. The framework evaluates at both individual message and overall conversation level, rating interactions as Acceptable or Unacceptable in four domains: Fidelity, Accuracy, Safety, and Tone (FAST), with a series of questions to be considered with respect to each domain. Processes to ensure consistent evaluation quality were established and additional patient safety procedures were defined for escalations to healthcare providers based on clinical risk. The framework can be implemented by trained evaluators and offers a method by which healthcare settings deploying AI to support patients can review quality and safety, thus ensuring safe adoption.<\/jats:p>","DOI":"10.3389\/fdgth.2025.1460236","type":"journal-article","created":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T13:35:49Z","timestamp":1750253749000},"update-policy":"https:\/\/doi.org\/10.3389\/crossmark-policy","source":"Crossref","is-referenced-by-count":5,"title":["Think FAST: a novel framework to evaluate fidelity, accuracy, safety, and tone in conversational AI health coach dialogues"],"prefix":"10.3389","volume":"7","author":[{"given":"Martha","family":"Neary","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Emily","family":"Fulton","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Victoria","family":"Rogers","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Julia","family":"Wilson","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Zoe","family":"Griffiths","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Ram","family":"Chuttani","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Paul M.","family":"Sacher","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1965","published-online":{"date-parts":[[2025,6,18]]},"reference":[{"key":"B1","doi-asserted-by":"publisher","first-page":"559","DOI":"10.1037\/ccp0000848","article-title":"Clinical science and practice in the age of large language models and generative artificial intelligence","volume":"91","author":"Schueller","year":"2023","journal-title":"J Consult Clin Psychol"},{"key":"B2","doi-asserted-by":"publisher","first-page":"102274","DOI":"10.1016\/j.lindif.2023.102274","article-title":"ChatGPT for good? On opportunities and challenges of large language models for education","volume":"103","author":"Kasneci","year":"2023","journal-title":"Learn Individ Differ"},{"key":"B3","article-title":"ChatGPT-Like Large-Scale Foundation Models for Prognostics and Health Management: A Survey and Roadmaps","author":"Li","year":"2023"},{"key":"B4","article-title":"Conversational Agents: Theory and Applications","author":"Wahde","year":"2022"},{"key":"B5","doi-asserted-by":"publisher","first-page":"39","DOI":"10.1007\/s12525-023-00662-3","article-title":"Enhancing conversational agents for successful operation: a multi-perspective evaluation approach for continuous improvement","volume":"33","author":"Lewandowski","year":"2023","journal-title":"Electron Mark"},{"key":"B6","doi-asserted-by":"publisher","first-page":"47","DOI":"10.53841\/bpstcp.2023.19.1.47","article-title":"Can Chatbots replace human coaches? Issues and dilemmas for the coaching profession, coaching clients and for organisations","volume":"19","author":"Passmore","year":"2023","journal-title":"Coach Psychol"},{"key":"B7","doi-asserted-by":"publisher","first-page":"14","DOI":"10.3389\/fpsyg.2023.1148243","article-title":"Defining digital coaching: a qualitative inductive approach","volume":"14","author":"Diller","year":"2023","journal-title":"Front Psychol"},{"key":"B8","doi-asserted-by":"publisher","first-page":"12","DOI":"10.3389\/frai.2023.1229805","article-title":"A review of the explainability and safety of conversational agents for mental health to identify avenues for improvement","volume":"6","author":"Sarkar","year":"2023","journal-title":"Front Artif Intell"},{"key":"B9","article-title":"Artificial Intelligence and Patient Education: Examining the Accuracy and Reproducibility of Responses to Nutrition Questions Related to Inflammatory Bowel Disease by GPT-4","author":"Samaan","year":"2023"},{"key":"B10","doi-asserted-by":"publisher","first-page":"589","DOI":"10.1001\/jamainternmed.2023.1838","article-title":"Comparing physician and artificial intelligence chatbot responses to patient questions posted to a public social Media forum","volume":"183","author":"Ayers","year":"2023","journal-title":"JAMA Intern Med"},{"key":"B11","doi-asserted-by":"publisher","first-page":"118","DOI":"10.1038\/s41746-023-00856-1","article-title":"Systematic review and meta-analysis of the effectiveness of chatbots on lifestyle behaviours","volume":"6","author":"Singh","year":"2023","journal-title":"NPJ Digit Med"},{"key":"B12","doi-asserted-by":"publisher","first-page":"48126","DOI":"10.1109\/ACCESS.2024.3381611","article-title":"Privacy and security concerns in generative AI: a comprehensive survey","volume":"12","author":"Golda","year":"2024","journal-title":"IEEE Access"},{"key":"B13","doi-asserted-by":"publisher","first-page":"103","DOI":"10.60087\/jaigs.v3i1.119","article-title":"Examining ethical aspects of AI: addressing bias and equity in the discipline","volume":"3","author":"Shuford","year":"2024","journal-title":"J Artif Intell Gen Sci"},{"key":"B14","doi-asserted-by":"publisher","first-page":"33","DOI":"10.1007\/s10916-023-01925-4","article-title":"Evaluating the feasibility of ChatGPT in healthcare: an analysis of multiple clinical and research scenarios","volume":"47","author":"Cascella","year":"2023","journal-title":"J Med Syst"},{"key":"B15","doi-asserted-by":"publisher","DOI":"10.7759\/cureus.35179","article-title":"Artificial hallucinations in ChatGPT: implications in scientific writing","volume":"15","author":"Alkaissi","year":"2023","journal-title":"Cureus"},{"key":"B16","doi-asserted-by":"publisher","first-page":"e300884","DOI":"10.1136\/bmjment-2023-300884","article-title":"ChatGPT and mental healthcare: balancing benefits with risks of harms","volume":"26","author":"Blease","year":"2023","journal-title":"BMJ Ment Health"},{"key":"B17","first-page":"1","article-title":"What makes a good conversation? Challenges in designing truly conversational agents","author":"Clark","year":"2019"},{"key":"B18","doi-asserted-by":"crossref","first-page":"23","DOI":"10.1007\/1-4020-3933-6_2","article-title":"Social dialogue with embodied conversational agents","volume-title":"Advances in Natural Multimodal Dialogue Systems","author":"Bickmore","year":"2005"},{"key":"B19","article-title":"SuperGLUE: a stickier benchmark for general-purpose language understanding systems","author":"Wang","year":"2020"},{"key":"B20","article-title":"Defending Against Neural Fake News","author":"Zellers","year":"2020"},{"key":"B21","doi-asserted-by":"publisher","DOI":"10.1101\/2023.07.14.23292669","article-title":"Leveraging large language models for generating responses to patient messages","author":"Liu","year":"2023","journal-title":"MedRxiv Prepr Serv Health Sci"},{"key":"B22","doi-asserted-by":"crossref","DOI":"10.18653\/v1\/K19-1079","article-title":"Do massively pretrained language models make better storytellers?","author":"See","year":"2019"},{"key":"B23","doi-asserted-by":"publisher","first-page":"172","DOI":"10.1038\/s41586-023-06291-2","article-title":"Large language models encode clinical knowledge","volume":"620","author":"Singhal","year":"2023","journal-title":"Nature"},{"key":"B24","article-title":"Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback","author":"Bai","year":""},{"key":"B25","doi-asserted-by":"crossref","first-page":"77","DOI":"10.18653\/v1\/2022.nlp4convai-1.8","article-title":"Human evaluation of conversations is an open problem: comparing the sensitivity of various methods for evaluating dialogue agents","author":"Smith","year":"2022","journal-title":"Proceedings of the 4th Workshop on NLP for Conversational AI"},{"key":"B26","doi-asserted-by":"publisher","first-page":"81","DOI":"10.1007\/s12160-013-9486-6","article-title":"The behavior change technique taxonomy (v1) of 93 hierarchically clustered techniques: building an international consensus for the reporting of behavior change interventions","volume":"46","author":"Michie","year":"2013","journal-title":"Ann Behav Med Publ Soc Behav Med"},{"key":"B27","doi-asserted-by":"crossref","first-page":"630","DOI":"10.4324\/9781315820217","volume-title":"Handbook of Coaching Psychology: A Guide for Practitioners","author":"Palmer","year":"2018"},{"key":"B28","doi-asserted-by":"publisher","first-page":"1225","DOI":"10.1093\/jamia\/ocad091","article-title":"AI In health: keeping the human in the loop","volume":"30","author":"Bakken","year":"2023","journal-title":"J Am Med Inform Assoc"},{"key":"B29","doi-asserted-by":"publisher","first-page":"1227","DOI":"10.1093\/jamia\/ocad065","article-title":"More than algorithms: an analysis of safety events involving ML-enabled medical devices reported to the FDA","volume":"30","author":"Lyell","year":"2023","journal-title":"J Am Med Inform Assoc JAMIA"},{"key":"B30","doi-asserted-by":"publisher","first-page":"1237","DOI":"10.1093\/jamia\/ocad072","article-title":"Using AI-generated suggestions from ChatGPT to optimize clinical decision support","volume":"30","author":"Liu","year":"2023","journal-title":"J Am Med Inform Assoc JAMIA"},{"key":"B31","doi-asserted-by":"publisher","first-page":"e38740","DOI":"10.2196\/38740","article-title":"Designing, developing, evaluating, and implementing a smartphone-delivered, rule-based conversational agent (DISCOVER): development of a conceptual framework","volume":"10","author":"Dhinagaran","year":"2022","journal-title":"JMIR MHealth UHealth"},{"key":"B32","doi-asserted-by":"publisher","first-page":"1061","DOI":"10.3390\/healthcare11081061","article-title":"Framework for guiding the development of high-quality conversational agents in healthcare","volume":"11","author":"Denecke","year":"2023","journal-title":"Healthcare (Basel)"},{"key":"B33","doi-asserted-by":"publisher","DOI":"10.1177\/20552076211053690","article-title":"A process for reviewing mental health apps: using the one mind PsyberGuide credibility rating system","volume":"7","author":"Neary","year":"2021","journal-title":"Digit Health"}],"container-title":["Frontiers in Digital Health"],"original-title":[],"link":[{"URL":"https:\/\/www.frontiersin.org\/articles\/10.3389\/fdgth.2025.1460236\/full","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T13:35:51Z","timestamp":1750253751000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.frontiersin.org\/articles\/10.3389\/fdgth.2025.1460236\/full"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,6,18]]},"references-count":33,"alternative-id":["10.3389\/fdgth.2025.1460236"],"URL":"https:\/\/doi.org\/10.3389\/fdgth.2025.1460236","relation":{},"ISSN":["2673-253X"],"issn-type":[{"value":"2673-253X","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,6,18]]},"article-number":"1460236"}}