{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,9,11]],"date-time":"2025-09-11T19:38:24Z","timestamp":1757619504777,"version":"3.44.0"},"reference-count":39,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2025,7,24]],"date-time":"2025-07-24T00:00:00Z","timestamp":1753315200000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2025,7,24]],"date-time":"2025-07-24T00:00:00Z","timestamp":1753315200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/501100003252","name":"Lund University","doi-asserted-by":"crossref","id":[{"id":"10.13039\/501100003252","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Discov Artif Intell"],"abstract":"<jats:title>Abstract<\/jats:title>\n          <jats:sec>\n            <jats:title>Background<\/jats:title>\n            <jats:p>Regular monitoring of healthcare quality and equity is crucial for informing decision-makers and clinicians. This study explores the application of generative AI, more specifically large language models (LLMs), to facilitate standardized monitoring of healthcare quality using the established framework Analysis of Individual Heterogeneity and Discriminatory Accuracy (AIHDA). The study investigates whether a customized GPT can effectively apply the AIHDA-framework to assess healthcare quality in a simulated dataset.<\/jats:p>\n          <\/jats:sec>\n          <jats:sec>\n            <jats:title>Population and methods<\/jats:title>\n            <jats:p>Using simulated data modelled on real-world healthcare information, we evaluated the quality indicator of potentially inappropriate medication (PIM). A customized GPT built on ChatGPT 4o was prompted via the principle TREF (Task, Requirement, Expectation, Format) to perform the analysis. Results were compared to a traditional analysis performed with Stata to evaluate accuracy and reliability.<\/jats:p>\n          <\/jats:sec>\n          <jats:sec>\n            <jats:title>Results<\/jats:title>\n            <jats:p>The GPT successfully conducted the AIHDA analysis, producing results equal to those of the Stata analysis. The GPT provides useful visualizations and structured reports as well as interactive dialog with the end-user in real-time. However, occasional variations in the results occurred in some iterations of the analysis, highlighting potential issues with reliability. The analysis requires close supervision, as the GPT presents both errors and correct results with confidence.<\/jats:p>\n          <\/jats:sec>\n          <jats:sec>\n            <jats:title>Conclusions<\/jats:title>\n            <jats:p>Generative AI and LLMs show promise in supporting standardized monitoring of healthcare quality and equity using the AIHDA-framework. It enables accessible analysis but requires oversight to address limitations such as occasional inaccuracies. Future and more reliable models of LLMs and local deployment on secure servers may further enhance the utility for routine healthcare monitoring.<\/jats:p>\n          <\/jats:sec>","DOI":"10.1007\/s44163-025-00444-0","type":"journal-article","created":{"date-parts":[[2025,7,24]],"date-time":"2025-07-24T15:41:55Z","timestamp":1753371715000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":0,"title":["Evaluating healthcare quality and inequities using generative AI: a simulation study of potentially inappropriate medication among older adults analyzed via the framework analysis of individual heterogeneity and discriminatory accuracy (AIHDA)"],"prefix":"10.1007","volume":"5","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-0701-5155","authenticated-orcid":false,"given":"Johan","family":"\u00d6berg","sequence":"first","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0002-6273-1656","authenticated-orcid":false,"given":"Raquel","family":"Perez-Vicente","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0002-1969-5119","authenticated-orcid":false,"given":"Martin","family":"Lindstr\u00f6m","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0002-5871-8731","authenticated-orcid":false,"given":"Patrik","family":"Midl\u00f6v","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0001-8379-9708","authenticated-orcid":false,"given":"Juan","family":"Merlo","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2025,7,24]]},"reference":[{"issue":"9","key":"444_CR1","doi-asserted-by":"publisher","DOI":"10.1136\/bmjopen-2022-063117","volume":"13","author":"J Merlo","year":"2023","unstructured":"Merlo J, Oberg J, Khalaf K, Perez-Vicente R, Leckie G. Geographical and sociodemographic differences in statin dispensation after acute myocardial infarction in Sweden: a register-based prospective cohort study applying analysis of individual heterogeneity and discriminatory accuracy (AIHDA) for basic comparisons of healthcare quality. BMJ Open. 2023;13(9): e063117.","journal-title":"BMJ Open"},{"issue":"2","key":"444_CR2","doi-asserted-by":"publisher","first-page":"288","DOI":"10.1177\/14034948221075410","volume":"51","author":"M Wemrell","year":"2023","unstructured":"Wemrell M, Vicente RP, Merlo J. Mapping sociodemographic and geographical differences in human papillomavirus non-vaccination among young girls in Sweden. Scand J Public Health. 2023;51(2):288\u201395.","journal-title":"Scand J Public Health"},{"issue":"32\u201333","key":"444_CR3","first-page":"1828","volume":"107","author":"H Sorman","year":"2010","unstructured":"Sorman H. Open comparisons of health care: One of the most important contributions to modernization of health care. Lakartidningen. 2010;107(32\u201333):1828\u20139.","journal-title":"Lakartidningen"},{"issue":"2","key":"444_CR4","doi-asserted-by":"publisher","DOI":"10.1136\/bmjopen-2020-042323","volume":"11","author":"S Axelsson Fisk","year":"2021","unstructured":"Axelsson Fisk S, Lindstrom M, Perez-Vicente R, Merlo J. Understanding the complexity of socioeconomic disparities in smoking prevalence in Sweden: a cross-sectional study applying intersectionality theory. BMJ Open. 2021;11(2): e042323.","journal-title":"BMJ Open"},{"key":"444_CR5","doi-asserted-by":"publisher","first-page":"684","DOI":"10.1016\/j.ssmph.2017.08.005","volume":"3","author":"J Merlo","year":"2017","unstructured":"Merlo J, Mulinari S, Wemrell M, Subramanian SV, Hedblad B. The tyranny of the averages and the indiscriminate use of risk factors in public health: the case of coronary heart disease. SSM - Popul Health. 2017;3:684\u201398.","journal-title":"SSM - Popul Health"},{"key":"444_CR6","doi-asserted-by":"publisher","DOI":"10.1016\/j.healthplace.2019.102145","volume":"58","author":"J Merlo","year":"2019","unstructured":"Merlo J, Wagner P, Leckie G. A simple multilevel approach for analysing geographical inequalities in public health reports: the case of municipality differences in obesity. Health Place. 2019;58: 102145.","journal-title":"Health Place"},{"key":"444_CR7","unstructured":"Minaee S, Mikolov T, Nikzad N, Chenaghlu M, Socher R, Amatriain X, et al. Large language models: A survey. arXiv preprint arXiv:240206196. 2024."},{"key":"444_CR8","doi-asserted-by":"crossref","unstructured":"Dhurandhar A, Nair R, Singh M, Daly E, Ramamurthy KN. Ranking Large Language Models without Ground Truth. arXiv preprint arXiv:240214860. 2024.","DOI":"10.18653\/v1\/2024.findings-acl.143"},{"key":"444_CR9","doi-asserted-by":"publisher","DOI":"10.2196\/22769","volume":"26","author":"L Wang","year":"2024","unstructured":"Wang L, Wan Z, Ni C, Song Q, Li Y, Clayton E, et al. Applications and concerns of ChatGPT and other conversational large language models in health care: systematic review. J Med Internet Res. 2024;26: 1\u201324.","journal-title":"J Med Internet Res"},{"issue":"9","key":"444_CR10","doi-asserted-by":"publisher","first-page":"992","DOI":"10.1016\/j.ajic.2024.03.016","volume":"52","author":"M Omar","year":"2024","unstructured":"Omar M, Brin D, Glicksberg B, Klang E. Utilizing natural language processing and large language models in the diagnosis and prediction of infectious diseases: a systematic review. Am J Infect Control. 2024;52(9):992\u20131001.","journal-title":"Am J Infect Control"},{"key":"444_CR11","unstructured":"Du H, Zhao J, Zhao Y, Xu S, Lin X, Chen Y, et al. Advancing Real-time Pandemic Forecasting Using Large Language Models: A COVID-19 Case Study. arXiv preprint arXiv:240406962. 2024."},{"key":"444_CR12","doi-asserted-by":"crossref","unstructured":"\u00d6berg J, Khalaf K, Perez Vicente R, Johnell K, Fastbom J. Geographic and socioeconomic differences in potentially inappropriate medication among older adults\u2013 Applying a simplified analysis of individual heterogeneity and discriminatory accuracy (AIHDA) for basic comparisons of healthcare quality. BMC Health Serv Res. 2024(Under peer-review 2025-07-22).","DOI":"10.1186\/s12913-025-13335-y"},{"key":"444_CR13","unstructured":"Society AG. Beers Criteria\u00ae for Potentially Inappropriate Medication Use in Older Adults. 2019."},{"issue":"7","key":"444_CR14","doi-asserted-by":"publisher","first-page":"861","DOI":"10.1007\/s00228-015-1860-9","volume":"71","author":"A Renom-Guiteras","year":"2015","unstructured":"Renom-Guiteras A, Meyer G, Th\u00fcrmann PA. The EU(7)-PIM list: a list of potentially inappropriate medications for older people consented by experts from seven European countries. Eur J Clin Pharmacol. 2015;71(7):861\u201375.","journal-title":"Eur J Clin Pharmacol"},{"issue":"3","key":"444_CR15","doi-asserted-by":"publisher","first-page":"245","DOI":"10.1007\/s40266-022-00922-5","volume":"39","author":"F Pazan","year":"2022","unstructured":"Pazan F, Weiss C, Wehling M, Bauer JM, Berthold HK, Denkinger M, et al. The FORTA (fit for the aged) list 2021: fourth version of a validated clinical aid for improved pharmacotherapy in older adults. Drugs Aging. 2022;39(3):245\u20137.","journal-title":"Drugs Aging"},{"issue":"31\u201332","key":"444_CR16","first-page":"543","volume":"107","author":"S Holt","year":"2010","unstructured":"Holt S, Schmiedl S, Th\u00fcrmann PA. Potentially inappropriate medications in the elderly: the PRISCUS list. Dtsch Arztebl Int. 2010;107(31\u201332):543\u201351.","journal-title":"Dtsch Arztebl Int"},{"issue":"1","key":"444_CR17","doi-asserted-by":"publisher","DOI":"10.1016\/j.xcrm.2023.101356","volume":"5","author":"JCL Ong","year":"2024","unstructured":"Ong JCL, Seng BJJ, Law JZF, Low LL, Kwa ALH, Giacomini KM, et al. Artificial intelligence, ChatGPT, and other large language models for social determinants of health: current state and future directions. Cell Rep Med. 2024;5(1): 101356.","journal-title":"Cell Rep Med"},{"issue":"12","key":"444_CR18","doi-asserted-by":"publisher","first-page":"3590","DOI":"10.1038\/s41591-024-03258-2","volume":"30","author":"SR Pfohl","year":"2024","unstructured":"Pfohl SR, Cole-Lewis H, Sayres R, Neal D, Asiedu M, Dieng A, et al. A toolbox for surfacing health equity harms and biases in large language models. Nat Med. 2024;30(12):3590\u2013600.","journal-title":"Nat Med"},{"key":"444_CR19","unstructured":"Welfare SNBoHa. Indikatorer f\u00f6r god l\u00e4kemedelsterapi hos \u00e4ldre. National board of health and welfare; 2017."},{"key":"444_CR20","doi-asserted-by":"publisher","unstructured":"\u00d6berg J. Simulated dataset for analysis via Large language models, via the analytical framework Analysis of Individual Heterogeneity and Discriminatory Accuracy (AIHDA). https:\/\/doi.org\/10.6084\/m9.figshare.28560710.v1: figshare; 2025.","DOI":"10.6084\/m9.figshare.28560710.v1"},{"issue":"1","key":"444_CR21","doi-asserted-by":"publisher","first-page":"47","DOI":"10.1007\/s11528-023-00896-0","volume":"68","author":"W Cain","year":"2024","unstructured":"Cain W. Prompting change: exploring prompt engineering in large language model AI and its potential to transform education. TechTrends. 2024;68(1):47\u201357.","journal-title":"TechTrends"},{"key":"444_CR22","unstructured":"\u00d6berg J. AIHDA-GPT chatgpt.com2025 [Available from: https:\/\/chatgpt.com\/g\/g-UGPzPUOMD-maihda-gpt."},{"key":"444_CR23","doi-asserted-by":"publisher","DOI":"10.1016\/j.healthplace.2019","volume":"58","author":"J Merlo","year":"2019","unstructured":"Merlo J, Wagner P, Leckie G. A simple multilevel approach for analysing geographical inequalities in public health reports: the case of municipality differences in obesity. Health Place. 2019;58: 102145. https:\/\/doi.org\/10.1016\/j.healthplace.2019.","journal-title":"Health Place"},{"key":"444_CR24","doi-asserted-by":"crossref","unstructured":"Hosmer DW, Lemeshow S. Applied logistic regression. 2nd ed. New York: Wiley; 2000. xii, p 373.","DOI":"10.1002\/0471722146"},{"key":"444_CR25","unstructured":"Raj H, Rosati D, Majumdar S. Measuring reliability of large language models through semantic consistency. arXiv preprint arXiv:221105853. 2022."},{"key":"444_CR26","unstructured":"Wang W, Haddow B, Birch A, Peng W. Assessing the reliability of large language model knowledge. arXiv preprint arXiv:231009820. 2023."},{"issue":"4","key":"444_CR27","doi-asserted-by":"publisher","first-page":"89","DOI":"10.1007\/s10462-024-10720-7","volume":"57","author":"R Bridgelall","year":"2024","unstructured":"Bridgelall R. Unraveling the mysteries of AI chatbots. Artif Intell Rev. 2024;57(4):89.","journal-title":"Artif Intell Rev"},{"issue":"2","key":"444_CR28","doi-asserted-by":"publisher","first-page":"38","DOI":"10.1007\/s10676-024-09775-5","volume":"26","author":"MT Hicks","year":"2024","unstructured":"Hicks MT, Humphries J, Slater J. ChatGPT is bullshit. Ethics Inf Technol. 2024;26(2):38.","journal-title":"Ethics Inf Technol"},{"issue":"2","key":"444_CR29","doi-asserted-by":"publisher","first-page":"208","DOI":"10.1093\/aje\/kwu108","volume":"180","author":"J Merlo","year":"2014","unstructured":"Merlo J. Invited commentary: multilevel analysis of individual heterogeneity-a fundamental critique of the current probabilistic risk factor epidemiology. Am J Epidemiol. 2014;180(2):208\u201312.","journal-title":"Am J Epidemiol"},{"key":"444_CR30","doi-asserted-by":"publisher","first-page":"74","DOI":"10.1016\/j.socscimed.2017.12.026","volume":"203","author":"J Merlo","year":"2018","unstructured":"Merlo J. Multilevel analysis of individual heterogeneity and discriminatory accuracy (MAIHDA) within an intersectional framework. Soc Sci Med. 2018;203:74\u201380.","journal-title":"Soc Sci Med"},{"key":"444_CR31","unstructured":"OpenAI o3-mini -Pushing the frontier of cost-effective reasoning. [press release]. 2025."},{"key":"444_CR32","unstructured":"Qian J, Wang H, Li Z, Li S, Yan X. Limitations of language models in arithmetic and symbolic induction. arXiv preprint arXiv:220805051. 2022."},{"issue":"22","key":"444_CR33","doi-asserted-by":"publisher","first-page":"8615","DOI":"10.3390\/s22228615","volume":"22","author":"A Mavrogiorgou","year":"2022","unstructured":"Mavrogiorgou A, Kiourtis A, Kleftakis S, Mavrogiorgos K, Zafeiropoulos N, Kyriazis D. A catalogue of machine learning algorithms for healthcare risk predictions. Sensors. 2022;22(22):8615.","journal-title":"Sensors"},{"key":"444_CR34","volume-title":"Information systems","author":"A Kiourtis","year":"2024","unstructured":"Kiourtis A, Mavrogiorgou A, Kyriazis D. \u0391 cross-sector data space for correlating environmental risks with human health. In: Information systems. Cham: Springer Nature Switzerland; 2024."},{"issue":"5","key":"444_CR35","doi-asserted-by":"publisher","first-page":"369","DOI":"10.5455\/aim.2019.27.369-373","volume":"27","author":"D Kyriazis","year":"2019","unstructured":"Kyriazis D, Autexier S, Boniface M, Engen V, Jimenez-Peris R, Jordan B, et al. The CrowdHEALTH project and the hollistic health records: collective wisdom driving public health policies. Acta Inform Med. 2019;27(5):369\u201373.","journal-title":"Acta Inform Med"},{"key":"444_CR36","doi-asserted-by":"crossref","unstructured":"Re\u0161\u010di\u010d N, Alberts J, Altenburg TM, Chinapaw MJM, Nigro AD, Fenoglio D, et al. SmartCHANGE: AI-based long-term health risk evaluation for driving behaviour change strategies in children and youth. In: 2023 International Conference on Applied Mathematics & Computer Science (ICAMCS); 2023.","DOI":"10.1109\/ICAMCS59110.2023.00020"},{"key":"444_CR37","doi-asserted-by":"crossref","unstructured":"Mavrogiorgou A, Kleftakis S, Mavrogiorgos K, Zafeiropoulos N, Menychtas A, Kiourtis A, et al. beHEALTHIER: A Microservices Platform for Analyzing and Exploiting Healthcare Data. In: 2021 IEEE 34th International Symposium on Computer-Based Medical Systems (CBMS); 2021.","DOI":"10.1109\/CBMS52027.2021.00078"},{"issue":"9","key":"444_CR38","doi-asserted-by":"publisher","first-page":"1978","DOI":"10.3390\/s19091978","volume":"19","author":"A Mavrogiorgou","year":"2019","unstructured":"Mavrogiorgou A, Kiourtis A, Perakis K, Pitsios S, Kyriazis D. IoT in healthcare: achieving interoperability of high-quality data acquired by IoT medical devices. Sensors. 2019;19(9):1978.","journal-title":"Sensors"},{"key":"444_CR39","doi-asserted-by":"publisher","DOI":"10.1016\/j.cmpb.2019.06.026","volume":"181","author":"A Mavrogiorgou","year":"2019","unstructured":"Mavrogiorgou A, Kiourtis A, Perakis K, Miltiadou D, Pitsios S, Kyriazis D. Analyzing data and data sources towards a unified approach for ensuring end-to-end data and data sources quality in healthcare 4.0. Comput Methods Programs Biomed. 2019;181: 104967.","journal-title":"Comput Methods Programs Biomed"}],"container-title":["Discover Artificial Intelligence"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s44163-025-00444-0.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s44163-025-00444-0\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s44163-025-00444-0.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,9,7]],"date-time":"2025-09-07T21:34:27Z","timestamp":1757280867000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s44163-025-00444-0"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,7,24]]},"references-count":39,"journal-issue":{"issue":"1","published-online":{"date-parts":[[2025,12]]}},"alternative-id":["444"],"URL":"https:\/\/doi.org\/10.1007\/s44163-025-00444-0","relation":{},"ISSN":["2731-0809"],"issn-type":[{"type":"electronic","value":"2731-0809"}],"subject":[],"published":{"date-parts":[[2025,7,24]]},"assertion":[{"value":"17 March 2025","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"16 July 2025","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"24 July 2025","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"Ethic approval was not needed since all data is simulated.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Ethics approval and consent for participate"}},{"value":"Not applicable.","order":3,"name":"Ethics","group":{"name":"EthicsHeading","label":"Consent for publication"}},{"value":"The authors declare no competing interests.","order":4,"name":"Ethics","group":{"name":"EthicsHeading","label":"Competing interests"}}],"article-number":"175"}}