{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,7,24]],"date-time":"2026-07-24T03:04:45Z","timestamp":1784862285920,"version":"3.55.0"},"reference-count":47,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2026,2,23]],"date-time":"2026-02-23T00:00:00Z","timestamp":1771804800000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2026,3,31]],"date-time":"2026-03-31T00:00:00Z","timestamp":1774915200000},"content-version":"vor","delay-in-days":36,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"name":"NIH","award":["R01LM013337"],"award-info":[{"award-number":["R01LM013337"]}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["npj Digit. Med."],"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:p>Social media is a critical platform for understanding and fostering public engagement with health interventions. However, the lack of real-time social media infoveillance on public health issues may lead to delayed responses and suboptimal policy adjustments. To address this gap, we developed PH-LLM\u2014a novel suite of large language models (LLMs) designed for real-time public health monitoring. We curated a multilingual training corpus and trained PH-LLM using QLoRA and LoRA plus, leveraging Qwen 2.5. We constructed a benchmark comprising 19 English and 20 multilingual held-out tasks and evaluated PH-LLM\u2019s zero-shot performance. PH-LLM consistently outperformed baseline LLMs of similar and larger sizes. PH-LLM-14B and PH-LLM-32B surpassed Qwen2.5-72B-Instruct, Llama-3.1-70B-Instruct, Mistral-Large-Instruct-2407, and GPT-4o in both English tasks (&gt;=56.0% vs. &lt;= 52.3%) and multilingual tasks (&gt;=59.6% vs. &lt;= 59.1%). 
PH-LLM represents a significant advancement in real-time public health infoveillance, offering state-of-the-art multilingual capabilities and cost-effective solutions for monitoring public sentiment on health issues.<\/jats:p>","DOI":"10.1038\/s41746-026-02435-6","type":"journal-article","created":{"date-parts":[[2026,2,23]],"date-time":"2026-02-23T07:05:04Z","timestamp":1771830304000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":1,"title":["A suite of large language models for public health infoveillance"],"prefix":"10.1038","volume":"9","author":[{"given":"Xinyu","family":"Zhou","sequence":"first","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Jiaqi","family":"Zhou","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Chiyu","family":"Wang","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Qianqian","family":"Xie","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Kaize","family":"Ding","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Chengsheng","family":"Mao","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Yuntian","family":"Liu","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Zhiyuan","family":"Cao","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Huangrui","family":"Chu","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Xi","family":"Chen","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Hua","family":"Xu","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"
author"}]},{"given":"Heidi J.","family":"Larson","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Yuan","family":"Luo","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"297","published-online":{"date-parts":[[2026,2,23]]},"reference":[{"key":"2435_CR1","doi-asserted-by":"publisher","DOI":"10.2196\/jmir.1157","volume":"11","author":"G Eysenbach","year":"2009","unstructured":"Eysenbach, G. Infodemiology and infoveillance: framework for an emerging set of public health informatics methods to analyze search, communication and publication behavior on the Internet. J. Med. Internet Res. 11, e1157 (2009).","journal-title":"J. Med. Internet Res."},{"key":"2435_CR2","doi-asserted-by":"publisher","DOI":"10.2196\/30979","volume":"1","author":"N Calleja","year":"2021","unstructured":"Calleja, N. et al. A public health research agenda for managing infodemics: methods and results of the first WHO infodemiology conference. JMIR Infodemiol. 1, e30979 (2021).","journal-title":"JMIR Infodemiol."},{"key":"2435_CR3","doi-asserted-by":"publisher","DOI":"10.1136\/bmjgh-2023-013515","volume":"8","author":"K Terry","year":"2023","unstructured":"Terry, K., Yang, F., Yao, Q. & Liu, C. The role of social media in public health crises caused by infectious disease: a scoping review. BMJ Glob. Health 8, e013515 (2023).","journal-title":"BMJ Glob. Health"},{"key":"2435_CR4","doi-asserted-by":"publisher","first-page":"425","DOI":"10.1093\/eurpub\/ckae029","volume":"34","author":"AK Purba","year":"2024","unstructured":"Purba, A. K., Pearce, A., Henderson, M., McKee, M. & Katikireddi, S. V. Social media as a determinant of health. Eur. J. Public Health 34, 425\u2013426 (2024).","journal-title":"Eur. J. Public Health"},{"key":"2435_CR5","unstructured":"Infodemic. 
https:\/\/www.who.int\/health-topics\/infodemic#tab=tab_1."},{"key":"2435_CR6","doi-asserted-by":"publisher","DOI":"10.1038\/s41746-021-00412-9","volume":"4","author":"DV Gunasekeran","year":"2021","unstructured":"Gunasekeran, D. V., Tseng, R. M. W. W., Tham, Y.-C. & Wong, T. Y. Applications of digital health for public health responses to COVID-19: a systematic scoping review of artificial intelligence, telehealth and related technologies. NPJ Digital Med. 4, 40 (2021).","journal-title":"NPJ Digital Med."},{"key":"2435_CR7","doi-asserted-by":"publisher","first-page":"e175","DOI":"10.1016\/S2589-7500(20)30315-0","volume":"3","author":"S-F Tsao","year":"2021","unstructured":"Tsao, S.-F. et al. What social media told us in the time of COVID-19: a scoping review. Lancet Digital Health 3, e175\u2013e194 (2021).","journal-title":"Lancet Digital Health"},{"key":"2435_CR8","doi-asserted-by":"publisher","unstructured":"Espinosa, L. & Salath\u00e9, M. Use of large language models as a scalable approach to understanding public health discourse. medRxiv https:\/\/doi.org\/10.1101\/2024.02.06.24302383 (2024).","DOI":"10.1101\/2024.02.06.24302383"},{"key":"2435_CR9","doi-asserted-by":"publisher","first-page":"2181","DOI":"10.1093\/jamia\/ocae210","volume":"31","author":"Y Guo","year":"2024","unstructured":"Guo, Y., Ovadje, A., Al-Garadi, M. A. & Sarker, A. Evaluating large language models for health-related text classification tasks with public social media data. J. Am. Med. Inform. Assoc. 31, 2181\u20132189 (2024).","journal-title":"J. Am. Med. Inform. Assoc."},{"key":"2435_CR10","doi-asserted-by":"publisher","unstructured":"He, L., Omranian, S., McRoy, S. & Zheng, K. Using Large Language Models for sentiment analysis of health-related social media data: empirical evaluation and practical tips. 
medRxiv https:\/\/doi.org\/10.1101\/2024.03.19.24304544 (2024).","DOI":"10.1101\/2024.03.19.24304544"},{"key":"2435_CR11","first-page":"102723","volume":"42","author":"S Kim","year":"2024","unstructured":"Kim, S., Kim, K. & Jo, C. W. Accuracy of a large language model in distinguishing anti-and pro-vaccination messages on social media: The case of human papillomavirus vaccination. Preventive Med. Rep. 42, 102723 (2024).","journal-title":"Preventive Med. Rep."},{"key":"2435_CR12","doi-asserted-by":"crossref","unstructured":"Shah, S. M., Gillani, S. A., Baig, M. S. A., Saleem, M. A. & Siddiqui, M. H. Advancing depression detection on social media platforms through fine-tuned large language models. Online Social Networks and Media 46, 100311 (2025).","DOI":"10.1016\/j.osnem.2025.100311"},{"key":"2435_CR13","doi-asserted-by":"crossref","unstructured":"Yang, K. et al. MentaLLaMA: interpretable mental health analysis on social media with large language models. In Proc. ACM on Web Conference 2024, 4489\u20134500 (2024).","DOI":"10.1145\/3589334.3648137"},{"key":"2435_CR14","doi-asserted-by":"publisher","first-page":"172","DOI":"10.1038\/s41586-023-06291-2","volume":"620","author":"K Singhal","year":"2023","unstructured":"Singhal, K. et al. Large language models encode clinical knowledge. Nature 620, 172\u2013180 (2023).","journal-title":"Nature"},{"key":"2435_CR15","doi-asserted-by":"crossref","unstructured":"Wu, C. et al. PMC-LLaMA: toward building open-source language models for medicine. J. Am. Med. Inform. Assoc. 31, 1833\u20131843 (2024).","DOI":"10.1093\/jamia\/ocae045"},{"key":"2435_CR16","doi-asserted-by":"crossref","unstructured":"Jiang, Y., Qiu, R., Zhang, Y. & Zhang, P.-F. Balanced and explainable social media analysis for public health with large language models. In Australasian Database Conference, 73\u201386 (Springer, 2023).","DOI":"10.1007\/978-3-031-47843-7_6"},{"key":"2435_CR17","doi-asserted-by":"crossref","unstructured":"Li, W. et al. 
Zero-shot Explainable Mental Health Analysis on Social Media by Incorporating Mental Scales. In Companion Proc. ACM on Web Conference 2024, 959\u2013962 (2024).","DOI":"10.1145\/3589335.3651584"},{"key":"2435_CR18","doi-asserted-by":"crossref","unstructured":"Du, H. et al. Advancing real-time infectious disease forecasting using large language models. Nat. Comput. Sci. 5, 467\u2013480 (2025).","DOI":"10.1038\/s43588-025-00798-6"},{"key":"2435_CR19","unstructured":"Harris, J. et al. Evaluating large language models for public health classification and extraction tasks. Preprint at https:\/\/arxiv.org\/abs\/2405.14766 (2024)."},{"key":"2435_CR20","unstructured":"Touvron, H. et al. Llama: Open and efficient foundation language models. Preprint at https:\/\/arxiv.org\/abs\/2302.13971 (2023)."},{"key":"2435_CR21","unstructured":"Achiam, J. et al. Gpt-4 technical report. Preprint at https:\/\/arxiv.org\/abs\/2303.08774 (2023)."},{"key":"2435_CR22","doi-asserted-by":"crossref","unstructured":"Zheng, Y., Zhang, R., Zhang, J., Ye, Y. & Luo, Z. LlamaFactory: Unified Efficient Fine-Tuning of 100+ Language Models. 400\u2013410 (Association for Computational Linguistics, 2024).","DOI":"10.18653\/v1\/2024.acl-demos.38"},{"key":"2435_CR23","doi-asserted-by":"crossref","unstructured":"Depoux, A. et al. The pandemic of social media panic travels faster than the COVID-19 outbreak. Vol. 27, taaa031 (Oxford University Press, 2020).","DOI":"10.1093\/jtm\/taaa031"},{"key":"2435_CR24","unstructured":"Yang, A. et al. Qwen2 technical report. Preprint at https:\/\/arxiv.org\/abs\/2407.10671 (2024)."},{"key":"2435_CR25","unstructured":"Dubey, A. et al. The llama 3 herd of models. Preprint at https:\/\/arxiv.org\/abs\/2407.21783 (2024)."},{"key":"2435_CR26","unstructured":"Large Enough | Mistral AI | Frontier AI in your hands (2024) https:\/\/mistral.ai\/news\/mistral-large-2407\/."},{"key":"2435_CR27","doi-asserted-by":"crossref","unstructured":"Muennighoff, N. et al. 
Crosslingual generalization through multitask finetuning. 15991\u201316111 (Association for Computational Linguistics, 2023).","DOI":"10.18653\/v1\/2023.acl-long.891"},{"key":"2435_CR28","unstructured":"X API | Products - Twitter Developer Platform, https:\/\/developer.x.com\/en\/products\/x-api."},{"key":"2435_CR29","doi-asserted-by":"publisher","first-page":"382","DOI":"10.1080\/02763869.2020.1826228","volume":"39","author":"J White","year":"2020","unstructured":"White, J. PubMed 2.0. Med. Ref. Serv. Q. 39, 382\u2013387 (2020).","journal-title":"Med. Ref. Serv. Q."},{"key":"2435_CR30","unstructured":"Mukherjee, S. et al. Orca: Progressive learning from complex explanation traces of gpt-4. Preprint at 2306.02707 https:\/\/arxiv.org\/abs\/2306.00270 (2023)."},{"key":"2435_CR31","unstructured":"Han, T. et al. MedAlpaca--an open-source collection of medical conversational AI models and training data. Preprint at https:\/\/arxiv.org\/abs\/2304.08247 (2023)."},{"key":"2435_CR32","unstructured":"Pal, A., Umapathi, L. K. & Sankarasubbu, M. Medmcqa: A large-scale multi-subject multi-choice dataset for medical domain question answering. In Conference on Health, Inference, and Learning, 248\u2013260 (PMLR, 2022)."},{"key":"2435_CR33","unstructured":"Li, H., Koto, F., Wu, M., Aji, A. F. & Baldwin, T. Bactrian-x: Multilingual replicable instruction-following models with low-rank adaptation. Preprint at https:\/\/arxiv.org\/abs\/2305.15011 (2023)."},{"key":"2435_CR34","doi-asserted-by":"crossref","unstructured":"Dettmers, T., Pagnoni, A., Holtzman, A. & Zettlemoyer, L. Qlora: efficient finetuning of quantized LLMs. Adv. Neural Inform. Process. Syst. 36 (2024).","DOI":"10.52202\/075280-0441"},{"key":"2435_CR35","unstructured":"Hu, E. J. et al. LoRA: Low-rank adaptation of large language models. 
Preprint at https:\/\/arxiv.org\/abs\/2106.09685 (2021)."},{"key":"2435_CR36","doi-asserted-by":"publisher","DOI":"10.1038\/s41746-025-01533-1","volume":"8","author":"Q Xie","year":"2025","unstructured":"Xie, Q. et al. Medical foundation large language models for comprehensive text analysis and beyond. npj Digital Med. 8, 141 (2025).","journal-title":"npj Digital Med."},{"key":"2435_CR37","unstructured":"Hayou, S., Ghosh, N. & Yu, B. LoRA+: efficient low rank adaptation of large models. in Proceedings of the 41st International Conference on Machine Learning Vol. 235 Article 712 (JMLR.org, Vienna, Austria, 2024)."},{"key":"2435_CR38","doi-asserted-by":"crossref","unstructured":"Poddar, S., Samad, A. M., Mukherjee, R., Ganguly, N. & Ghosh, S. Caves: A dataset to facilitate explainable classification and summarization of concerns towards covid vaccines. In Proc. 45th International ACM SIGIR Conference on Research and Development in Information Retrieval, 3154\u20133164 (2022).","DOI":"10.1145\/3477495.3531745"},{"key":"2435_CR39","doi-asserted-by":"publisher","first-page":"1023281","DOI":"10.3389\/frai.2023.1023281","volume":"6","author":"M M\u00fcller","year":"2023","unstructured":"M\u00fcller, M., Salath\u00e9, M. & Kummervold, P. E. Covid-twitter-bert: a natural language processing model to analyse covid-19 content on twitter. Front. Artif. Intell. 6, 1023281 (2023).","journal-title":"Front. Artif. Intell."},{"key":"2435_CR40","doi-asserted-by":"publisher","first-page":"4663","DOI":"10.1007\/s40747-021-00608-2","volume":"8","author":"I Mollas","year":"2022","unstructured":"Mollas, I., Chrysopoulou, Z., Karlos, S. & Tsoumakas, G. ETHOS: a multi-label hate speech detection dataset. Complex Intell. Syst. 8, 4663\u20134678 (2022).","journal-title":"Complex Intell. Syst."},{"key":"2435_CR41","doi-asserted-by":"crossref","unstructured":"Kennedy, B. et al. Introducing the Gab Hate Corpus: defining and applying hate-based rhetoric to social media posts at scale. Lang. 
Resour. Eval. 56, 79\u2013108 (2022).","DOI":"10.1007\/s10579-021-09569-x"},{"key":"2435_CR42","unstructured":"Memon, S. A. & Carley, K. M. Characterizing COVID-19 misinformation communities using a novel Twitter dataset. Preprint at https:\/\/arxiv.org\/abs\/2008.00791 (2020)."},{"key":"2435_CR43","doi-asserted-by":"publisher","first-page":"e26895","DOI":"10.2196\/26895","volume":"1","author":"L Lin","year":"2021","unstructured":"Lin, L. et al. Public attitudes and factors of COVID-19 testing hesitancy in the United Kingdom and China: comparative infodemiology study. JMIR Infodemiol. 1, e26895 (2021).","journal-title":"JMIR Infodemiol."},{"key":"2435_CR44","doi-asserted-by":"publisher","first-page":"232","DOI":"10.1016\/j.procs.2021.05.086","volume":"189","author":"MSH Ameur","year":"2021","unstructured":"Ameur, M. S. H. & Aliane, H. AraCOVID19-MFH: Arabic COVID-19 multi-label fake news & hate speech detection dataset. Procedia Comput. Sci. 189, 232\u2013241 (2021).","journal-title":"Procedia Comput. Sci."},{"key":"2435_CR45","doi-asserted-by":"crossref","unstructured":"Saputri, M. S., Mahendra, R. & Adriani, M. Emotion classification on indonesian twitter dataset. In 2018 International Conference on Asian Language Processing (IALP) 90\u201395 (IEEE, 2018)","DOI":"10.1109\/IALP.2018.8629262"},{"key":"2435_CR46","unstructured":"Alqurashi, S., Hamoui, B., Alashaikh, A., Alhindi, A. & Alanazi, E. Eating garlic prevents COVID-19 infection: detecting misinformation on the Arabic content of Twitter. Preprint at https:\/\/arxiv.org\/abs\/2101.05626 (2021)."},{"key":"2435_CR47","doi-asserted-by":"publisher","first-page":"e27632","DOI":"10.2196\/27632","volume":"23","author":"Z Hou","year":"2021","unstructured":"Hou, Z. et al. Assessing COVID-19 vaccine hesitancy, confidence, and public engagement: a global social listening study. J. Med. Internet Res. 23, e27632 (2021).","journal-title":"J. Med. 
Internet Res."}],"container-title":["npj Digital Medicine"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.nature.com\/articles\/s41746-026-02435-6","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/www.nature.com\/articles\/s41746-026-02435-6.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/www.nature.com\/articles\/s41746-026-02435-6.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,3,31]],"date-time":"2026-03-31T19:14:45Z","timestamp":1774984485000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.nature.com\/articles\/s41746-026-02435-6"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2026,2,23]]},"references-count":47,"journal-issue":{"issue":"1","published-online":{"date-parts":[[2026,12]]}},"alternative-id":["2435"],"URL":"https:\/\/doi.org\/10.1038\/s41746-026-02435-6","relation":{},"ISSN":["2398-6352"],"issn-type":[{"value":"2398-6352","type":"electronic"}],"subject":[],"published":{"date-parts":[[2026,2,23]]},"assertion":[{"value":"25 March 2025","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"4 February 2026","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"23 February 2026","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"YL serves on the editorial board of npj Digital Medicine. The other authors have declared no competing interest.","order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Competing interests"}}],"article-number":"270"}}