{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,1]],"date-time":"2026-03-01T11:02:02Z","timestamp":1772362922082,"version":"3.50.1"},"reference-count":78,"publisher":"Association for Computing Machinery (ACM)","issue":"3","content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Web"],"published-print":{"date-parts":[[2025,8,31]]},"abstract":"<jats:p>Understanding user intents in information access scenarios can help us provide more relevant and personalized search results and recommendations. However, analyzing user intents is not easy, especially for emerging forms of Web search such as Artificial Intelligence (AI)-driven chat. To understand user intents from retrospective log data, we need a way to label them with meaningful categories that capture their diversity and dynamics. Existing methods rely on manual or Machine-Learned (ML) labeling, which is either expensive or inflexible for large and dynamic datasets. Large Language Models (LLMs) could generate rich and relevant concepts, descriptions, and examples for user intents using log data of user interactions. However, using LLMs to generate a user intent taxonomy and applying it for a given Information Retrieval (IR) application can be problematic for two main reasons: (1) such a taxonomy is not externally validated; and (2) there may be an undesirable feedback loop if an LLM does both these tasks without external validation. To address this, we propose a new methodology with human experts and assessors to verify the quality of the LLM-generated taxonomy. We also present an end-to-end pipeline that uses an LLM with Human-in-the-Loop (HITL) to produce, refine, and apply labels for user intent analysis in log data. We demonstrate its effectiveness by uncovering new insights into user intents from search and chat logs from the Microsoft Bing Web search engine. The novelty in this research stems from the method for generating purpose-driven user intent taxonomies with strong validation. Our approach not only helps remove methodological and practical bottlenecks from intent-focused research, but also provides a new framework for generating, validating, and applying other kinds of taxonomies in a scalable and adaptable way, with reasonable human effort.<\/jats:p>","DOI":"10.1145\/3732294","type":"journal-article","created":{"date-parts":[[2025,5,6]],"date-time":"2025-05-06T07:19:59Z","timestamp":1746515999000},"page":"1-29","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":15,"title":["Using Large Language Models to Generate, Validate, and Apply User Intent Taxonomies"],"prefix":"10.1145","volume":"19","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-3797-4293","authenticated-orcid":false,"given":"Chirag","family":"Shah","sequence":"first","affiliation":[{"name":"University of Washington","place":["Seattle, United States"]}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-0265-4249","authenticated-orcid":false,"given":"Ryen","family":"White","sequence":"additional","affiliation":[{"name":"Microsoft Research, Microsoft","place":["Redmond, United States"]}]},{"ORCID":"https:\/\/orcid.org\/0009-0001-4878-6781","authenticated-orcid":false,"given":"Reid","family":"Andersen","sequence":"additional","affiliation":[{"name":"Microsoft Corp","place":["Redmond, United States"]}]},{"ORCID":"https:\/\/orcid.org\/0009-0006-4044-3756","authenticated-orcid":false,"given":"Georg","family":"Buscher","sequence":"additional","affiliation":[{"name":"Microsoft Corp","place":["Redmond, United States"]}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-1507-5200","authenticated-orcid":false,"given":"Scott","family":"Counts","sequence":"additional","affiliation":[{"name":"Microsoft Research","place":["Redmond, United States"]}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-1052-4142","authenticated-orcid":false,"given":"Sarkar","family":"Das","sequence":"additional","affiliation":[{"name":"Penn State University","place":["University Park, United States"]}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-5467-1331","authenticated-orcid":false,"given":"Ali","family":"Montazer","sequence":"additional","affiliation":[{"name":"University of Massachusetts","place":["Amherst, United States"]}]},{"ORCID":"https:\/\/orcid.org\/0009-0001-1166-0472","authenticated-orcid":false,"given":"Sathish","family":"Manivannan","sequence":"additional","affiliation":[{"name":"Microsoft Corp","place":["Redmond, United States"]}]},{"ORCID":"https:\/\/orcid.org\/0009-0007-1157-018X","authenticated-orcid":false,"given":"Jennifer","family":"Neville","sequence":"additional","affiliation":[{"name":"Microsoft Research","place":["Redmond, United States"]}]},{"ORCID":"https:\/\/orcid.org\/0009-0005-3231-2863","authenticated-orcid":false,"given":"Nagu","family":"Rangan","sequence":"additional","affiliation":[{"name":"Microsoft Corp","place":["Redmond, United States"]}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-3553-4331","authenticated-orcid":false,"given":"Tara","family":"Safavi","sequence":"additional","affiliation":[{"name":"Microsoft Research","place":["Redmond, United States"]}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-1318-8140","authenticated-orcid":false,"given":"Siddharth","family":"Suri","sequence":"additional","affiliation":[{"name":"Microsoft Research","place":["New York, United States"]}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-5298-1221","authenticated-orcid":false,"given":"Mengting","family":"Wan","sequence":"additional","affiliation":[{"name":"Microsoft Corp","place":["Redmond, United States"]}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-5111-8795","authenticated-orcid":false,"given":"Leijie","family":"Wang","sequence":"additional","affiliation":[{"name":"University of Washington","place":["Seattle, United States"]}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-6615-8615","authenticated-orcid":false,"given":"Longqi","family":"Yang","sequence":"additional","affiliation":[{"name":"Microsoft Corp","place":["Redmond, United States"]}]}],"member":"320","published-online":{"date-parts":[[2025,8,22]]},"reference":[{"key":"e_1_3_3_2_2","doi-asserted-by":"publisher","DOI":"10.1177\/1609406920984608"},{"key":"e_1_3_3_3_2","doi-asserted-by":"publisher","DOI":"10.1145\/2348283.2348328"},{"key":"e_1_3_3_4_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.aiopen.2023.08.001"},{"key":"e_1_3_3_5_2","doi-asserted-by":"publisher","DOI":"10.1145\/3477495.3531737"},{"key":"e_1_3_3_6_2","doi-asserted-by":"crossref","unstructured":"Maryam Amirizaniani Tanya Roosta Aman Chadha and Chirag Shah. 2024. AuditLLM: A tool for auditing large language models using multiprobe approach. arXiv:2402.09334. Retrieved from https:\/\/arxiv.org\/abs\/2402.09334","DOI":"10.1145\/3627673.3679222"},{"key":"e_1_3_3_7_2","unstructured":"Maryam Amirizaniani Jihan Yao Adrian Lavergne Elizabeth Snell Okada Aman Chadha Tanya Roosta and Chirag Shah. 2024. Developing a framework for auditing large language models using human-in-the-loop. arXiv:2402.09346. Retrieved from https:\/\/arxiv.org\/abs\/2402.09346"},{"key":"e_1_3_3_8_2","unstructured":"Dogu Araci. 2019. Finbert: Financial sentiment analysis with pre-trained language models. arXiv:1908.10063. Retrieved from https:\/\/arxiv.org\/abs\/1908.10063"},{"key":"e_1_3_3_9_2","unstructured":"Muneera Bano Didar Zowghi and Jon Whittle. 2023. Exploring qualitative research using LLMs. arXiv:2306.13298. Retrieved from https:\/\/arxiv.org\/abs\/2306.13298"},{"key":"e_1_3_3_10_2","doi-asserted-by":"publisher","DOI":"10.1145\/3539618.3591923"},{"key":"e_1_3_3_11_2","doi-asserted-by":"publisher","DOI":"10.1145\/3477495.3531926"},{"key":"e_1_3_3_12_2","doi-asserted-by":"publisher","DOI":"10.1191\/1478088706qp063oa"},{"key":"e_1_3_3_13_2","doi-asserted-by":"publisher","DOI":"10.1145\/792550.792552"},{"key":"e_1_3_3_14_2","first-page":"1877","article-title":"Language models are few-shot learners","volume":"33","author":"Brown Tom","year":"2020","unstructured":"Tom Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared D Kaplan, Prafulla Dhariwal, Arvind Neelakantan, Pranav Shyam, Girish Sastry, Amanda Askell, et\u00a0al. 2020. Language models are few-shot learners. Advances in Neural Information Processing Systems 33 (2020), 1877\u20131901.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_3_15_2","doi-asserted-by":"publisher","DOI":"10.1145\/3544549.3582749"},{"key":"e_1_3_3_16_2","doi-asserted-by":"publisher","DOI":"10.1007\/s11042-019-07880-y"},{"key":"e_1_3_3_17_2","first-page":"446","volume-title":"Proceedings of the 23rd Conference on Very Large Databases","volume":"97","author":"Chakrabarti Soumen","year":"1997","unstructured":"Soumen Chakrabarti, Byron Dom, Rakesh Agrawal, and Prabhakar Raghavan. 1997. Using taxonomy, discriminants, and signatures for navigating in text databases. In Proceedings of the 23rd Conference on Very Large Databases, Vol. 97. 446\u2013455."},{"key":"e_1_3_3_18_2","doi-asserted-by":"publisher","DOI":"10.1145\/2187980.2188206"},{"key":"e_1_3_3_19_2","doi-asserted-by":"publisher","DOI":"10.1145\/1183614.1183768"},{"key":"e_1_3_3_20_2","doi-asserted-by":"crossref","unstructured":"Eunsol Choi He He Mohit Iyyer Mark Yatskar Wen-tau Yih Yejin Choi Percy Liang and Luke Zettlemoyer. 2018. QuAC: Question answering in context. arXiv:1808.07036. Retrieved from https:\/\/arxiv.org\/abs\/1808.07036","DOI":"10.18653\/v1\/D18-1241"},{"key":"e_1_3_3_21_2","doi-asserted-by":"publisher","DOI":"10.1108\/14684520310489032"},{"key":"e_1_3_3_22_2","doi-asserted-by":"publisher","DOI":"10.1177\/001316446002000104"},{"key":"e_1_3_3_23_2","doi-asserted-by":"publisher","DOI":"10.1145\/1341531.1341545"},{"key":"e_1_3_3_24_2","doi-asserted-by":"crossref","unstructured":"Jeffrey Dalton Chenyan Xiong and Jamie Callan. 2020. TREC CAsT 2019: The conversational assistance track overview. arXiv:2003.13624. Retrieved from https:\/\/arxiv.org\/abs\/2003.13624","DOI":"10.6028\/NIST.SP.1266.cast-overview"},{"key":"e_1_3_3_25_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.ijinfomgt.2023.102642"},{"key":"e_1_3_3_26_2","doi-asserted-by":"publisher","DOI":"10.1145\/2556195.2556217"},{"key":"e_1_3_3_27_2","doi-asserted-by":"publisher","DOI":"10.1145\/3578337.3605136"},{"key":"e_1_3_3_28_2","doi-asserted-by":"publisher","DOI":"10.1037\/h0031619"},{"key":"e_1_3_3_29_2","doi-asserted-by":"publisher","DOI":"10.1145\/3366424.3382183"},{"key":"e_1_3_3_30_2","doi-asserted-by":"publisher","DOI":"10.1145\/3613905.3650786"},{"key":"e_1_3_3_31_2","doi-asserted-by":"publisher","DOI":"10.1145\/3397271.3401418"},{"key":"e_1_3_3_32_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2020.nlpcss-1.2"},{"key":"e_1_3_3_33_2","doi-asserted-by":"publisher","DOI":"10.1145\/3589335.3648332"},{"key":"e_1_3_3_34_2","unstructured":"Xingwei He Zhenghao Lin Yeyun Gong Alex Jin Hang Zhang Chen Lin Jian Jiao Siu Ming Yiu Nan Duan Weizhu Chen et\u00a0al. 2023. Annollm: Making large language models to be better crowdsourced annotators. arXiv:2303.16854. Retrieved from https:\/\/arxiv.org\/abs\/2303.16854"},{"key":"e_1_3_3_35_2","doi-asserted-by":"publisher","DOI":"10.1073\/pnas.1807184115"},{"key":"e_1_3_3_36_2","doi-asserted-by":"publisher","DOI":"10.1145\/3576896"},{"key":"e_1_3_3_37_2","doi-asserted-by":"publisher","DOI":"10.1145\/1242572.1242739"},{"key":"e_1_3_3_38_2","unstructured":"Vitor Jeronymo Luiz Bonifacio Hugo Abonizio Marzieh Fadaee Roberto Lotufo Jakub Zavrel and Rodrigo Nogueira. 2023. InPars-v2: Large language models as efficient dataset generators for information retrieval. arXiv:2301.01820. Retrieved from https:\/\/arxiv.org\/abs\/2301.01820"},{"key":"e_1_3_3_39_2","doi-asserted-by":"publisher","DOI":"10.1145\/3530019.3535305"},{"key":"e_1_3_3_40_2","doi-asserted-by":"publisher","DOI":"10.5555\/1241540.1241550"},{"key":"e_1_3_3_41_2","doi-asserted-by":"publisher","DOI":"10.1145\/2766462.2767757"},{"key":"e_1_3_3_42_2","first-page":"1","article-title":"An update for taxonomy designers: Methodological guidance from information systems research","author":"Kundisch Dennis","year":"2021","unstructured":"Dennis Kundisch, Jan Muntermann, Anna Maria Oberl\u00e4nder, Daniel Rau, Maximilian R\u00f6glinger, Thorsten Schoormann, and Daniel Szopinski. 2021. An update for taxonomy designers: Methodological guidance from information systems research. Business and Information Systems Engineering (2021), 1\u201319.","journal-title":"Business and Information Systems Engineering"},{"key":"e_1_3_3_43_2","doi-asserted-by":"publisher","DOI":"10.1145\/3477495.3532074"},{"key":"e_1_3_3_44_2","doi-asserted-by":"publisher","DOI":"10.1145\/1390334.1390393"},{"key":"e_1_3_3_45_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.ipm.2008.07.005"},{"key":"e_1_3_3_46_2","doi-asserted-by":"publisher","DOI":"10.29085\/9781783304837"},{"key":"e_1_3_3_47_2","doi-asserted-by":"publisher","DOI":"10.1145\/3295750.3298922"},{"key":"e_1_3_3_48_2","doi-asserted-by":"publisher","DOI":"10.1145\/3539618.3592032"},{"key":"e_1_3_3_49_2","doi-asserted-by":"publisher","DOI":"10.1145\/1121949.1121979"},{"key":"e_1_3_3_50_2","doi-asserted-by":"publisher","DOI":"10.1145\/2911451.2914746"},{"key":"e_1_3_3_51_2","first-page":"1","article-title":"Auditing large language models: A three-layered approach","author":"M\u00f6kander Jakob","year":"2023","unstructured":"Jakob M\u00f6kander, Jonas Schuett, Hannah Rose Kirk, and Luciano Floridi. 2023. Auditing large language models: A three-layered approach. AI and Ethics (2023), 1\u201331.","journal-title":"AI and Ethics"},{"key":"e_1_3_3_52_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-031-36336-8_4"},{"key":"e_1_3_3_53_2","doi-asserted-by":"publisher","DOI":"10.1007\/3-540-44466-1_38"},{"key":"e_1_3_3_54_2","doi-asserted-by":"publisher","DOI":"10.5220\/0005591001790186"},{"key":"e_1_3_3_55_2","doi-asserted-by":"publisher","DOI":"10.1145\/3020165.3020183"},{"key":"e_1_3_3_56_2","doi-asserted-by":"publisher","DOI":"10.1145\/3404835.3462897"},{"key":"e_1_3_3_57_2","doi-asserted-by":"publisher","DOI":"10.1145\/988672.988675"},{"key":"e_1_3_3_58_2","doi-asserted-by":"publisher","DOI":"10.1145\/3709599"},{"key":"e_1_3_3_59_2","doi-asserted-by":"publisher","DOI":"10.1145\/3576840.3578288"},{"key":"e_1_3_3_60_2","first-page":"1","article-title":"Large language models encode clinical knowledge","author":"Singhal Karan","year":"2023","unstructured":"Karan Singhal, Shekoofeh Azizi, Tao Tu, S Sara Mahdavi, Jason Wei, Hyung Won Chung, Nathan Scales, Ajay Tanwani, Heather Cole-Lewis, Stephen Pfohl, et\u00a0al. 2023. Large language models encode clinical knowledge. Nature (2023), 1\u20139.","journal-title":"Nature"},{"issue":"1","key":"e_1_3_3_61_2","article-title":"Why do people use ChatGPT? Exploring user motivations for generative conversational AI","volume":"29","author":"Skjuve Marita","year":"2024","unstructured":"Marita Skjuve, Petter Bae Brandtz\u00e6g, and Asbj\u00f8rn F\u00f8lstad. 2024. Why do people use ChatGPT? Exploring user motivations for generative conversational AI. First Monday 29, 1 (2024).","journal-title":"First Monday"},{"key":"e_1_3_3_62_2","doi-asserted-by":"publisher","DOI":"10.1145\/584792.584913"},{"key":"e_1_3_3_63_2","doi-asserted-by":"publisher","DOI":"10.1145\/3159652.3159714"},{"key":"e_1_3_3_64_2","unstructured":"Siddharth Suri Scott Counts Leijie Wang Chacha Chen Mengting Wan Tara Safavi Jennifer Neville Chirag Shah Ryen W White Reid Andersen et\u00a0al. 2024. The use of generative search engines for knowledge work and complex tasks. arXiv:2404.04268. Retrieved from https:\/\/arxiv.org\/abs\/2404.04268"},{"key":"e_1_3_3_65_2","doi-asserted-by":"publisher","DOI":"10.1145\/1277741.1277770"},{"key":"e_1_3_3_66_2","doi-asserted-by":"publisher","DOI":"10.1145\/3626772.3657707"},{"key":"e_1_3_3_67_2","unstructured":"Petter T\u00f6rnberg. 2023. How to use LLMs for text analysis. arXiv:2307.13106. Retrieved from https:\/\/arxiv.org\/abs\/2307.13106"},{"key":"e_1_3_3_68_2","doi-asserted-by":"publisher","DOI":"10.1145\/2872427.2883058"},{"key":"e_1_3_3_69_2","doi-asserted-by":"publisher","DOI":"10.1177\/0165551515615833"},{"key":"e_1_3_3_70_2","doi-asserted-by":"publisher","DOI":"10.1145\/3409256.3409822"},{"key":"e_1_3_3_71_2","first-page":"1","article-title":"Guidance for researchers and peer-reviewers on the ethical use of Large Language Models (LLMs) in scientific research workflows","author":"Watkins Ryan","year":"2023","unstructured":"Ryan Watkins. 2023. Guidance for researchers and peer-reviewers on the ethical use of Large Language Models (LLMs) in scientific research workflows. AI and Ethics (2023), 1\u20136.","journal-title":"AI and Ethics"},{"key":"e_1_3_3_72_2","doi-asserted-by":"publisher","DOI":"10.1145\/1242572.1242576"},{"key":"e_1_3_3_73_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.future.2022.05.014"},{"key":"e_1_3_3_74_2","doi-asserted-by":"publisher","DOI":"10.1016\/S0306-4573(01)00018-8"},{"key":"e_1_3_3_75_2","doi-asserted-by":"publisher","DOI":"10.5555\/2390948.2391093"},{"key":"e_1_3_3_76_2","doi-asserted-by":"publisher","DOI":"10.1561\/1500000081"},{"key":"e_1_3_3_77_2","doi-asserted-by":"publisher","DOI":"10.1145\/290941.290956"},{"key":"e_1_3_3_78_2","doi-asserted-by":"publisher","DOI":"10.1145\/3331184.3331198"},{"key":"e_1_3_3_79_2","doi-asserted-by":"publisher","DOI":"10.1145\/3539597.3570445"}],"container-title":["ACM Transactions on the Web"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3732294","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,8,22]],"date-time":"2025-08-22T15:27:21Z","timestamp":1755876441000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3732294"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,8,22]]},"references-count":78,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2025,8,31]]}},"alternative-id":["10.1145\/3732294"],"URL":"https:\/\/doi.org\/10.1145\/3732294","relation":{},"ISSN":["1559-1131","1559-114X"],"issn-type":[{"value":"1559-1131","type":"print"},{"value":"1559-114X","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,8,22]]},"assertion":[{"value":"2024-07-15","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2025-03-12","order":2,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2025-08-22","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}