{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,17]],"date-time":"2026-03-17T18:58:28Z","timestamp":1773773908204,"version":"3.50.1"},"publisher-location":"New York, NY, USA","reference-count":137,"publisher":"ACM","license":[{"start":{"date-parts":[[2024,8,24]],"date-time":"2024-08-24T00:00:00Z","timestamp":1724457600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2024,8,25]]},"DOI":"10.1145\/3637528.3671467","type":"proceedings-article","created":{"date-parts":[[2024,8,25]],"date-time":"2024-08-25T04:54:55Z","timestamp":1724561695000},"page":"6523-6533","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":22,"title":["Grounding and Evaluation for Large Language Models: Practical Challenges and Lessons Learned (Survey)"],"prefix":"10.1145","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-1237-087X","authenticated-orcid":false,"given":"Krishnaram","family":"Kenthapadi","sequence":"first","affiliation":[{"name":"Oracle Health AI, Redwood City, CA, USA"}]},{"ORCID":"https:\/\/orcid.org\/0009-0008-9979-5574","authenticated-orcid":false,"given":"Mehrnoosh","family":"Sameki","sequence":"additional","affiliation":[{"name":"Microsoft Azure AI, Boston, MA, USA"}]},{"ORCID":"https:\/\/orcid.org\/0009-0004-3459-0288","authenticated-orcid":false,"given":"Ankur","family":"Taly","sequence":"additional","affiliation":[{"name":"Google Cloud AI, Sunnyvale, CA, USA"}]}],"member":"320","published-online":{"date-parts":[[2024,8,24]]},"reference":[{"key":"e_1_3_2_1_1_1","doi-asserted-by":"crossref","unstructured":"Martin Abadi Andy Chu Ian Goodfellow H Brendan McMahan Ilya Mironov Kunal Talwar and Li Zhang. 2016. Deep learning with differential privacy. 
In CCS.","DOI":"10.1145\/2976749.2978318"},{"key":"e_1_3_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.1073\/pnas.2307304121"},{"key":"e_1_3_2_1_3_1","unstructured":"Google Cloud AI. 2024. Check grounding | Vertex AI Agent Builder. https:\/\/cloud.google.com\/generative-ai-app-builder\/docs\/check-grounding"},{"key":"e_1_3_2_1_4_1","unstructured":"Microsoft Azure AI. 2024. Groundedness detection. https:\/\/learn.microsoft.com\/en-us\/azure\/ai-services\/content-safety\/quickstart-groundedness"},{"key":"e_1_3_2_1_5_1","volume-title":"The Twelfth International Conference on Learning Representations.","author":"Asai Akari","year":"2024","unstructured":"Akari Asai, Zeqiu Wu, Yizhong Wang, Avirup Sil, and Hannaneh Hajishirzi. 2024. Self-RAG: Learning to Retrieve, Generate, and Critique through Self-Reflection. In The Twelfth International Conference on Learning Representations."},{"key":"e_1_3_2_1_6_1","unstructured":"Amanda Askell Yuntao Bai Anna Chen Dawn Drain Deep Ganguli Tom Henighan Andy Jones Nicholas Joseph Ben Mann Nova DasSarma et al. 2021. A general language assistant as a laboratory for alignment. arXiv preprint arXiv:2112.00861 (2021)."},{"key":"e_1_3_2_1_7_1","volume-title":"International Conference on Machine Learning. PMLR, 457--467","author":"Aydore Sergul","year":"2021","unstructured":"Sergul Aydore, William Brown, Michael Kearns, Krishnaram Kenthapadi, Luca Melis, Aaron Roth, and Ankit A Siva. 2021. Differentially private query release through adaptive projection. In International Conference on Machine Learning. PMLR, 457--467."},{"key":"e_1_3_2_1_8_1","volume-title":"Measuring implicit bias in explicitly unbiased large language models. arXiv preprint arXiv:2402.04105","author":"Bai Xuechunzi","year":"2024","unstructured":"Xuechunzi Bai, Angelina Wang, Ilia Sucholutsky, and Thomas L Griffiths. 2024. Measuring implicit bias in explicitly unbiased large language models. 
arXiv preprint arXiv:2402.04105 (2024)."},{"key":"e_1_3_2_1_9_1","volume-title":"Mihai Christodorescu, Anupam Datta, Soheil Feizi, et al.","author":"Barrett Clark","year":"2023","unstructured":"Clark Barrett, Brad Boyd, Elie Bursztein, Nicholas Carlini, Brad Chen, Jihye Choi, Amrita Roy Chowdhury, Mihai Christodorescu, Anupam Datta, Soheil Feizi, et al. 2023. Identifying and mitigating the security risks of generative AI. Foundations and Trends\u00ae in Privacy and Security 6, 1 (2023), 1--52."},{"key":"e_1_3_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1145\/3442188.3445922"},{"key":"e_1_3_2_1_11_1","unstructured":"Manish Bhatt Sahana Chennabasappa Cyrus Nikolaidis Shengye Wan Ivan Evtimov Dominik Gabi Daniel Song Faizan Ahmad Cornelius Aschermann Lorenzo Fontana et al. 2023. Purple Llama CyberSecEval: A secure coding benchmark for language models. arXiv preprint arXiv:2312.04724 (2023)."},{"key":"e_1_3_2_1_12_1","unstructured":"Tolga Bolukbasi Kai-Wei Chang James Y Zou Venkatesh Saligrama and Adam T Kalai. 2016. Man is to computer programmer as woman is to homemaker? Debiasing word embeddings. In NeurIPS."},{"key":"e_1_3_2_1_13_1","volume-title":"Bowen Baker, Leo Gao, Leopold Aschenbrenner, Yining Chen, Adrien Ecoffet, Manas Joglekar, Jan Leike, et al.","author":"Burns Collin","year":"2023","unstructured":"Collin Burns, Pavel Izmailov, Jan Hendrik Kirchner, Bowen Baker, Leo Gao, Leopold Aschenbrenner, Yining Chen, Adrien Ecoffet, Manas Joglekar, Jan Leike, et al. 2023. Weak-to-strong generalization: Eliciting strong capabilities with weak supervision. arXiv preprint arXiv:2312.09390 (2023)."},{"key":"e_1_3_2_1_14_1","volume-title":"Detecting and mitigating bias in natural language processing. Brookings Institution","author":"Caliskan Aylin","year":"2021","unstructured":"Aylin Caliskan. 2021. Detecting and mitigating bias in natural language processing. 
Brookings Institution (2021)."},{"key":"e_1_3_2_1_15_1","volume-title":"Semantics derived automatically from language corpora contain human-like biases. Science 356, 6334","author":"Caliskan Aylin","year":"2017","unstructured":"Aylin Caliskan, Joanna J Bryson, and Arvind Narayanan. 2017. Semantics derived automatically from language corpora contain human-like biases. Science 356, 6334 (2017)."},{"key":"e_1_3_2_1_16_1","volume-title":"Extracting training data from diffusion models. arXiv preprint arXiv:2301.13188","author":"Carlini Nicholas","year":"2023","unstructured":"Nicholas Carlini, Jamie Hayes, Milad Nasr, Matthew Jagielski, Vikash Sehwag, Florian Tram\u00e8r, Borja Balle, Daphne Ippolito, and Eric Wallace. 2023. Extracting training data from diffusion models. arXiv preprint arXiv:2301.13188 (2023)."},{"key":"e_1_3_2_1_17_1","volume-title":"USENIX Security Symposium","volume":"6","author":"Carlini Nicholas","year":"2021","unstructured":"Nicholas Carlini, Florian Tramer, Eric Wallace, Matthew Jagielski, Ariel Herbert-Voss, Katherine Lee, Adam Roberts, Tom Brown, Dawn Song, Ulfar Erlingsson, Alina Oprea, and Colin Raffel. 2021. Extracting Training Data from Large Language Models. In USENIX Security Symposium, Vol. 6."},{"key":"e_1_3_2_1_18_1","volume-title":"J\u00e9r\u00e9my Scheurer, Javier Rando, Rachel Freedman, Tomasz Korbak, David Lindner, Pedro Freire, et al.","author":"Casper Stephen","year":"2023","unstructured":"Stephen Casper, Xander Davies, Claudia Shi, Thomas Krendl Gilbert, J\u00e9r\u00e9my Scheurer, Javier Rando, Rachel Freedman, Tomasz Korbak, David Lindner, Pedro Freire, et al. 2023. Open problems and fundamental limitations of reinforcement learning from human feedback. arXiv preprint arXiv:2307.15217 (2023)."},{"key":"e_1_3_2_1_19_1","unstructured":"Chi-Min Chan Chunpu Xu Ruibin Yuan Hongyin Luo Wei Xue Yike Guo and Jie Fu. 2024. RQ-RAG: Learning to Refine Queries for Retrieval Augmented Generation. 
arXiv:2404.00610"},{"key":"e_1_3_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2023.findings-emnlp.345"},{"key":"e_1_3_2_1_21_1","doi-asserted-by":"crossref","unstructured":"Tong Chen Hongwei Wang Sihao Chen Wenhao Yu Kaixin Ma Xinran Zhao Hongming Zhang and Dong Yu. 2023. Dense X Retrieval: What Retrieval Granularity Should We Use? arXiv:2312.06648","DOI":"10.18653\/v1\/2024.emnlp-main.845"},{"key":"e_1_3_2_1_22_1","doi-asserted-by":"crossref","unstructured":"Maria De-Arteaga Alexey Romanov Hanna Wallach Jennifer Chayes Christian Borgs Alexandra Chouldechova Sahin Geyik Krishnaram Kenthapadi and Adam Tauman Kalai. 2019. Bias in bios: A case study of semantic representation bias in a high-stakes setting. In FAccT.","DOI":"10.1145\/3287560.3287572"},{"key":"e_1_3_2_1_23_1","unstructured":"Leon Derczynski Erick Galinkin Jeffrey Martin Subho Majumdar and Nanna Inie. 2024. garak: A Framework for Security Probing Large Language Models. https:\/\/garak.ai. (2024)."},{"key":"e_1_3_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1145\/3531146.3533221"},{"key":"e_1_3_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2023.findings-emnlp.754"},{"key":"e_1_3_2_1_26_1","unstructured":"Matthijs Douze Alexandr Guzhva Chengqi Deng Jeff Johnson Gergely Szilvasy Pierre-Emmanuel Mazar\u00e9 Maria Lomeli Lucas Hosseini and Herv\u00e9 J\u00e9gou. 2024. The Faiss library. (2024). arXiv:cs.LG\/2401.08281"},{"key":"e_1_3_2_1_27_1","volume-title":"Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics: System Demonstrations. 150--158","author":"Es Shahul","year":"2024","unstructured":"Shahul Es, Jithin James, Luis Espinosa Anke, and Steven Schockaert. 2024. RAGAs: Automated Evaluation of Retrieval Augmented Generation. In Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics: System Demonstrations. 
150--158."},{"key":"e_1_3_2_1_28_1","volume-title":"Publicly detectable watermarking for language models. arXiv preprint arXiv:2310.18491","author":"Fairoze Jaiden","year":"2023","unstructured":"Jaiden Fairoze, Sanjam Garg, Somesh Jha, Saeed Mahloujifar, Mohammad Mahmoody, and Mingyuan Wang. 2023. Publicly detectable watermarking for language models. arXiv preprint arXiv:2310.18491 (2023)."},{"key":"e_1_3_2_1_29_1","volume-title":"Don't Hallucinate","author":"Feng Shangbin","unstructured":"Shangbin Feng, Weijia Shi, Yike Wang, Wenxuan Ding, Vidhisha Balachandran, and Yulia Tsvetkov. 2024. Don't Hallucinate, Abstain: Identifying LLM Knowledge Gaps via Multi-LLM Collaboration. arXiv:2402.00367"},{"key":"e_1_3_2_1_30_1","volume-title":"Sungchul Kim","author":"Gallegos Isabel O","year":"2023","unstructured":"Isabel O Gallegos, Ryan A Rossi, Joe Barrow, Md Mehrab Tanjim, Sungchul Kim, Franck Dernoncourt, Tong Yu, Ruiyi Zhang, and Nesreen K Ahmed. 2023. Bias and fairness in large language models: A survey. arXiv preprint arXiv:2309.00770 (2023)."},{"key":"e_1_3_2_1_31_1","unstructured":"Deep Ganguli Liane Lovitt Jackson Kernion Amanda Askell Yuntao Bai Saurav Kadavath Ben Mann Ethan Perez Nicholas Schiefer Kamal Ndousse et al. 2022. Red teaming language models to reduce harms: Methods scaling behaviors and lessons learned. arXiv preprint arXiv:2209.07858 (2022)."},{"key":"e_1_3_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1"},{"key":"e_1_3_2_1_33_1","unstructured":"L Gao J Tow B Abbasi S Biderman S Black A DiPofi C Foster L Golding J Hsu A Le Noac'h et al. 2023. A framework for few-shot language model evaluation. Zenodo (2023)."},{"key":"e_1_3_2_1_34_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1"},{"key":"e_1_3_2_1_35_1","unstructured":"Yunfan Gao Yun Xiong Xinyu Gao Kangxiang Jia Jinliu Pan Yuxi Bi Yi Dai Jiawei Sun Meng Wang and Haofen Wang. 2024. Retrieval-Augmented Generation for Large Language Models: A Survey. 
arXiv:2312.10997"},{"key":"e_1_3_2_1_36_1","doi-asserted-by":"publisher","DOI":"10.1073\/pnas.1720347115"},{"key":"e_1_3_2_1_37_1","doi-asserted-by":"publisher","DOI":"10.1613\/jair.1.13715"},{"key":"e_1_3_2_1_38_1","doi-asserted-by":"crossref","unstructured":"Zorik Gekhman Gal Yona Roee Aharoni Matan Eyal Amir Feder Roi Reichart and Jonathan Herzig. 2024. Does Fine-Tuning LLMs on New Knowledge Encourage Hallucinations? arXiv:2405.05904","DOI":"10.18653\/v1\/2024.emnlp-main.444"},{"key":"e_1_3_2_1_39_1","volume-title":"Watermarking pre-trained language models with backdooring. arXiv preprint arXiv:2210.07543","author":"Gu Chenxi","year":"2022","unstructured":"Chenxi Gu, Chengsong Huang, Xiaoqing Zheng, Kai-Wei Chang, and Cho-Jui Hsieh. 2022. Watermarking pre-trained language models with backdooring. arXiv preprint arXiv:2210.07543 (2022)."},{"key":"e_1_3_2_1_40_1","volume-title":"Allie Del Giorno, Sivakanth Gopi, Mojan Javaheripi, Piero Kauffmann, Gustavo de Rosa, Olli Saarikivi, et al.","author":"Gunasekar Suriya","year":"2023","unstructured":"Suriya Gunasekar, Yi Zhang, Jyoti Aneja, Caio C\u00e9sar Teodoro Mendes, Allie Del Giorno, Sivakanth Gopi, Mojan Javaheripi, Piero Kauffmann, Gustavo de Rosa, Olli Saarikivi, et al. 2023. Textbooks are all you need. arXiv preprint arXiv:2306.11644 (2023)."},{"key":"e_1_3_2_1_41_1","doi-asserted-by":"crossref","unstructured":"Anisha Gunjal Jihan Yin and Erhan Bas. 2024. Detecting and preventing hallucinations in large vision language models. In AAAI.","DOI":"10.1609\/aaai.v38i16.29771"},{"key":"e_1_3_2_1_42_1","unstructured":"Zishan Guo Renren Jin Chuang Liu Yufei Huang Dan Shi Linhao Yu Yan Liu Jiaxuan Li Bojian Xiong Deyi Xiong et al. 2023. Evaluating large language models: A comprehensive survey. arXiv preprint arXiv:2310.19736 (2023)."},{"key":"e_1_3_2_1_43_1","volume-title":"Measuring Distributional Shifts in Text: The Advantage of Language Model-Based Embeddings. 
arXiv preprint arXiv:2312.02337","author":"Gupta Gyandev","year":"2023","unstructured":"Gyandev Gupta, Bashir Rastegarpanah, Amalendu Iyer, Joshua Rubin, and Krishnaram Kenthapadi. 2023. Measuring Distributional Shifts in Text: The Advantage of Language Model-Based Embeddings. arXiv preprint arXiv:2312.02337 (2023)."},{"key":"e_1_3_2_1_44_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1"},{"key":"e_1_3_2_1_45_1","volume-title":"What's in a Name? Auditing Large Language Models for Race and Gender Bias. arXiv preprint arXiv:2402.14875","author":"Haim Amit","year":"2024","unstructured":"Amit Haim, Alejandro Salinas, and Julian Nyarko. 2024. What's in a Name? Auditing Large Language Models for Race and Gender Bias. arXiv preprint arXiv:2402.14875 (2024)."},{"key":"e_1_3_2_1_46_1","volume-title":"Dan Jurafsky, and Sharese King.","author":"Hofmann Valentin","year":"2024","unstructured":"Valentin Hofmann, Pratyusha Ria Kalluri, Dan Jurafsky, and Sharese King. 2024. Dialect prejudice predicts AI decisions about people's character, employability, and criminality. arXiv preprint arXiv:2403.00742 (2024)."},{"key":"e_1_3_2_1_47_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1"},{"key":"e_1_3_2_1_48_1","unstructured":"Lei Huang Weijiang Yu Weitao Ma Weihong Zhong Zhangyin Feng Haotian Wang Qianglong Chen Weihua Peng Xiaocheng Feng Bing Qin et al. 2023. A survey on hallucination in large language models: Principles taxonomy challenges and open questions. arXiv preprint arXiv:2311.05232 (2023)."},{"key":"e_1_3_2_1_49_1","unstructured":"Evan Hubinger Carson Denison Jesse Mu Mike Lambert Meg Tong Monte MacDiarmid Tamera Lanham Daniel M Ziegler Tim Maxwell Newton Cheng et al. 2024. Sleeper agents: Training deceptive LLMs that persist through safety training. arXiv preprint arXiv:2401.05566 (2024)."},{"key":"e_1_3_2_1_50_1","volume-title":"Llama Guard: LLM-based input-output safeguard for human-AI conversations. 
arXiv preprint arXiv:2312.06674","author":"Inan Hakan","year":"2023","unstructured":"Hakan Inan, Kartikeya Upasani, Jianfeng Chi, Rashi Rungta, Krithika Iyer, Yuning Mao, Michael Tontchev, Qing Hu, Brian Fuller, Davide Testuggine, et al. 2023. Llama Guard: LLM-based input-output safeguard for human-AI conversations. arXiv preprint arXiv:2312.06674 (2023)."},{"key":"e_1_3_2_1_51_1","unstructured":"Amal Iyer and Krishnaram Kenthapadi. 2023. Introducing Fiddler Auditor: Evaluate the Robustness of LLMs and NLP Models. Fiddler AI Blog."},{"key":"e_1_3_2_1_52_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1"},{"key":"e_1_3_2_1_53_1","volume-title":"Sung Ju Hwang, and Jong C. Park","author":"Jeong Soyeong","year":"2024","unstructured":"Soyeong Jeong, Jinheon Baek, Sukmin Cho, Sung Ju Hwang, and Jong C. Park. 2024. Adaptive-RAG: Learning to Adapt Retrieval-Augmented Large Language Models through Question Complexity. arXiv:2403.14403"},{"key":"e_1_3_2_1_54_1","doi-asserted-by":"publisher","DOI":"10.1145\/3571730"},{"key":"e_1_3_2_1_55_1","volume-title":"Data-Efficient Alignment of Large Language Models with Human Feedback Through Natural Language. In NeurIPS 2023 Workshop on Instruction Tuning and Instruction Following.","author":"Jin Di","year":"2023","unstructured":"Di Jin, Shikib Mehri, Devamanyu Hazarika, Aishwarya Padmakumar, Sungjin Lee, Yang Liu, and Mahdi Namazifar. 2023. Data-Efficient Alignment of Large Language Models with Human Feedback Through Natural Language. In NeurIPS 2023 Workshop on Instruction Tuning and Instruction Following."},{"key":"e_1_3_2_1_56_1","volume-title":"Adversaries Can Misuse Combinations of Safe Models. arXiv preprint arXiv:2406.14595","author":"Jones Erik","year":"2024","unstructured":"Erik Jones, Anca Dragan, and Jacob Steinhardt. 2024. Adversaries Can Misuse Combinations of Safe Models. 
arXiv preprint arXiv:2406.14595 (2024)."},{"key":"e_1_3_2_1_57_1","volume-title":"The Twelfth International Conference on Learning Representations (ICLR).","author":"Jones Erik","year":"2024","unstructured":"Erik Jones, Hamid Palangi, Clarisse Sim\u00f5es Ribeiro, Varun Chandrasekaran, Subhabrata Mukherjee, Arindam Mitra, Ahmed Hassan Awadallah, and Ece Kamar. 2024. Teaching Language Models to Hallucinate Less with Synthetic Tasks. In The Twelfth International Conference on Learning Representations (ICLR)."},{"key":"e_1_3_2_1_58_1","volume-title":"Personas as a way to model truthfulness in language models. arXiv preprint arXiv:2310.18168","author":"Joshi Nitish","year":"2023","unstructured":"Nitish Joshi, Javier Rando, Abulhair Saparov, Najoung Kim, and He He. 2023. Personas as a way to model truthfulness in language models. arXiv preprint arXiv:2310.18168 (2023)."},{"key":"e_1_3_2_1_59_1","unstructured":"Saurav Kadavath Tom Conerly Amanda Askell Tom Henighan Dawn Drain Ethan Perez Nicholas Schiefer Zac Hatfield-Dodds Nova DasSarma Eli Tran-Johnson Scott Johnston Sheer El-Showk Andy Jones Nelson Elhage Tristan Hume Anna Chen Yuntao Bai Sam Bowman Stanislav Fort Deep Ganguli Danny Hernandez Josh Jacobson Jackson Kernion Shauna Kravec Liane Lovitt Kamal Ndousse Catherine Olsson Sam Ringer Dario Amodei Tom Brown Jack Clark Nicholas Joseph Ben Mann Sam McCandlish Chris Olah and Jared Kaplan. 2022. Language Models (Mostly) Know What They Know. arXiv:2207.05221"},{"key":"e_1_3_2_1_60_1","volume-title":"Learning The Difference That Makes A Difference With Counterfactually-Augmented Data. In International Conference on Learning Representations. https:\/\/openreview.net\/forum?id=Sklgs0NFvr","author":"Kaushik Divyansh","year":"2020","unstructured":"Divyansh Kaushik, Eduard Hovy, and Zachary Lipton. 2020. Learning The Difference That Makes A Difference With Counterfactually-Augmented Data. In International Conference on Learning Representations. https:\/\/openreview.net\/forum?id=Sklgs0NFvr"},{"key":"e_1_3_2_1_61_1","doi-asserted-by":"publisher","DOI":"10.1145\/3461702.3462516"},{"key":"e_1_3_2_1_62_1","volume-title":"International Conference on Machine Learning. PMLR, 17061--17084","author":"Kirchenbauer John","year":"2023","unstructured":"John Kirchenbauer, Jonas Geiping, Yuxin Wen, Jonathan Katz, Ian Miers, and Tom Goldstein. 2023. A watermark for large language models. In International Conference on Machine Learning. PMLR, 17061--17084."},{"key":"e_1_3_2_1_63_1","unstructured":"Abdullatif K\u00f6ksal Renat Aksitov and Chung-Ching Chang. 2023. Hallucination Augmented Recitations for Language Models. arXiv:2311.07424"},{"key":"e_1_3_2_1_64_1","unstructured":"Microsoft Learn. 2023. Overview of Responsible AI practices for Azure OpenAI models. Azure AI Services Documentation."},{"key":"e_1_3_2_1_65_1","volume-title":"Lin (Eds.)","volume":"33","author":"Lewis Patrick","year":"2020","unstructured":"Patrick Lewis, Ethan Perez, Aleksandra Piktus, Fabio Petroni, Vladimir Karpukhin, Naman Goyal, Heinrich K\u00fcttler, Mike Lewis, Wen-tau Yih, Tim Rockt\u00e4schel, Sebastian Riedel, and Douwe Kiela. 2020. Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks. In Advances in Neural Information Processing Systems, H. Larochelle, M. Ranzato, R. Hadsell, M.F. Balcan, and H. Lin (Eds.), Vol. 33. Curran Associates, Inc., 9459--9474. https:\/\/proceedings.neurips.cc\/paper_files\/paper\/2020\/file\/6b493230205f780e1bc26945df7481e5-Paper.pdf"},{"key":"e_1_3_2_1_66_1","unstructured":"Percy Liang Rishi Bommasani Tony Lee Dimitris Tsipras Dilara Soylu Michihiro Yasunaga Yian Zhang Deepak Narayanan Yuhuai Wu Ananya Kumar et al. 2022. Holistic evaluation of language models. arXiv preprint arXiv:2211.09110 (2022)."},{"key":"e_1_3_2_1_67_1","volume-title":"Evaluating Verifiability in Generative Search Engines. In The 2023 Conference on Empirical Methods in Natural Language Processing. 
https:\/\/openreview.net\/forum?id=ZQV5iRPAua","author":"Liu Nelson F.","year":"2023","unstructured":"Nelson F. Liu, Tianyi Zhang, and Percy Liang. 2023. Evaluating Verifiability in Generative Search Engines. In The 2023 Conference on Empirical Methods in Natural Language Processing. https:\/\/openreview.net\/forum?id=ZQV5iRPAua"},{"key":"e_1_3_2_1_68_1","unstructured":"Sijia Liu Yuanshun Yao Jinghan Jia Stephen Casper Nathalie Baracaldo Peter Hase Xiaojun Xu Yuguang Yao Hang Li Kush R Varshney et al. 2024. Rethinking Machine Unlearning for Large Language Models. arXiv preprint arXiv:2402.08787 (2024)."},{"key":"e_1_3_2_1_69_1","first-page":"690","article-title":"Iterative methods for private synthetic data: Unifying framework and new methods","volume":"34","author":"Liu Terrance","year":"2021","unstructured":"Terrance Liu, Giuseppe Vietri, and Steven Z Wu. 2021. Iterative methods for private synthetic data: Unifying framework and new methods. Advances in Neural Information Processing Systems 34 (2021), 690--702.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_2_1_70_1","first-page":"16548","article-title":"Are two heads the same as one? Identifying disparate treatment in fair neural networks","volume":"35","author":"Lohaus Michael","year":"2022","unstructured":"Michael Lohaus, Matth\u00e4us Kleindessner, Krishnaram Kenthapadi, Francesco Locatello, and Chris Russell. 2022. Are two heads the same as one? Identifying disparate treatment in fair neural networks. Advances in Neural Information Processing Systems 35 (2022), 16548--16562.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_2_1_71_1","unstructured":"Shayne Longpre Stella Biderman Alon Albalak Hailey Schoelkopf Daniel McDuff Sayash Kapoor Kevin Klyman Kyle Lo Gabriel Ilharco Nay San et al. 2024. The Responsible Foundation Model Development Cheatsheet: A Review of Tools & Resources. 
arXiv preprint arXiv:2406.16746 (2024)."},{"key":"e_1_3_2_1_72_1","volume-title":"Search Augmented Instruction Learning. In The 2023 Conference on Empirical Methods in Natural Language Processing. https:\/\/openreview.net\/forum?id=noIvPGG8P1","author":"Luo Hongyin","unstructured":"Hongyin Luo, Tianhua Zhang, Yung-Sung Chuang, Yuan Gong, Yoon Kim, Xixin Wu, Helen M. Meng, and James R. Glass. 2023. Search Augmented Instruction Learning. In The 2023 Conference on Empirical Methods in Natural Language Processing. https:\/\/openreview.net\/forum?id=noIvPGG8P1"},{"key":"e_1_3_2_1_73_1","volume-title":"Katherine Hermann, Sean Welleck, Amir Yazdanbakhsh, and Peter Clark.","author":"Madaan Aman","year":"2023","unstructured":"Aman Madaan, Niket Tandon, Prakhar Gupta, Skyler Hallinan, Luyu Gao, Sarah Wiegreffe, Uri Alon, Nouha Dziri, Shrimai Prabhumoye, Yiming Yang, Shashank Gupta, Bodhisattwa Prasad Majumder, Katherine Hermann, Sean Welleck, Amir Yazdanbakhsh, and Peter Clark. 2023. Self-Refine: Iterative Refinement with Self-Feedback. https:\/\/arxiv.org\/abs\/2303.17651. arXiv:cs.CL\/2303.17651"},{"key":"e_1_3_2_1_74_1","unstructured":"Ahmed Magooda Alec Helyar Kyle Jackson David Sullivan Chad Atalla Emily Sheng Dan Vann Richard Edgar Hamid Palangi Roman Lutz et al. 2023. A Framework for Automated Measurement of Responsible AI Harms in Generative AI Applications. arXiv preprint arXiv:2310.17750 (2023)."},{"key":"e_1_3_2_1_75_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2023.emnlp-main.557"},{"key":"e_1_3_2_1_76_1","volume-title":"PRP: Propagating Universal Perturbations to Attack Large Language Model Guard-Rails. arXiv preprint arXiv:2402.15911","author":"Mangaokar Neal","year":"2024","unstructured":"Neal Mangaokar, Ashish Hooda, Jihye Choi, Shreyas Chandrashekaran, Kassem Fawaz, Somesh Jha, and Atul Prakash. 2024. PRP: Propagating Universal Perturbations to Attack Large Language Model Guard-Rails. 
arXiv preprint arXiv:2402.15911 (2024)."},{"key":"e_1_3_2_1_77_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2023.findings-emnlp.182"},{"key":"e_1_3_2_1_78_1","first-page":"17359","article-title":"Locating and editing factual associations in GPT","volume":"35","author":"Meng Kevin","year":"2022","unstructured":"Kevin Meng, David Bau, Alex Andonian, and Yonatan Belinkov. 2022. Locating and editing factual associations in GPT. Advances in Neural Information Processing Systems 35 (2022), 17359--17372.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_2_1_79_1","unstructured":"Jacob Menick Maja Trebacz Vladimir Mikulik John Aslanides Francis Song Martin Chadwick Mia Glaese Susannah Young Lucy Campbell-Gillingham Geoffrey Irving and Nat McAleese. 2022. Teaching language models to support answers with verified quotes. arXiv:2203.11147"},{"key":"e_1_3_2_1_80_1","doi-asserted-by":"publisher","DOI":"10.1145\/3476415.3476428"},{"key":"e_1_3_2_1_81_1","volume-title":"International Conference on Machine Learning. PMLR, 24950--24962","author":"Mitchell Eric","year":"2023","unstructured":"Eric Mitchell, Yoonho Lee, Alexander Khazatsky, Christopher D Manning, and Chelsea Finn. 2023. DetectGPT: Zero-shot machine-generated text detection using probability curvature. In International Conference on Machine Learning. PMLR, 24950--24962."},{"key":"e_1_3_2_1_82_1","unstructured":"Sidharth Mudgal Jong Lee Harish Ganapathy YaGuang Li Tao Wang Yanping Huang Zhifeng Chen Heng-Tze Cheng Michael Collins Trevor Strohman Jilin Chen Alex Beutel and Ahmad Beirami. 2024. Controlled Decoding from Language Models. 
arXiv:2310.17022"},{"key":"e_1_3_2_1_83_1","volume-title":"David Amore Cecchini, Rakshit Khajuria, Prikshit Sharma, Ali Tarik Mirik, Veysel Kocaman, and David Talby.","author":"Nazir Arshaan","year":"2024","unstructured":"Arshaan Nazir, Thadaka Kalyan Chakravarthy, David Amore Cecchini, Rakshit Khajuria, Prikshit Sharma, Ali Tarik Mirik, Veysel Kocaman, and David Talby. 2024. LangTest: A comprehensive evaluation library for custom LLM and NLP models. Software Impacts (2024), 100619."},{"key":"e_1_3_2_1_84_1","volume-title":"XInstructBLIP: A Framework for aligning X-Modal instruction-aware representations to LLMs and Emergent Cross-modal Reasoning. arXiv preprint arXiv:2311.18799","author":"Panagopoulou Artemis","year":"2023","unstructured":"Artemis Panagopoulou, Le Xue, Ning Yu, Junnan Li, Dongxu Li, Shafiq Joty, Ran Xu, Silvio Savarese, Caiming Xiong, and Juan Carlos Niebles. 2023. XInstructBLIP: A Framework for aligning X-Modal instruction-aware representations to LLMs and Emergent Cross-modal Reasoning. arXiv preprint arXiv:2311.18799 (2023)."},{"key":"e_1_3_2_1_85_1","volume-title":"Scalable Private Learning with PATE. In International Conference on Learning Representations.","author":"Papernot Nicolas","year":"2018","unstructured":"Nicolas Papernot, Shuang Song, Ilya Mironov, Ananth Raghunathan, Kunal Talwar, and Ulfar Erlingsson. 2018. Scalable Private Learning with PATE. In International Conference on Learning Representations."},{"key":"e_1_3_2_1_86_1","unstructured":"Baolin Peng Michel Galley Pengcheng He Hao Cheng Yujia Xie Yu Hu Qiuyuan Huang Lars Liden Zhou Yu Weizhu Chen and Jianfeng Gao. 2023. Check Your Facts and Try Again: Improving Large Language Models with External Knowledge and Automated Feedback. arXiv:2302.12813"},{"key":"e_1_3_2_1_87_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2022.emnlp-main.225"},{"key":"e_1_3_2_1_88_1","unstructured":"Pinecone. [n. d.]. Pinecone Vector Database. 
http:\/\/pinecone.io"},{"key":"e_1_3_2_1_89_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2023.emnlp-main.255"},{"key":"e_1_3_2_1_90_1","volume-title":"The Twelfth International Conference on Learning Representations.","author":"Qi Xiangyu","year":"2024","unstructured":"Xiangyu Qi, Yi Zeng, Tinghao Xie, Pin-Yu Chen, Ruoxi Jia, Prateek Mittal, and Peter Henderson. 2024. Fine-tuning Aligned Language Models Compromises Safety, Even When Users Do Not Intend To!. In The Twelfth International Conference on Learning Representations."},{"key":"e_1_3_2_1_91_1","article-title":"Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer","volume":"21","author":"Raffel Colin","year":"2020","unstructured":"Colin Raffel, Noam Shazeer, Adam Roberts, Katherine Lee, Sharan Narang, Michael Matena, Yanqi Zhou, Wei Li, and Peter J. Liu. 2020. Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer. J. Mach. Learn. Res. 21 (2020), 140:1--140:67. http:\/\/jmlr.org\/papers\/v21\/20-074.html","journal-title":"J. Mach. Learn. Res."},{"key":"e_1_3_2_1_92_1","doi-asserted-by":"publisher","DOI":"10.1162\/coli_a_00486"},{"key":"e_1_3_2_1_93_1","doi-asserted-by":"publisher","DOI":"10.1109\/SaTML54575.2023.00039"},{"key":"e_1_3_2_1_94_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2023.emnlp-main.155"},{"key":"e_1_3_2_1_95_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2023.emnlp-demo.40"},{"key":"e_1_3_2_1_96_1","volume-title":"Large Language Models are Biased Because They Are Large Language Models. arXiv preprint arXiv:2406.13138","author":"Resnik Philip","year":"2024","unstructured":"Philip Resnik. 2024. Large Language Models are Biased Because They Are Large Language Models. 
arXiv preprint arXiv:2406.13138 (2024)."},{"key":"e_1_3_2_1_97_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1"},{"key":"e_1_3_2_1_98_1","volume-title":"Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies","volume":"1","author":"Romanov Alexey","year":"2019","unstructured":"Alexey Romanov, Maria De-Arteaga, Hanna Wallach, Jennifer Chayes, Christian Borgs, Alexandra Chouldechova, Sahin Geyik, Krishnaram Kenthapadi, Anna Rumshisky, and Adam Kalai. 2019. What's in a Name? Reducing Bias in Bios without Access to Protected Attributes. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers). 4187--4195."},{"key":"e_1_3_2_1_99_1","volume-title":"New York Times","author":"Roose Kevin","year":"2023","unstructured":"Kevin Roose. 2023. A Conversation With Bing's Chatbot Left Me Deeply Unsettled. New York Times (2023)."},{"key":"e_1_3_2_1_100_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2024.naacl-long.20"},{"key":"e_1_3_2_1_101_1","volume-title":"International Conference on Machine Learning. PMLR, 29971--30004","author":"Santurkar Shibani","year":"2023","unstructured":"Shibani Santurkar, Esin Durmus, Faisal Ladhak, Cinoo Lee, Percy Liang, and Tatsunori Hashimoto. 2023. Whose opinions do language models reflect?. In International Conference on Machine Learning. PMLR, 29971--30004."},{"key":"e_1_3_2_1_102_1","unstructured":"Toby Shevlane Sebastian Farquhar Ben Garfinkel Mary Phuong Jess Whittlestone Jade Leung Daniel Kokotajlo Nahema Marchal Markus Anderljung Noam Kolt et al. 2023. Model evaluation for extreme risks. 
arXiv preprint arXiv:2305.15324 (2023)."},{"key":"e_1_3_2_1_103_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2024.naacl-short.69"},{"key":"e_1_3_2_1_104_1","volume-title":"On Early Detection of Hallucinations in Factual Question Answering. arXiv preprint arXiv:2312.14183","author":"Snyder Ben","year":"2023","unstructured":"Ben Snyder, Marius Moisescu, and Muhammad Bilal Zafar. 2023. On Early Detection of Hallucinations in Factual Question Answering. arXiv preprint arXiv:2312.14183 (2023)."},{"key":"e_1_3_2_1_105_1","volume-title":"Lin (Eds.)","volume":"33","author":"Stiennon Nisan","year":"2020","unstructured":"Nisan Stiennon, Long Ouyang, Jeffrey Wu, Daniel Ziegler, Ryan Lowe, Chelsea Voss, Alec Radford, Dario Amodei, and Paul F Christiano. 2020. Learning to summarize with human feedback. In Advances in Neural Information Processing Systems, H. Larochelle, M. Ranzato, R. Hadsell, M.F. Balcan, and H. Lin (Eds.), Vol. 33. Curran Associates, Inc., 3008--3021. https:\/\/proceedings.neurips.cc\/paper_files\/paper\/2020\/file\/1f89885d556929e98d3ef9b86448f951-Paper.pdf"},{"key":"e_1_3_2_1_106_1","volume-title":"Evaluating and mitigating discrimination in language model decisions. arXiv preprint arXiv:2312.03689","author":"Tamkin Alex","year":"2023","unstructured":"Alex Tamkin, Amanda Askell, Liane Lovitt, Esin Durmus, Nicholas Joseph, Shauna Kravec, Karina Nguyen, Jared Kaplan, and Deep Ganguli. 2023. Evaluating and mitigating discrimination in language model decisions. arXiv preprint arXiv:2312.03689 (2023)."},{"key":"e_1_3_2_1_107_1","volume-title":"Benchmarking differentially private synthetic data generation algorithms. arXiv preprint arXiv:2112.09238","author":"Tao Yuchao","year":"2021","unstructured":"Yuchao Tao, Ryan McKenna, Michael Hay, Ashwin Machanavajjhala, and Gerome Miklau. 2021. Benchmarking differentially private synthetic data generation algorithms. 
arXiv preprint arXiv:2112.09238 (2021)."},{"key":"e_1_3_2_1_108_1","unstructured":"Yi Tay Vinh Q. Tran Mostafa Dehghani Jianmo Ni Dara Bahri Harsh Mehta Zhen Qin Kai Hui Zhe Zhao Jai Gupta Tal Schuster William W. Cohen and Donald Metzler. 2022. Transformer Memory as a Differentiable Search Index. In Advances in Neural Information Processing Systems Alice H. Oh Alekh Agarwal Danielle Belgrave and Kyunghyun Cho (Eds.). https:\/\/openreview.net\/forum?id=Vu-B0clPfq"},{"key":"e_1_3_2_1_109_1","volume-title":"Fine-Tuning Language Models for Factuality. In The Twelfth International Conference on Learning Representations. https:\/\/openreview.net\/forum?id=WPZ2yPag4K","author":"Tian Katherine","year":"2024","unstructured":"Katherine Tian, Eric Mitchell, Huaxiu Yao, Christopher D Manning, and Chelsea Finn. 2024. Fine-Tuning Language Models for Factuality. In The Twelfth International Conference on Learning Representations. https:\/\/openreview.net\/forum?id=WPZ2yPag4K"},{"key":"e_1_3_2_1_110_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1"},{"key":"e_1_3_2_1_111_1","volume-title":"Skating to Where the Puck is Going: Anticipating and Managing Risks from Frontier AI Systems. Report from the","author":"Toner Helen","year":"2023","unstructured":"Helen Toner, Jessica Ji, John Bansemer, Lucy Lim, Chris Painter, Courtney Corley, Jess Whittlestone, Matt Botvinick, Mikel Rodriguez, and Ram Shankar Siva Kumar. 2023. Skating to Where the Puck is Going: Anticipating and Managing Risks from Frontier AI Systems. Report from the July 2023 Roundtable hosted by the Center for Security and Emerging Technology (CSET) at Georgetown University and Google DeepMind."},{"key":"e_1_3_2_1_112_1","unstructured":"Lifu Tu Semih Yavuz Jin Qu Jiacheng Xu Rui Meng Caiming Xiong and Yingbo Zhou. 2024. Unlocking Anticipatory Text Generation: A Constrained Approach for Faithful Decoding with Large Language Models. 
https:\/\/openreview.net\/forum?id=774elYc5tw"},{"key":"e_1_3_2_1_113_1","volume-title":"Language models don't always say what they think: Unfaithful explanations in chain-of-thought prompting. Advances in Neural Information Processing Systems 36","author":"Turpin Miles","year":"2023","unstructured":"Miles Turpin, Julian Michael, Ethan Perez, and Samuel Bowman. 2023. Language models don't always say what they think: Unfaithful explanations in chain-of-thought prompting. Advances in Neural Information Processing Systems 36 (2023)."},{"key":"e_1_3_2_1_114_1","volume-title":"Proceedings of the 18th Conference of the European","author":"Vernikos Giorgos","year":"2024","unstructured":"Giorgos Vernikos, Arthur Brazinskas, Jakub Adamek, Jonathan Mallinson, Aliaksei Severyn, and Eric Malmi. 2024. Small Language Models Improve Giants by Rewriting Their Outputs. In Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics (Volume 1: Long Papers), Yvette Graham and Matthew Purver (Eds.). Association for Computational Linguistics, St. Julian's, Malta, 2703--2718. https:\/\/aclanthology.org\/2024.eacl-long.165"},{"key":"e_1_3_2_1_115_1","doi-asserted-by":"crossref","unstructured":"Tu Vu Mohit Iyyer Xuezhi Wang Noah Constant Jerry Wei Jason Wei Chris Tar Yun-Hsuan Sung Denny Zhou Quoc Le and Thang Luong. 2023. FreshLLMs: Refreshing Large Language Models with Search Engine Augmentation. arXiv:2310.03214","DOI":"10.18653\/v1\/2024.findings-acl.813"},{"key":"e_1_3_2_1_116_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2021.naacl-main.13"},{"key":"e_1_3_2_1_117_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2022.naacl-main.339"},{"key":"e_1_3_2_1_118_1","volume-title":"Md Rizwan Parvez, and Graham Neubig","author":"Wang Zhiruo","year":"2023","unstructured":"Zhiruo Wang, Jun Araki, Zhengbao Jiang, Md Rizwan Parvez, and Graham Neubig. 2023. Learning to Filter Context for Retrieval-Augmented Generation. 
arXiv:2311.08377"},{"key":"e_1_3_2_1_119_1","volume-title":"Thirty-seventh Conference on Neural Information Processing Systems. https:\/\/openreview.net\/forum?id=jA235JGM09","author":"Wei Alexander","year":"2023","unstructured":"Alexander Wei, Nika Haghtalab, and Jacob Steinhardt. 2023. Jailbroken: How Does LLM Safety Training Fail?. In Thirty-seventh Conference on Neural Information Processing Systems. https:\/\/openreview.net\/forum?id=jA235JGM09"},{"key":"e_1_3_2_1_120_1","first-page":"24824","article-title":"Chain-of-thought prompting elicits reasoning in large language models","volume":"35","author":"Wei Jason","year":"2022","unstructured":"Jason Wei, Xuezhi Wang, Dale Schuurmans, Maarten Bosma, Fei Xia, Ed Chi, Quoc V Le, and Denny Zhou. 2022. Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35 (2022), 24824--24837.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_2_1_121_1","volume-title":"Prompt injection attacks against GPT-3. Simon Willison's Weblog","author":"Willison Simon","year":"2022","unstructured":"Simon Willison. 2022. Prompt injection attacks against GPT-3. Simon Willison's Weblog (2022)."},{"key":"e_1_3_2_1_122_1","doi-asserted-by":"publisher","DOI":"10.1145\/3593013.3594072"},{"key":"e_1_3_2_1_123_1","unstructured":"Kevin Wu Eric Wu and James Zou. 2024. ClashEval: Quantifying the tug-of-war between an LLM's internal prior and external evidence. arXiv:2404.10198"},{"key":"e_1_3_2_1_124_1","volume-title":"Thirty-seventh Conference on Neural Information Processing Systems. https:\/\/openreview.net\/forum?id=uPSQv0leAu","author":"Xie Sang Michael","year":"2023","unstructured":"Sang Michael Xie, Shibani Santurkar, Tengyu Ma, and Percy Liang. 2023. Data Selection for Language Models via Importance Resampling. In Thirty-seventh Conference on Neural Information Processing Systems. 
https:\/\/openreview.net\/forum?id=uPSQv0leAu"},{"key":"e_1_3_2_1_125_1","volume-title":"The Twelfth International Conference on Learning Representations. https:\/\/openreview.net\/forum?id=xw5nxFWMlo","author":"Xu Peng","year":"2024","unstructured":"Peng Xu, Wei Ping, Xianchao Wu, Lawrence McAfee, Chen Zhu, Zihan Liu, Sandeep Subramanian, Evelina Bakhturina, Mohammad Shoeybi, and Bryan Catanzaro. 2024. Retrieval meets Long Context Large Language Models. In The Twelfth International Conference on Learning Representations. https:\/\/openreview.net\/forum?id=xw5nxFWMlo"},{"key":"e_1_3_2_1_126_1","volume-title":"FUDGE: Controlled Text Generation With Future Discriminators. In Proceedings of the 2021 Conference of the North American","author":"Yang Kevin","year":"2021","unstructured":"Kevin Yang and Dan Klein. 2021. FUDGE: Controlled Text Generation With Future Discriminators. In Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Kristina Toutanova, Anna Rumshisky, Luke Zettlemoyer, Dilek Hakkani-Tur, Iz Beltagy, Steven Bethard, Ryan Cotterell, Tanmoy Chakraborty, and Yichao Zhou (Eds.). Association for Computational Linguistics, Online, 3511--3535. https:\/\/aclanthology.org\/2021.naacl-main.276"},{"key":"e_1_3_2_1_127_1","volume-title":"A survey on large language model (LLM) security and privacy: The good, the bad, and the ugly. High-Confidence Computing","author":"Yao Yifan","year":"2024","unstructured":"Yifan Yao, Jinhao Duan, Kaidi Xu, Yuanfang Cai, Zhibo Sun, and Yue Zhang. 2024. A survey on large language model (LLM) security and privacy: The good, the bad, and the ugly. High-Confidence Computing (2024), 100211."},{"key":"e_1_3_2_1_128_1","volume-title":"Woodpecker: Hallucination correction for multimodal large language models. 
arXiv preprint arXiv:2310.16045","author":"Yin Shukang","year":"2023","unstructured":"Shukang Yin, Chaoyou Fu, Sirui Zhao, Tong Xu, Hao Wang, Dianbo Sui, Yunhang Shen, Ke Li, Xing Sun, and Enhong Chen. 2023. Woodpecker: Hallucination correction for multimodal large language models. arXiv preprint arXiv:2310.16045 (2023)."},{"key":"e_1_3_2_1_129_1","volume-title":"The Eleventh International Conference on Learning Representations. https:\/\/openreview.net\/forum?id=fB0hRu9GZUS","author":"Yu Wenhao","year":"2023","unstructured":"Wenhao Yu, Dan Iter, Shuohang Wang, Yichong Xu, Mingxuan Ju, Soumya Sanyal, Chenguang Zhu, Michael Zeng, and Meng Jiang. 2023. Generate rather than Retrieve: Large Language Models are Strong Context Generators. In The Eleventh International Conference on Learning Representations. https:\/\/openreview.net\/forum?id=fB0hRu9GZUS"},{"key":"e_1_3_2_1_130_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2024.naacl-long.394"},{"key":"e_1_3_2_1_131_1","doi-asserted-by":"publisher","DOI":"10.1145\/3658673"},{"key":"e_1_3_2_1_132_1","unstructured":"Penghao Zhao Hailin Zhang Qinhan Yu Zhengren Wang Yunteng Geng Fangcheng Fu Ling Yang Wentao Zhang Jie Jiang and Bin Cui. 2024. Retrieval-Augmented Generation for AI-Generated Content: A Survey. arXiv:2402.19473"},{"key":"e_1_3_2_1_133_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1"},{"key":"e_1_3_2_1_134_1","volume-title":"LIMA: Less Is More for Alignment. In Thirty-seventh Conference on Neural Information Processing Systems. https:\/\/openreview.net\/forum?id=KBMOKmX2he","author":"Zhou Chunting","year":"2023","unstructured":"Chunting Zhou, Pengfei Liu, Puxin Xu, Srini Iyer, Jiao Sun, Yuning Mao, Xuezhe Ma, Avia Efrat, Ping Yu, Lili Yu, Susan Zhang, Gargi Ghosh, Mike Lewis, Luke Zettlemoyer, and Omer Levy. 2023. LIMA: Less Is More for Alignment. In Thirty-seventh Conference on Neural Information Processing Systems. 
https:\/\/openreview.net\/forum?id=KBMOKmX2he"},{"key":"e_1_3_2_1_135_1","volume-title":"PromptBench: A unified library for evaluation of large language models. arXiv preprint arXiv:2312.07910","author":"Zhu Kaijie","year":"2023","unstructured":"Kaijie Zhu, Qinlin Zhao, Hao Chen, Jindong Wang, and Xing Xie. 2023. PromptBench: A unified library for evaluation of large language models. arXiv preprint arXiv:2312.07910 (2023)."},{"key":"e_1_3_2_1_136_1","volume-title":"Large language models for information retrieval: A survey. arXiv preprint arXiv:2308.07107","author":"Zhu Yutao","year":"2023","unstructured":"Yutao Zhu, Huaying Yuan, Shuting Wang, Jiongnan Liu, Wenhan Liu, Chenlong Deng, Zhicheng Dou, and Ji-Rong Wen. 2023. Large language models for information retrieval: A survey. arXiv preprint arXiv:2308.07107 (2023)."},{"key":"e_1_3_2_1_137_1","volume-title":"Universal and transferable adversarial attacks on aligned language models. arXiv preprint arXiv:2307.15043","author":"Zou Andy","year":"2023","unstructured":"Andy Zou, Zifan Wang, J Zico Kolter, and Matt Fredrikson. 2023. Universal and transferable adversarial attacks on aligned language models. 
arXiv preprint arXiv:2307.15043 (2023)."}],"event":{"name":"KDD '24: The 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining","location":"Barcelona Spain","acronym":"KDD '24","sponsor":["SIGMOD ACM Special Interest Group on Management of Data","SIGKDD ACM Special Interest Group on Knowledge Discovery in Data"]},"container-title":["Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3637528.3671467","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3637528.3671467","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,19]],"date-time":"2025-06-19T00:03:26Z","timestamp":1750291406000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3637528.3671467"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,8,24]]},"references-count":137,"alternative-id":["10.1145\/3637528.3671467","10.1145\/3637528"],"URL":"https:\/\/doi.org\/10.1145\/3637528.3671467","relation":{},"subject":[],"published":{"date-parts":[[2024,8,24]]},"assertion":[{"value":"2024-08-24","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}