{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,6,5]],"date-time":"2026-06-05T04:51:22Z","timestamp":1780635082635,"version":"3.54.1"},"reference-count":58,"publisher":"Association for Computing Machinery (ACM)","issue":"CoNEXT4","license":[{"start":{"date-parts":[[2024,11,25]],"date-time":"2024-11-25T00:00:00Z","timestamp":1732492800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["Proc. ACM Netw."],"published-print":{"date-parts":[[2024,12]]},"abstract":"<jats:p>We explore the capabilities of Large Language Models (LLMs) to assist or substitute devices (i.e., firewalls) and humans (i.e., security experts) respectively in the detection and analysis of security incidents. We leverage transformer-based technologies, from relatively small to foundational sizes, to address the problem of correctly identifying the attack severity (and accessorily identifying and explaining the attack type). We contrast a broad range of LLM techniques (prompting, retrieval augmented generation, and fine-tuning of several models) using state-of-the-art machine learning models as a baseline. 
Using proprietary data from commercial deployment, our study provides an unbiased picture of the strengths and weaknesses of LLM for intrusion detection.<\/jats:p>","DOI":"10.1145\/3696379","type":"journal-article","created":{"date-parts":[[2024,11,25]],"date-time":"2024-11-25T11:15:47Z","timestamp":1732533347000},"page":"1-23","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":16,"title":["A Systematic Comparison of Large Language Models Performance for Intrusion Detection"],"prefix":"10.1145","volume":"2","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-2810-6056","authenticated-orcid":false,"given":"Minh-Thanh","family":"Bui","sequence":"first","affiliation":[{"name":"Huawei Technologies Co. Ltd., Boulogne-Billancourt, France"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-3144-9065","authenticated-orcid":false,"given":"Matteo","family":"Boffa","sequence":"additional","affiliation":[{"name":"Politecnico di Torino &amp; Huawei Technologies Co. Ltd., Turin, Italy"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-7702-2991","authenticated-orcid":false,"given":"Rodolfo Vieira","family":"Valentim","sequence":"additional","affiliation":[{"name":"Universit\u00e0 di Torino &amp; Huawei Technologies Co. Ltd., Turin, Italy"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-3408-7143","authenticated-orcid":false,"given":"Jose Manuel","family":"Navarro","sequence":"additional","affiliation":[{"name":"Huawei Technologies Co. Ltd., Boulogne-Billancourt, France"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-7240-2699","authenticated-orcid":false,"given":"Fuxing","family":"Chen","sequence":"additional","affiliation":[{"name":"Huawei Technologies Co. 
Ltd., Boulogne-Billancourt, France"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0009-0006-2537-1205","authenticated-orcid":false,"given":"Xiaosheng","family":"Bao","sequence":"additional","affiliation":[{"name":"Huawei Technologies Co. Ltd., Boulogne-Billancourt, France"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-7258-0919","authenticated-orcid":false,"given":"Zied Ben","family":"Houidi","sequence":"additional","affiliation":[{"name":"Huawei Technologies Co. Ltd., Boulogne-Billancourt, France"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-3936-8876","authenticated-orcid":false,"given":"Dario","family":"Rossi","sequence":"additional","affiliation":[{"name":"Huawei Technologies Co. Ltd., Boulogne-Billancourt, France"}],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"320","published-online":{"date-parts":[[2024,11,25]]},"reference":[{"key":"e_1_2_1_1_1","unstructured":"[n. d.]. Captum: Model Interpretability for Pytorch. https:\/\/captum.ai\/"},{"key":"e_1_2_1_2_1","unstructured":"[n. d.]. Chroma the AI-native open-source vector database. https:\/\/python.langchain.com\/docs\/integrations\/vectorstores\/chroma\/"},{"key":"e_1_2_1_3_1","unstructured":"[n. d.]. Microsoft Security Copilot Resources. https:\/\/microsoft.github.io\/PartnerResources\/skilling\/microsoft-securityacademy\/microsoft-security-copilot Accessed: 2023--12--13."},{"key":"e_1_2_1_4_1","unstructured":"[n. d.]. Transformers Interpret. https:\/\/github.com\/cdpierse\/transformers-interpret"},{"key":"e_1_2_1_5_1","unstructured":"2024. all-MiniLM-L6-v2 embedding model. https:\/\/huggingface.co\/sentence-transformers\/all-MiniLM-L6-v2"},{"key":"e_1_2_1_6_1","unstructured":"2024. all-mpnet-base-v2 embedding model. 
https:\/\/huggingface.co\/sentence-transformers\/all-mpnet-base-v2"},{"key":"e_1_2_1_7_1","volume-title":"SecureBERT: A Domain-Specific Language Model for Cybersecurity. In International Conference on Security and Privacy in Communication Systems. Springer, 39--56","author":"Aghaei Ehsan","year":"2022","unstructured":"Ehsan Aghaei, Xi Niu, Waseem Shadid, and Ehab Al-Shaer. 2022. SecureBERT: A Domain-Specific Language Model for Cybersecurity. In International Conference on Security and Privacy in Communication Systems. Springer, 39--56."},{"key":"e_1_2_1_8_1","volume-title":"31st USENIX Security Symposium (USENIX Security 22)","author":"Alahmadi Bushra A","year":"2022","unstructured":"Bushra A Alahmadi, Louise Axon, and Ivan Martinovic. 2022. 99% false positives: A qualitative study of SOC analysts' perspectives on security alarms. In 31st USENIX Security Symposium (USENIX Security 22). 2783--2800."},{"key":"e_1_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1109\/AICCSA56895.2022.10017800"},{"key":"e_1_2_1_10_1","volume-title":"31st USENIX Security Symposium. 3971--3988","author":"Arp Daniel","year":"2022","unstructured":"Daniel Arp, Erwin Quiring, Feargus Pendlebury, Alexander Warnecke, Fabio Pierazzi, Christian Wressnegger, Lorenzo Cavallaro, and Konrad Rieck. 2022. Dos and don'ts of machine learning in computer security. In 31st USENIX Security Symposium. 3971--3988."},{"key":"e_1_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1109\/CSNet52717.2021.9614644"},{"key":"e_1_2_1_12_1","volume-title":"RatGPT: Turning online LLMs into Proxies for Malware Attacks. arXiv preprint arXiv:2308.09183","author":"Beckerich Mika","year":"2023","unstructured":"Mika Beckerich, Laura Plein, and Sergio Coronado. 2023. RatGPT: Turning online LLMs into Proxies for Malware Attacks. 
arXiv preprint arXiv:2308.09183 (2023)."},{"key":"e_1_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.1109\/EuroSPW55150.2022.00038"},{"key":"e_1_2_1_14_1","volume-title":"Luca Vassio, Danilo Giordano, Idilio Drago, Marco Mellia, and Zied Ben Houidi.","author":"Boffa Matteo","year":"2023","unstructured":"Matteo Boffa, Rodolfo Vieira Valentim, Luca Vassio, Danilo Giordano, Idilio Drago, Marco Mellia, and Zied Ben Houidi. 2023. LogPr\u00e9cis: Unleashing Language Models for Automated Shell Log Analysis. arXiv preprint arXiv:2307.08309 (2023)."},{"key":"e_1_2_1_15_1","volume-title":"Lin (Eds.)","volume":"33","author":"Brown Tom","year":"2020","unstructured":"Tom Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared D Kaplan, Prafulla Dhariwal, Arvind Neelakantan, Pranav Shyam, Girish Sastry, Amanda Askell, Sandhini Agarwal, Ariel Herbert-Voss, Gretchen Krueger, Tom Henighan, Rewon Child, Aditya Ramesh, Daniel Ziegler, Jeffrey Wu, Clemens Winter, Chris Hesse, Mark Chen, Eric Sigler, Mateusz Litwin, Scott Gray, Benjamin Chess, Jack Clark, Christopher Berner, Sam McCandlish, Alec Radford, Ilya Sutskever, and Dario Amodei. 2020. Language Models are Few-Shot Learners. In Advances in Neural Information Processing Systems, H. Larochelle, M. Ranzato, R. Hadsell, M.F. Balcan, and H. Lin (Eds.), Vol. 33. Curran Associates, Inc., 1877--1901. https:\/\/proceedings.neurips.cc\/paper_files\/paper\/2020\/file\/1457c0d6bfcb4967418bfb8ac142f64a-Paper.pdf"},{"key":"e_1_2_1_16_1","volume-title":"Yuanzhi Li, Scott Lundberg, et al.","author":"Bubeck S\u00e9bastien","year":"2023","unstructured":"S\u00e9bastien Bubeck, Varun Chandrasekaran, Ronen Eldan, Johannes Gehrke, Eric Horvitz, Ece Kamar, Peter Lee, Yin Tat Lee, Yuanzhi Li, Scott Lundberg, et al. 2023. Sparks of artificial general intelligence: Early experiments with gpt-4. 
arXiv preprint arXiv:2303.12712 (2023)."},{"key":"e_1_2_1_17_1","unstructured":"Nicholas Carlini Florian Tramer Eric Wallace Matthew Jagielski Ariel Herbert-Voss Katherine Lee Adam Roberts Tom Brown Dawn Song Ulfar Erlingsson et al. 2020. Colin Raffel. Extracting training data from large language models. arXiv preprint arXiv:2012.07805 (2020)."},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1080\/08839514.2022.2145642"},{"key":"e_1_2_1_19_1","volume-title":"Qlora: Efficient finetuning of quantized llms. Advances in Neural Information Processing Systems 36","author":"Dettmers Tim","year":"2024","unstructured":"Tim Dettmers, Artidoro Pagnoni, Ari Holtzman, and Luke Zettlemoyer. 2024. Qlora: Efficient finetuning of quantized llms. Advances in Neural Information Processing Systems 36 (2024)."},{"key":"e_1_2_1_20_1","volume-title":"Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805","author":"Devlin Jacob","year":"2018","unstructured":"Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2018. Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018)."},{"key":"e_1_2_1_21_1","volume-title":"LLM Agents can Autonomously Hack Websites. arXiv preprint arXiv:2402.06664","author":"Fang Richard","year":"2024","unstructured":"Richard Fang, Rohan Bindu, Akul Gupta, Qiusi Zhan, and Daniel Kang. 2024. LLM Agents can Autonomously Hack Websites. arXiv preprint arXiv:2402.06664 (2024)."},{"key":"e_1_2_1_22_1","volume-title":"SecureFalcon: The Next Cyber Reasoning System for Cyber Security. arXiv preprint arXiv:2307.06616","author":"Ferrag Mohamed Amine","year":"2023","unstructured":"Mohamed Amine Ferrag, Ammar Battah, Norbert Tihanyi, Merouane Debbah, Thierry Lestable, and Lucas C Cordeiro. 2023. SecureFalcon: The Next Cyber Reasoning System for Cyber Security. 
arXiv preprint arXiv:2307.06616 (2023)."},{"key":"e_1_2_1_23_1","volume-title":"Revolutionizing Cyber Threat Detection with Large Language Models. arXiv preprint arXiv:2306.14263","author":"Ferrag Mohamed Amine","year":"2023","unstructured":"Mohamed Amine Ferrag, Mthandazo Ndhlovu, Norbert Tihanyi, Lucas C Cordeiro, Merouane Debbah, and Thierry Lestable. 2023. Revolutionizing Cyber Threat Detection with Large Language Models. arXiv preprint arXiv:2306.14263 (2023)."},{"key":"e_1_2_1_24_1","volume-title":"Retrieval-augmented generation for large language models: A survey. arXiv preprint arXiv:2312.10997","author":"Gao Yunfan","year":"2023","unstructured":"Yunfan Gao, Yun Xiong, Xinyu Gao, Kangxiang Jia, Jinliu Pan, Yuxi Bi, Yi Dai, Jiawei Sun, and Haofen Wang. 2023. Retrieval-augmented generation for large language models: A survey. arXiv preprint arXiv:2312.10997 (2023)."},{"key":"e_1_2_1_25_1","volume-title":"Retrieval-augmented generation for large language models: A survey. arXiv:2312.10997","author":"Gao Yunfan","year":"2023","unstructured":"Yunfan Gao, Yun Xiong, Xinyu Gao, Kangxiang Jia, Jinliu Pan, Yuxi Bi, Yi Dai, Jiawei Sun, and Haofen Wang. 2023. Retrieval-augmented generation for large language models: A survey. arXiv:2312.10997 (2023). https:\/\/arxiv.org\/abs\/ 2312.10997"},{"key":"e_1_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.1145\/3576915.3623202"},{"key":"e_1_2_1_27_1","volume-title":"Zied Ben Houidi, and Dario Rossi","author":"Gioacchini Luca","year":"2021","unstructured":"Luca Gioacchini, Luca Vassio, Marco Mellia, Idilio Drago, Zied Ben Houidi, and Dario Rossi. 2021. DarkVec: Automatic Analysis of Darknet Traffic with Word Embeddings. In ACM CoNEXT."},{"key":"e_1_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.1145\/3605764.3623985"},{"key":"e_1_2_1_29_1","doi-asserted-by":"publisher","unstructured":"Maarten Grootendorst. 2020. KeyBERT: Minimal keyword extraction with BERT. https:\/\/doi.org\/10.5281\/zenodo. 
4461265","DOI":"10.5281\/zenodo"},{"key":"e_1_2_1_30_1","volume-title":"Unixcoder: Unified cross-modal pre-training for code representation. arXiv preprint arXiv:2203.03850","author":"Guo Daya","year":"2022","unstructured":"Daya Guo, Shuai Lu, Nan Duan, Yanlin Wang, Ming Zhou, and Jian Yin. 2022. Unixcoder: Unified cross-modal pre-training for code representation. arXiv preprint arXiv:2203.03850 (2022)."},{"key":"e_1_2_1_31_1","volume-title":"netFound: Foundation Model for Network Security. arXiv preprint arXiv:2310.17025","author":"Guthula Satyandra","year":"2023","unstructured":"Satyandra Guthula, Navya Battula, Roman Beltiukov, Wenbo Guo, and Arpit Gupta. 2023. netFound: Foundation Model for Network Security. arXiv preprint arXiv:2310.17025 (2023)."},{"key":"e_1_2_1_32_1","volume-title":"Spear Phishing With Large Language Models. arXiv preprint arXiv:2305.06972","author":"Hazell Julian","year":"2023","unstructured":"Julian Hazell. 2023. Spear Phishing With Large Language Models. arXiv preprint arXiv:2305.06972 (2023)."},{"key":"e_1_2_1_33_1","volume-title":"Abdul Rehman Javed, Zunera Jalil, Xuan Liu, and Waleed S Alnumay.","author":"Imtiaz Syed Ibrahim","year":"2021","unstructured":"Syed Ibrahim Imtiaz, Saif ur Rehman, Abdul Rehman Javed, Zunera Jalil, Xuan Liu, and Waleed S Alnumay. 2021. DeepAMD: Detection and identification of Android malware using high-efficient Deep Artificial Neural Network. Future Generation computer systems 115 (2021), 844--856."},{"key":"e_1_2_1_34_1","unstructured":"Albert Q. Jiang Alexandre Sablayrolles Arthur Mensch Chris Bamford Devendra Singh Chaplot Diego de las Casas Florian Bressand Gianna Lengyel Guillaume Lample Lucile Saulnier L\u00e9lio Renard Lavaud Marie-Anne Lachaux Pierre Stock Teven Le Scao Thibaut Lavril Thomas Wang Timoth\u00e9e Lacroix and William El Sayed. 2023. Mistral 7B. arXiv:2310.06825 [cs.CL]"},{"key":"e_1_2_1_35_1","volume-title":"ChatIDS: Explainable Cybersecurity Using Generative AI. 
arXiv preprint arXiv:2306.14504","author":"J\u00fcttner Victor","year":"2023","unstructured":"Victor J\u00fcttner, Martin Grimmer, and Erik Buchmann. 2023. ChatIDS: Explainable Cybersecurity Using Generative AI. arXiv preprint arXiv:2306.14504 (2023)."},{"key":"e_1_2_1_36_1","volume-title":"Resilient and adaptive framework for large scale android malware fingerprinting using deep learning and NLP techniques. arXiv preprint arXiv:2105.13491","author":"Billah Karbab ElMouatez","year":"2021","unstructured":"ElMouatez Billah Karbab and Mourad Debbabi. 2021. Resilient and adaptive framework for large scale android malware fingerprinting using deep learning and NLP techniques. arXiv preprint arXiv:2105.13491 (2021)."},{"key":"e_1_2_1_37_1","volume-title":"Giovanni Vacanti, and Alexandru Coca. [n. d.]. Alibi: Algorithms for monitoring and explaining machine learning models","author":"Klaise Janis","year":"2019","unstructured":"Janis Klaise, Arnaud Van Looveren, Giovanni Vacanti, and Alexandru Coca. [n. d.]. Alibi: Algorithms for monitoring and explaining machine learning models, 2019. URL https:\/\/github.com\/SeldonIO\/alibi ([n. d.])."},{"key":"e_1_2_1_38_1","volume-title":"Captum: A unified and generic model interpretability library for pytorch. arXiv preprint arXiv:2009.07896","author":"Kokhlikyan Narine","year":"2020","unstructured":"Narine Kokhlikyan, Vivek Miglani, Miguel Martin, Edward Wang, Bilal Alsallakh, Jonathan Reynolds, Alexander Melnikov, Natalia Kliushkina, Carlos Araya, Siqi Yan, et al. 2020. Captum: A unified and generic model interpretability library for pytorch. arXiv preprint arXiv:2009.07896 (2020)."},{"key":"e_1_2_1_39_1","volume-title":"SecureBERT and LLAMA 2 Empowered Control Area Network Intrusion Detection and Classification. arXiv preprint arXiv:2311.12074","author":"Li Xuemei","year":"2023","unstructured":"Xuemei Li and Huirong Fu. 2023. SecureBERT and LLAMA 2 Empowered Control Area Network Intrusion Detection and Classification. 
arXiv preprint arXiv:2311.12074 (2023)."},{"key":"e_1_2_1_40_1","doi-asserted-by":"publisher","DOI":"10.1145\/3576915.3623120"},{"key":"e_1_2_1_41_1","volume-title":"Decoupled weight decay regularization. arXiv preprint arXiv:1711.05101","author":"Loshchilov Ilya","year":"2017","unstructured":"Ilya Loshchilov and Frank Hutter. 2017. Decoupled weight decay regularization. arXiv preprint arXiv:1711.05101 (2017)."},{"key":"e_1_2_1_42_1","volume-title":"Efficient estimation of word representations in vector space. arXiv:1301.3781","author":"Mikolov Tomas","year":"2013","unstructured":"Tomas Mikolov, Kai Chen, Greg Corrado, and Jeffrey Dean. 2013. Efficient estimation of word representations in vector space. arXiv:1301.3781 (2013)."},{"key":"e_1_2_1_43_1","doi-asserted-by":"publisher","DOI":"10.1145\/3576915.3616652"},{"key":"e_1_2_1_44_1","volume-title":"Scalable extraction of training data from (production) language models. arXiv preprint arXiv:2311.17035","author":"Nasr Milad","year":"2023","unstructured":"Milad Nasr, Nicholas Carlini, Jonathan Hayase, Matthew Jagielski, A Feder Cooper, Daphne Ippolito, Christopher A Choquette-Choo, Eric Wallace, Florian Tram\u00e8r, and Katherine Lee. 2023. Scalable extraction of training data from (production) language models. arXiv preprint arXiv:2311.17035 (2023)."},{"key":"e_1_2_1_45_1","volume-title":"Interpretml: A unified framework for machine learning interpretability. arXiv preprint arXiv:1909.09223","author":"Nori Harsha","year":"2019","unstructured":"Harsha Nori, Samuel Jenkins, Paul Koch, and Rich Caruana. 2019. Interpretml: A unified framework for machine learning interpretability. arXiv preprint arXiv:1909.09223 (2019)."},{"key":"e_1_2_1_46_1","unstructured":"OpenAI. 2023. GPT-4 Technical Report. arXiv:2303.08774 [cs.CL]"},{"key":"e_1_2_1_47_1","doi-asserted-by":"publisher","DOI":"10.1145\/3607505.3607513"},{"key":"e_1_2_1_48_1","volume-title":"International conference on machine learning. 
PMLR, 8748--8763","author":"Radford Alec","year":"2021","unstructured":"Alec Radford, Jong Wook Kim, Chris Hallacy, Aditya Ramesh, Gabriel Goh, Sandhini Agarwal, Girish Sastry, Amanda Askell, Pamela Mishkin, Jack Clark, et al. 2021. Learning transferable visual models from natural language supervision. In International conference on machine learning. PMLR, 8748--8763."},{"key":"e_1_2_1_49_1","unstructured":"Alec Radford Jeffrey Wu Rewon Child David Luan Dario Amodei Ilya Sutskever et al. 2019. Language models are unsupervised multitask learners. OpenAI blog 1 8 (2019) 9."},{"key":"e_1_2_1_50_1","doi-asserted-by":"publisher","DOI":"10.1109\/SMC52423.2021.9659287"},{"key":"e_1_2_1_51_1","doi-asserted-by":"publisher","DOI":"10.3390\/bdcc7020060"},{"key":"e_1_2_1_52_1","doi-asserted-by":"publisher","DOI":"10.1108\/eb026526"},{"key":"e_1_2_1_53_1","volume-title":"Axiomatic Attribution for Deep Networks. CoRR abs\/1703.01365","author":"Sundararajan Mukund","year":"2017","unstructured":"Mukund Sundararajan, Ankur Taly, and Qiqi Yan. 2017. Axiomatic Attribution for Deep Networks. CoRR abs\/1703.01365 (2017). arXiv:1703.01365 http:\/\/arxiv.org\/abs\/1703.01365"},{"key":"e_1_2_1_54_1","doi-asserted-by":"crossref","unstructured":"Yu Tian and Zhenyu Li. 2024. Dom-BERT: Detecting Malicious Domains with Pre-training Model. In Passive And Active Measurements.","DOI":"10.1007\/978-3-031-56249-5_6"},{"key":"e_1_2_1_55_1","unstructured":"Hugo Touvron Louis Martin Kevin Stone Peter Albert Amjad Almahairi Yasmine Babaei Nikolay Bashlykov Soumya Batra Prajjwal Bhargava Shruti Bhosale et al. 2023. Llama 2: Open foundation and fine-tuned chat models. arXiv preprint arXiv:2307.09288 (2023)."},{"key":"e_1_2_1_56_1","doi-asserted-by":"publisher","DOI":"10.1038\/s42256-022-00568-3"},{"key":"e_1_2_1_57_1","volume-title":"Large language models as optimizers. https:\/\/arxiv.org\/pdf\/2309.03409.pdf. 
arXiv:2309.03409","author":"Yang Chengrun","year":"2023","unstructured":"Chengrun Yang, Xuezhi Wang, Yifeng Lu, Hanxiao Liu, Quoc V Le, Denny Zhou, and Xinyun Chen. 2023. Large language models as optimizers. https:\/\/arxiv.org\/pdf\/2309.03409.pdf. arXiv:2309.03409 (2023)."},{"key":"e_1_2_1_58_1","volume-title":"Joshua Ainslie, Chris Alberti, Santiago Ontanon, Philip Pham, Anirudh Ravula, Qifan Wang, Li Yang, et al.","author":"Zaheer Manzil","year":"2020","unstructured":"Manzil Zaheer, Guru Guruganesh, Kumar Avinava Dubey, Joshua Ainslie, Chris Alberti, Santiago Ontanon, Philip Pham, Anirudh Ravula, Qifan Wang, Li Yang, et al. 2020. Big bird: Transformers for longer sequences. Advances in neural information processing systems 33 (2020), 17283--17297."}],"container-title":["Proceedings of the ACM on Networking"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3696379","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3696379","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,8,23]],"date-time":"2025-08-23T01:24:35Z","timestamp":1755912275000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3696379"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,11,25]]},"references-count":58,"journal-issue":{"issue":"CoNEXT4","published-print":{"date-parts":[[2024,12]]}},"alternative-id":["10.1145\/3696379"],"URL":"https:\/\/doi.org\/10.1145\/3696379","relation":{},"ISSN":["2834-5509"],"issn-type":[{"value":"2834-5509","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,11,25]]},"assertion":[{"value":"2024-11-25","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}