{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,25]],"date-time":"2026-02-25T17:51:29Z","timestamp":1772041889393,"version":"3.50.1"},"reference-count":56,"publisher":"Association for Computing Machinery (ACM)","issue":"OOPSLA2","content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["Proc. ACM Program. Lang."],"published-print":{"date-parts":[[2025,10,9]]},"abstract":"<jats:p>Generative AI has shown its value for many software engineering tasks. Still in its infancy, large language model (LLM)-based proof generation lags behind LLM-based code generation. In this paper, we present AutoVerus. AutoVerus uses LLMs to automatically generate correctness proof for Rust code. AutoVerus is designed to match the unique features of Verus, a verification tool that can prove the correctness of Rust code using proofs and specifications also written in Rust. AutoVerus consists of a network of agents that are crafted and orchestrated to mimic human experts' three phases of proof construction: preliminary proof generation, proof refinement guided by generic tips, and proof debugging guided by verification errors. To thoroughly evaluate AutoVerus and help foster future research in this direction, we have built a benchmark suite of 150 non-trivial proof tasks, based on existing code-generation benchmarks and verification benchmarks. Our evaluation shows that AutoVerus can automatically generate correct proof for more than 90% of them, with more than half of them tackled in less than 30 seconds or 3 LLM calls.<\/jats:p>","DOI":"10.1145\/3763174","type":"journal-article","created":{"date-parts":[[2025,10,9]],"date-time":"2025-10-09T08:49:50Z","timestamp":1759999790000},"page":"3454-3482","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":3,"title":["AutoVerus: Automated Proof Generation for Rust Code"],"prefix":"10.1145","volume":"9","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-7976-5086","authenticated-orcid":false,"given":"Chenyuan","family":"Yang","sequence":"first","affiliation":[{"name":"University of Illinois at Urbana-Champaign, Champaign0009-0000-1371-2179, USA"}]},{"ORCID":"https:\/\/orcid.org\/0009-0000-1371-2179","authenticated-orcid":false,"given":"Xuheng","family":"Li","sequence":"additional","affiliation":[{"name":"Columbia University, New York, USA"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-7931-6782","authenticated-orcid":false,"given":"Md Rakib Hossain","family":"Misu","sequence":"additional","affiliation":[{"name":"University of California at Irvine, Irvine, USA"}]},{"ORCID":"https:\/\/orcid.org\/0009-0008-4675-8980","authenticated-orcid":false,"given":"Jianan","family":"Yao","sequence":"additional","affiliation":[{"name":"University of Toronto, Toronto, Canada"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-2871-9485","authenticated-orcid":false,"given":"Weidong","family":"Cui","sequence":"additional","affiliation":[{"name":"Microsoft Research, Redmond, USA"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-9954-9674","authenticated-orcid":false,"given":"Yeyun","family":"Gong","sequence":"additional","affiliation":[{"name":"Microsoft Research Asia, Beijing, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-5676-0362","authenticated-orcid":false,"given":"Chris","family":"Hawblitzel","sequence":"additional","affiliation":[{"name":"Microsoft Research, Redmond, USA"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-4446-4777","authenticated-orcid":false,"given":"Shuvendu","family":"Lahiri","sequence":"additional","affiliation":[{"name":"Microsoft Research, Redmond, USA"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-7269-2769","authenticated-orcid":false,"given":"Jacob R.","family":"Lorch","sequence":"additional","affiliation":[{"name":"Microsoft Research, Redmond, USA"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-7466-2064","authenticated-orcid":false,"given":"Shuai","family":"Lu","sequence":"additional","affiliation":[{"name":"Microsoft Research Asia, Beijing, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-0378-060X","authenticated-orcid":false,"given":"Fan","family":"Yang","sequence":"additional","affiliation":[{"name":"Microsoft Research Asia, Beijing, China"}]},{"ORCID":"https:\/\/orcid.org\/0009-0007-2762-0989","authenticated-orcid":false,"given":"Ziqiao","family":"Zhou","sequence":"additional","affiliation":[{"name":"Microsoft Research, Redmond, USA"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-0757-4600","authenticated-orcid":false,"given":"Shan","family":"Lu","sequence":"additional","affiliation":[{"name":"Microsoft Research, Redmond, USA"},{"name":"University of Chicago, Chicago, USA"}]}],"member":"320","published-online":{"date-parts":[[2025,10,9]]},"reference":[{"key":"e_1_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.1145\/3360573"},{"key":"e_1_2_1_2_1","volume-title":"Program Synthesis with Large Language Models. CoRR, abs\/2108.07732","author":"Austin Jacob","year":"2021","unstructured":"Jacob Austin, Augustus Odena, Maxwell I. Nye, Maarten Bosma, Henryk Michalewski, David Dohan, Ellen Jiang, Carrie J. Cai, Michael Terry, Quoc V. Le, and Charles Sutton. 2021. Program Synthesis with Large Language Models. CoRR, abs\/2108.07732 (2021), arxiv:2108.07732"},{"key":"e_1_2_1_3_1","volume-title":"Proceedings of the 36th International Conference on Machine Learning, ICML 2019","volume":"463","author":"Bansal Kshitij","year":"2019","unstructured":"Kshitij Bansal, Sarah M. Loos, Markus N. Rabe, Christian Szegedy, and Stewart Wilcox. 2019. HOList: An Environment for Machine Learning of Higher Order Logic Theorem Proving. In Proceedings of the 36th International Conference on Machine Learning, ICML 2019, 9-15 June 2019, Long Beach, California, USA (Proceedings of Machine Learning Research, Vol. 97). PMLR, 454\u2013463. http:\/\/proceedings.mlr.press\/v97\/bansal19a.html"},{"key":"e_1_2_1_4_1","unstructured":"Bruno Barras Samuel Boutin Cristina Cornes Judica\u00ebl Courant Yann Coscoy David Delahaye Daniel de Rauglaudre Jean-Christophe Filli\u00e2tre Eduardo Gim\u00e9nez Hugo Herbelin et al. 1999. The Coq proof assistant reference manual. INRIA version 6 11 (1999) https:\/\/hal.science\/inria-00069919"},{"key":"e_1_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.48550\/ARXIV.2405.01787"},{"key":"e_1_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-81688-9_42"},{"key":"e_1_2_1_7_1","volume-title":"Jared Kaplan, Harri Edwards, Yuri Burda, Nicholas Joseph, Greg Brockman, et al.","author":"Chen Mark","year":"2021","unstructured":"Mark Chen, Jerry Tworek, Heewoo Jun, Qiming Yuan, Henrique Ponde De Oliveira Pinto, Jared Kaplan, Harri Edwards, Yuri Burda, Nicholas Joseph, Greg Brockman, et al. 2021. Evaluating large language models trained on code. CoRR, abs\/2107.03374 (2021), arxiv:2107.03374"},{"key":"e_1_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1145\/3625275.3625401"},{"key":"e_1_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-79876-5_37"},{"key":"e_1_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1145\/3597926.3598067"},{"key":"e_1_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-031-17244-1_6"},{"key":"e_1_2_1_12_1","volume-title":"10th Competition on Software Verification (SV-COMP","author":"ETAPS.","year":"2021","unstructured":"ETAPS. 2024. 10th Competition on Software Verification (SV-COMP 2021). https:\/\/sv-comp.sosy-lab.org\/2021\/ [Online], [Accessed: 2024-09-01]"},{"key":"e_1_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.48550\/ARXIV.2404.10362"},{"key":"e_1_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.1145\/3510003.3510138"},{"key":"e_1_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1145\/3428299"},{"key":"e_1_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.1145\/3611643.3616243"},{"key":"e_1_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1007\/3-540-45251-6_29"},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1145\/3656422"},{"key":"e_1_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1145\/3547647"},{"key":"e_1_2_1_20_1","volume-title":"Shan Lu, and Sarah Fakhoury.","author":"Jain Rijul","year":"2025","unstructured":"Rijul Jain, Shraddha Barke, Gabriel Ebner, Md Rakib Hossain Misu, Shan Lu, and Sarah Fakhoury. 2025. What\u2019s in a Proof? Analyzing Expert Proof-Writing Processes in F* and Verus. arXiv preprint arXiv:2508.02733."},{"key":"e_1_2_1_21_1","volume-title":"Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022","author":"Jiang Albert Qiaochu","year":"2022","unstructured":"Albert Qiaochu Jiang, Wenda Li, Szymon Tworkowski, Konrad Czechowski, Tomasz Odrzyg\u00f3zdz, Piotr Milos, Yuhuai Wu, and Mateja Jamnik. 2022. Thor: Wielding Hammers to Integrate Language Models and Automated Theorem Provers. In Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, NeurIPS 2022, New Orleans, LA, USA, November 28 - December 9, 2022. http:\/\/papers.nips.cc\/paper_files\/paper\/2022\/hash\/377c25312668e48f2e531e2f2c422483-Abstract-Conference.html"},{"key":"e_1_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.1145\/3158154"},{"key":"e_1_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.48550\/ARXIV.2311.07948"},{"key":"e_1_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1145\/3694715.3695952"},{"key":"e_1_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.1145\/3586037"},{"key":"e_1_2_1_26_1","unstructured":"Hayley LeBlanc Jacob R Lorch Chris Hawblitzel Cheng Huang Yiheng Tao Nickolai Zeldovich and Vijay Chidambaram. 2025. PoWER Never Corrupts: Tool-Agnostic Verification of Crash Consistency and Corruption Detection. OSDI (to appear)."},{"key":"e_1_2_1_27_1","doi-asserted-by":"publisher","unstructured":"K. Rustan M. Leino. 2010. Dafny: An Automatic Program Verifier for Functional Correctness. In Logic for Programming Artificial Intelligence and Reasoning - 16th International Conference LPAR-16 Dakar Senegal April 25-May 1 2010 Revised Selected Papers Edmund M. Clarke and Andrei Voronkov (Eds.) (Lecture Notes in Computer Science Vol. 6355). Springer 348\u2013370. https:\/\/doi.org\/10.1007\/978-3-642-17511-4_20 10.1007\/978-3-642-17511-4_20","DOI":"10.1007\/978-3-642-17511-4_20"},{"key":"e_1_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.48550\/ARXIV.2404.09939"},{"key":"e_1_2_1_29_1","volume-title":"Rigorous Evaluation of Large Language Models for Code Generation. In Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023","author":"Liu Jiawei","year":"2023","unstructured":"Jiawei Liu, Chunqiu Steven Xia, Yuyao Wang, and Lingming Zhang. 2023. Is Your Code Generated by ChatGPT Really Correct? Rigorous Evaluation of Large Language Models for Code Generation. In Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, NeurIPS 2023, New Orleans, LA, USA, December 10 - 16, 2023. http:\/\/papers.nips.cc\/paper_files\/paper\/2023\/hash\/43e9d647ccd3e4b7b5baab53f0368686-Abstract-Conference.html"},{"key":"e_1_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.48550\/ARXIV.2406.08467"},{"key":"e_1_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.1145\/3519939.3523704"},{"key":"e_1_2_1_32_1","unstructured":"Microsoft. 2024. Storage systems with verified correctness properties. https:\/\/github.com\/microsoft\/verified-storage [Online] [Accessed: 2024-09-01]"},{"key":"e_1_2_1_33_1","doi-asserted-by":"publisher","DOI":"10.1145\/3643763"},{"key":"e_1_2_1_34_1","doi-asserted-by":"publisher","DOI":"10.48550\/ARXIV.2405.16792"},{"key":"e_1_2_1_35_1","volume-title":"BACK TO THE BUILDING BLOCKS: A PATH TOWARD SECURE AND MEASURABLE SOFTWARE. https:\/\/www.whitehouse.gov\/wp-content\/uploads\/2024\/02\/Final-ONCD-Technical-Report.pdf","author":"Office of the National Cyber Director.","year":"2024","unstructured":"Office of the National Cyber Director. 2024. BACK TO THE BUILDING BLOCKS: A PATH TOWARD SECURE AND MEASURABLE SOFTWARE. https:\/\/www.whitehouse.gov\/wp-content\/uploads\/2024\/02\/Final-ONCD-Technical-Report.pdf"},{"key":"e_1_2_1_36_1","doi-asserted-by":"publisher","DOI":"10.1007\/BFB0030541"},{"key":"e_1_2_1_37_1","volume-title":"Generative Language Modeling for Automated Theorem Proving. CoRR, abs\/2009.03393","author":"Polu Stanislas","year":"2020","unstructured":"Stanislas Polu and Ilya Sutskever. 2020. Generative Language Modeling for Automated Theorem Proving. CoRR, abs\/2009.03393 (2020), arxiv:2009.03393"},{"key":"e_1_2_1_38_1","unstructured":"MIT Technology Review. 2023. How Rust went from a side project to the world\u2019s most-loved programming language. https:\/\/www.technologyreview.com\/2023\/02\/14\/1067869\/rust-worlds-fastest-growing-programming-language\/ [Online] [Accessed: 2024-09-01]"},{"key":"e_1_2_1_39_1","doi-asserted-by":"publisher","DOI":"10.1145\/3394450.3397466"},{"key":"e_1_2_1_40_1","volume-title":"AI tools at development process","author":"Survey Developer","year":"2024","unstructured":"Stackoverflow. 2024. \"2024 Developer Survey: AI tools at development process; Challenges with AI at work\". https:\/\/survey.stackoverflow.co\/2024\/ai\/ [Online], [Accessed: 2025-03-01]"},{"key":"e_1_2_1_41_1","unstructured":"Stackoverflow. 2024. \"Rust continues to be the most-admired programming language\". https:\/\/survey.stackoverflow.co\/2024\/technology [Online] [Accessed: 2024-09-01]"},{"key":"e_1_2_1_42_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-031-65112-0_7"},{"key":"e_1_2_1_43_1","volume-title":"Anvil: Verifying Liveness of Cluster Management Controllers. In 18th USENIX Symposium on Operating Systems Design and Implementation, OSDI 2024","author":"Sun Xudong","year":"2024","unstructured":"Xudong Sun, Wenjie Ma, Jiawei Tyler Gu, Zicheng Ma, Tej Chajed, Jon Howell, Andrea Lattuada, Oded Padon, Lalith Suresh, Adriana Szekeres, and Tianyin Xu. 2024. Anvil: Verifying Liveness of Cluster Management Controllers. In 18th USENIX Symposium on Operating Systems Design and Implementation, OSDI 2024, Santa Clara, CA, USA, July 10-12, 2024. USENIX Association, 649\u2013666. https:\/\/www.usenix.org\/conference\/osdi24\/presentation\/sun-xudong"},{"key":"e_1_2_1_44_1","doi-asserted-by":"publisher","DOI":"10.1145\/2837614.2837655"},{"key":"e_1_2_1_45_1","unstructured":"The Kani Team. 2024. How Open Source Projects are Using Kani to Write Better Software in Rust | AWS Open Source Blog.. https:\/\/aws.amazon.com\/blogs\/opensource\/how-open-source-projects-are-using-kani-to-write-better-software-in-rust\/ [Online] [Accessed: 2024-09-01]"},{"key":"e_1_2_1_46_1","doi-asserted-by":"crossref","unstructured":"Runchu Tian Yining Ye Yujia Qin Xin Cong Yankai Lin Yinxu Pan Yesai Wu Haotian Hui Weichuan Liu Zhiyuan Liu and Maosong Sun. 2024. DebugBench: Evaluating Debugging Capability of Large Language Models. 4173\u20134198. https:\/\/aclanthology.org\/2024.findings-acl.247","DOI":"10.18653\/v1\/2024.findings-acl.247"},{"key":"e_1_2_1_47_1","unstructured":"Verus. [n. d.]. Verus Tutorial and Reference. https:\/\/verus-lang.github.io\/verus\/guide\/."},{"key":"e_1_2_1_48_1","volume-title":"LEGO-Prover: Neural Theorem Proving with Growing Libraries. In The Twelfth International Conference on Learning Representations, ICLR 2024","author":"Wang Haiming","year":"2024","unstructured":"Haiming Wang, Huajian Xin, Chuanyang Zheng, Zhengying Liu, Qingxing Cao, Yinya Huang, Jing Xiong, Han Shi, Enze Xie, Jian Yin, Zhenguo Li, and Xiaodan Liang. 2024. LEGO-Prover: Neural Theorem Proving with Growing Libraries. In The Twelfth International Conference on Learning Representations, ICLR 2024, Vienna, Austria, May 7-11, 2024. OpenReview.net. https:\/\/openreview.net\/forum?id=3f5PALef5B"},{"key":"e_1_2_1_49_1","volume-title":"Lemur: Integrating Large Language Models in Automated Program Verification. In The Twelfth International Conference on Learning Representations, ICLR 2024","author":"Wu Haoze","year":"2024","unstructured":"Haoze Wu, Clark W. Barrett, and Nina Narodytska. 2024. Lemur: Integrating Large Language Models in Automated Program Verification. In The Twelfth International Conference on Learning Representations, ICLR 2024, Vienna, Austria, May 7-11, 2024. OpenReview.net. https:\/\/openreview.net\/forum?id=Q3YaCghZNt"},{"key":"e_1_2_1_50_1","doi-asserted-by":"publisher","DOI":"10.48550\/ARXIV.2304.00385"},{"key":"e_1_2_1_51_1","doi-asserted-by":"publisher","DOI":"10.5281\/zenodo.16891950"},{"key":"e_1_2_1_52_1","doi-asserted-by":"publisher","DOI":"10.1145\/3689736"},{"key":"e_1_2_1_53_1","doi-asserted-by":"publisher","DOI":"10.1145\/3676641.3716022"},{"key":"e_1_2_1_54_1","volume-title":"Proceedings of the 36th International Conference on Machine Learning, ICML 2019","volume":"6994","author":"Yang Kaiyu","year":"2019","unstructured":"Kaiyu Yang and Jia Deng. 2019. Learning to Prove Theorems via Interacting with Proof Assistants. In Proceedings of the 36th International Conference on Machine Learning, ICML 2019, 9-15 June 2019, Long Beach, California, USA (Proceedings of Machine Learning Research, Vol. 97). PMLR, 6984\u20136994. http:\/\/proceedings.mlr.press\/v97\/yang19a.html"},{"key":"e_1_2_1_55_1","volume-title":"LeanDojo: Theorem Proving with Retrieval-Augmented Language Models. In Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023","author":"Yang Kaiyu","year":"2023","unstructured":"Kaiyu Yang, Aidan M. Swope, Alex Gu, Rahul Chalamala, Peiyang Song, Shixing Yu, Saad Godil, Ryan J. Prenger, and Animashree Anandkumar. 2023. LeanDojo: Theorem Proving with Retrieval-Augmented Language Models. In Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, NeurIPS 2023, New Orleans, LA, USA, December 10 - 16, 2023. http:\/\/papers.nips.cc\/paper_files\/paper\/2023\/hash\/4441469427094f8873d0fecb0c4e1cee-Abstract-Datasets_and_Benchmarks.html"},{"key":"e_1_2_1_56_1","volume-title":"VeriSMo: A Verified Security Module for Confidential VMs. In 18th USENIX Symposium on Operating Systems Design and Implementation, OSDI 2024","author":"Zhou Ziqiao","year":"2024","unstructured":"Ziqiao Zhou, Anjali, Weiteng Chen, Sishuai Gong, Chris Hawblitzel, and Weidong Cui. 2024. VeriSMo: A Verified Security Module for Confidential VMs. In 18th USENIX Symposium on Operating Systems Design and Implementation, OSDI 2024, Santa Clara, CA, USA, July 10-12, 2024. USENIX Association, 599\u2013614. https:\/\/www.usenix.org\/conference\/osdi24\/presentation\/zhou"}],"container-title":["Proceedings of the ACM on Programming Languages"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3763174","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,9]],"date-time":"2025-10-09T17:42:11Z","timestamp":1760031731000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3763174"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,10,9]]},"references-count":56,"journal-issue":{"issue":"OOPSLA2","published-print":{"date-parts":[[2025,10,9]]}},"alternative-id":["10.1145\/3763174"],"URL":"https:\/\/doi.org\/10.1145\/3763174","relation":{},"ISSN":["2475-1421"],"issn-type":[{"value":"2475-1421","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,10,9]]},"assertion":[{"value":"2025-03-26","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2025-08-12","order":2,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2025-10-09","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}