{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,27]],"date-time":"2026-02-27T22:56:12Z","timestamp":1772232972231,"version":"3.50.1"},"publisher-location":"New York, NY, USA","reference-count":40,"publisher":"ACM","license":[{"start":{"date-parts":[[2024,4,14]],"date-time":"2024-04-14T00:00:00Z","timestamp":1713052800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2024,4,14]]},"DOI":"10.1145\/3650105.3652299","type":"proceedings-article","created":{"date-parts":[[2024,6,12]],"date-time":"2024-06-12T16:01:35Z","timestamp":1718208095000},"page":"86-90","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":22,"title":["Fine Tuning Large Language Model for Secure Code Generation"],"prefix":"10.1145","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-8828-632X","authenticated-orcid":false,"given":"Junjie","family":"Li","sequence":"first","affiliation":[{"name":"Concordia University, Montreal, Canada"}]},{"ORCID":"https:\/\/orcid.org\/0009-0005-9002-7841","authenticated-orcid":false,"given":"Aseem","family":"Sangalay","sequence":"additional","affiliation":[{"name":"Delhi Technological University, Delhi, India"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-6962-6923","authenticated-orcid":false,"given":"Cheng","family":"Cheng","sequence":"additional","affiliation":[{"name":"Concordia University, Montreal, Canada"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-2208-3893","authenticated-orcid":false,"given":"Yuan","family":"Tian","sequence":"additional","affiliation":[{"name":"Queen's University, Ontario, 
Canada"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-4282-406X","authenticated-orcid":false,"given":"Jinqiu","family":"Yang","sequence":"additional","affiliation":[{"name":"Concordia University, Montreal, Canada"}]}],"member":"320","published-online":{"date-parts":[[2024,6,12]]},"reference":[{"key":"e_1_3_2_1_1_1","unstructured":"[n.d.]. 2022 CWE Top 25 Most Dangerous Software Weaknesses. https:\/\/cwe.mitre.org\/top25\/archive\/2022\/2022_cwe_top25.html. Accessed: 2024-01-16."},{"key":"e_1_3_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2021.naacl-main.211"},{"key":"e_1_3_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.jocs.2017.11.011"},{"key":"e_1_3_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1145\/3475960.3475985"},{"key":"e_1_3_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2022.bigscience-1.9"},{"key":"e_1_3_2_1_6_1","volume-title":"Proceedings of the Twelfth ACM Conference on Data and Application Security and Privacy. 101--106","author":"Challande Alexis","year":"2022","unstructured":"Alexis Challande, Robin David, and Gu\u00e9na\u00ebl Renault. 2022. Building a Commit-level Dataset of Real-world Vulnerabilities. In Proceedings of the Twelfth ACM Conference on Data and Application Security and Privacy. 101--106."},{"key":"e_1_3_2_1_7_1","unstructured":"Mark Chen Jerry Tworek Heewoo Jun Qiming Yuan Henrique Ponde de Oliveira Pinto Jared Kaplan Harri Edwards Yuri Burda Nicholas Joseph Greg Brockman Alex Ray Raul Puri Gretchen Krueger Michael Petrov Heidy Khlaaf Girish Sastry Pamela Mishkin Brooke Chan Scott Gray Nick Ryder Mikhail Pavlov Alethea Power Lukasz Kaiser Mohammad Bavarian Clemens Winter Philippe Tillet Felipe Petroski Such Dave Cummings Matthias Plappert Fotios Chantzis Elizabeth Barnes Ariel Herbert-Voss William Hebgen Guss Alex Nichol Alex Paino Nikolas Tezak Jie Tang Igor Babuschkin Suchir Balaji Shantanu Jain William Saunders Christopher Hesse Andrew N. 
Carr Jan Leike Josh Achiam Vedant Misra Evan Morikawa Alec Radford Matthew Knight Miles Brundage Mira Murati Katie Mayer Peter Welinder Bob McGrew Dario Amodei Sam McCandlish Ilya Sutskever and Wojciech Zaremba. 2021. Evaluating Large Language Models Trained on Code. (2021). arXiv:2107.03374 [cs.LG]"},{"key":"e_1_3_2_1_8_1","unstructured":"CodeQL [n. d.]. codeql. https:\/\/codeql.github.com. Accessed: 2010-09-30."},{"key":"e_1_3_2_1_9_1","volume-title":"A coefficient of agreement for nominal scales. Educational and psychological measurement 20, 1","author":"Cohen Jacob","year":"1960","unstructured":"Jacob Cohen. 1960. A coefficient of agreement for nominal scales. Educational and psychological measurement 20, 1 (1960), 37--46."},{"key":"e_1_3_2_1_10_1","unstructured":"CVE-MITRE. 2022. Common Vulnerabilities and Exposures. https:\/\/www.cve.org\/About\/Overview"},{"key":"e_1_3_2_1_11_1","unstructured":"CWE-MITRE. 2022. Common Weakness Enumeration. https:\/\/cwe.mitre.org\/index.html"},{"key":"e_1_3_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.1145\/3379597.3387501"},{"key":"e_1_3_2_1_13_1","doi-asserted-by":"publisher","unstructured":"Zhiyu Fan Xiang Gao Abhik Roychoudhury and Shin Hwei Tan. 2022. Improving automatically generated code from Codex via Automated Program Repair. arXiv. 10.48550\/ARXIV.2205.10583","DOI":"10.48550\/ARXIV.2205.10583"},{"key":"e_1_3_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2020.findings-emnlp.139"},{"key":"e_1_3_2_1_15_1","doi-asserted-by":"publisher","unstructured":"Daniel Fried Armen Aghajanyan Jessy Lin Sida Wang Eric Wallace Freda Shi Ruiqi Zhong Wen-tau Yih Luke Zettlemoyer and Mike Lewis. 2022. InCoder: A Generative Model for Code Infilling and Synthesis. arXiv. 10.48550\/ARXIV.2204.05999","DOI":"10.48550\/ARXIV.2204.05999"},{"key":"e_1_3_2_1_16_1","unstructured":"Leo Gao Stella Biderman Sid Black Laurence Golding Travis Hoppe Charles Foster Jason Phang Horace He Anish Thite Noa Nabeshima et al. 2020. 
The pile: An 800gb dataset of diverse text for language modeling. arXiv preprint arXiv:2101.00027 (2020)."},{"key":"e_1_3_2_1_17_1","doi-asserted-by":"publisher","unstructured":"Hazim Hanif and Sergio Maffeis. 2022. VulBERTa: Simplified Source Code Pre-Training for Vulnerability Detection. arXiv. 10.48550\/ARXIV.2205.12424","DOI":"10.48550\/ARXIV.2205.12424"},{"key":"e_1_3_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1145\/3576915.3623175"},{"key":"e_1_3_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.cose.2021.102308"},{"key":"e_1_3_2_1_20_1","volume-title":"Impact of code language models on automated program repair. arXiv preprint arXiv:2302.05020","author":"Jiang Nan","year":"2023","unstructured":"Nan Jiang, Kevin Liu, Thibaud Lutellier, and Lin Tan. 2023. Impact of code language models on automated program repair. arXiv preprint arXiv:2302.05020 (2023)."},{"key":"e_1_3_2_1_21_1","volume-title":"Yangtian Zi, Niklas Muennighoff, Denis Kocetkov, Chenghao Mou, Marc Marone, Christopher Akiki, Jia Li, Jenny Chim, et al.","author":"Li Raymond","year":"2023","unstructured":"Raymond Li, Loubna Ben Allal, Yangtian Zi, Niklas Muennighoff, Denis Kocetkov, Chenghao Mou, Marc Marone, Christopher Akiki, Jia Li, Jenny Chim, et al. 2023. StarCoder: may the source be with you! arXiv preprint arXiv:2305.06161 (2023)."},{"key":"e_1_3_2_1_22_1","volume-title":"Rigorous Evaluation of Large Language Models for Code Generation. In Thirty-seventh Conference on Neural Information Processing Systems. https:\/\/openreview.net\/forum?id=1qvx610Cu7","author":"Liu Jiawei","year":"2023","unstructured":"Jiawei Liu, Chunqiu Steven Xia, Yuyao Wang, and Lingming Zhang. 2023. Is Your Code Generated by ChatGPT Really Correct? Rigorous Evaluation of Large Language Models for Code Generation. In Thirty-seventh Conference on Neural Information Processing Systems. 
https:\/\/openreview.net\/forum?id=1qvx610Cu7"},{"key":"e_1_3_2_1_23_1","volume-title":"Shengyu Fu, and Shujie LIU.","author":"Lu Shuai","year":"2021","unstructured":"Shuai Lu, Daya Guo, Shuo Ren, Junjie Huang, Alexey Svyatkovskiy, Ambrosio Blanco, Colin Clement, Dawn Drain, Daxin Jiang, Duyu Tang, Ge Li, Lidong Zhou, Linjun Shou, Long Zhou, Michele Tufano, MING GONG, Ming Zhou, Nan Duan, Neel Sundaresan, Shao Kun Deng, Shengyu Fu, and Shujie LIU. 2021. CodeXGLUE: A Machine Learning Benchmark Dataset for Code Understanding and Generation. In Thirty-fifth Conference on Neural Information Processing Systems Datasets and Benchmarks Track (Round 1). https:\/\/openreview.net\/forum?id=6lE4dQXaUcb"},{"key":"e_1_3_2_1_24_1","volume-title":"CodeGen2: Lessons for Training LLMs on Programming and Natural Languages. ICLR","author":"Nijkamp Erik","year":"2023","unstructured":"Erik Nijkamp, Hiroaki Hayashi, Caiming Xiong, Silvio Savarese, and Yingbo Zhou. 2023. CodeGen2: Lessons for Training LLMs on Programming and Natural Languages. ICLR (2023)."},{"key":"e_1_3_2_1_25_1","volume-title":"CodeGen: An Open Large Language Model for Code with Multi-Turn Program Synthesis. ICLR","author":"Nijkamp Erik","year":"2023","unstructured":"Erik Nijkamp, Bo Pang, Hiroaki Hayashi, Lifu Tu, Huan Wang, Yingbo Zhou, Silvio Savarese, and Caiming Xiong. 2023. CodeGen: An Open Large Language Model for Code with Multi-Turn Program Synthesis. ICLR (2023)."},{"key":"e_1_3_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.1145\/3468264.3473122"},{"key":"e_1_3_2_1_27_1","doi-asserted-by":"publisher","unstructured":"Changan Niu Chuanyi Li Vincent Ng Jidong Ge Liguo Huang and Bin Luo. 2022. SPT-Code: Sequence-to-Sequence Pre-Training for Learning Source Code Representations. arXiv. 10.48550\/ARXIV.2201.01549","DOI":"10.48550\/ARXIV.2201.01549"},{"key":"e_1_3_2_1_28_1","unstructured":"NVD. 2022. 
https:\/\/nvd.nist.gov\/"},{"key":"e_1_3_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.1109\/SP46214.2022.9833571"},{"key":"e_1_3_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.1109\/SP46215.2023.10179324"},{"key":"e_1_3_2_1_31_1","volume-title":"Proceedings of the 2023 ACM SIGSAC Conference on Computer and Communications Security. 2785--2799","author":"Perry Neil","year":"2023","unstructured":"Neil Perry, Megha Srivastava, Deepak Kumar, and Dan Boneh. 2023. Do users write more insecure code with AI assistants?. In Proceedings of the 2023 ACM SIGSAC Conference on Computer and Communications Security. 2785--2799."},{"key":"e_1_3_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.1109\/QRS54544.2021.00076"},{"key":"e_1_3_2_1_33_1","volume-title":"Yossi Adi, Jingyu Liu, Tal Remez, J\u00e9r\u00e9my Rapin, et al.","author":"Roziere Baptiste","year":"2023","unstructured":"Baptiste Roziere, Jonas Gehring, Fabian Gloeckle, Sten Sootla, Itai Gat, Xiaoqing Ellen Tan, Yossi Adi, Jingyu Liu, Tal Remez, J\u00e9r\u00e9my Rapin, et al. 2023. Code llama: Open foundation models for code. arXiv preprint arXiv:2308.12950 (2023)."},{"key":"e_1_3_2_1_34_1","doi-asserted-by":"publisher","DOI":"10.48550\/ARXIV.2206.05239"},{"key":"e_1_3_2_1_35_1","unstructured":"Ben Wang and Aran Komatsuzaki. 2021. GPT-J-6B: A 6 Billion Parameter Autoregressive Language Model. https:\/\/github.com\/kingoflolz\/mesh-transformer-jax."},{"key":"e_1_3_2_1_36_1","volume-title":"Nghi DQ Bui, Junnan Li, and Steven CH Hoi.","author":"Wang Yue","year":"2023","unstructured":"Yue Wang, Hung Le, Akhilesh Deepak Gotmare, Nghi DQ Bui, Junnan Li, and Steven CH Hoi. 2023. Codet5+: Open code large language models for code understanding and generation. arXiv preprint arXiv:2305.07922 (2023)."},{"key":"e_1_3_2_1_37_1","volume-title":"Codet5: Identifier-aware unified pre-trained encoder-decoder models for code understanding and generation. 
arXiv preprint arXiv:2109.00859","author":"Wang Yue","year":"2021","unstructured":"Yue Wang, Weishi Wang, Shafiq Joty, and Steven CH Hoi. 2021. Codet5: Identifier-aware unified pre-trained encoder-decoder models for code understanding and generation. arXiv preprint arXiv:2109.00859 (2021)."},{"key":"e_1_3_2_1_38_1","doi-asserted-by":"publisher","DOI":"10.1145\/3520312.3534862"},{"key":"e_1_3_2_1_39_1","doi-asserted-by":"publisher","DOI":"10.48550\/ARXIV.2202.13169"},{"key":"e_1_3_2_1_40_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICSE-SEIP52600.2021.00020"}],"event":{"name":"FORGE '24: 2024 IEEE\/ACM First International Conference on AI Foundation Models and Software Engineering","location":"Lisbon Portugal","acronym":"FORGE '24","sponsor":["SIGSOFT ACM Special Interest Group on Software Engineering"]},"container-title":["Proceedings of the 2024 IEEE\/ACM First International Conference on AI Foundation Models and Software Engineering"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3650105.3652299","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3650105.3652299","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,19]],"date-time":"2025-06-19T00:03:43Z","timestamp":1750291423000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3650105.3652299"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,4,14]]},"references-count":40,"alternative-id":["10.1145\/3650105.3652299","10.1145\/3650105"],"URL":"https:\/\/doi.org\/10.1145\/3650105.3652299","relation":{},"subject":[],"published":{"date-parts":[[2024,4,14]]},"assertion":[{"value":"2024-06-12","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}