{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,11]],"date-time":"2026-04-11T09:52:06Z","timestamp":1775901126554,"version":"3.50.1"},"reference-count":39,"publisher":"Association for Computing Machinery (ACM)","issue":"FSE","license":[{"start":{"date-parts":[[2024,7,12]],"date-time":"2024-07-12T00:00:00Z","timestamp":1720742400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Proc. ACM Softw. Eng."],"published-print":{"date-parts":[[2024,7,12]]},"abstract":"<jats:p>\n                    Generative LLMs have been shown to effectively power AI-based code authoring tools that can suggest entire statements or blocks of code during code authoring. In this paper we present\n                    <jats:sc>CodeCompose<\/jats:sc>\n                    , an AI-assisted code authoring tool developed and deployed at Meta internally.\n                    <jats:sc>CodeCompose<\/jats:sc>\n                    is based on the InCoder LLM that merges generative capabilities with bi-directionality. We have scaled up\n                    <jats:sc>CodeCompose<\/jats:sc>\n                    to serve tens of thousands of developers at Meta, across 9 programming languages and several coding surfaces. We present our experience in making design decisions about the model and system architecture for\n                    <jats:sc>CodeCompose<\/jats:sc>\n                    that addresses these challenges.\n                  <\/jats:p>\n                  <jats:p>To release a LLM model at this scale, we needed to first ensure that it is sufficiently accurate. In a random sample of 20K source code files, depending on the language, we are able to reproduce hidden lines between 40% and 58% of the time, an improvement of 1.4\u00d7 and 4.1\u00d7 over a model trained only on public data.<\/jats:p>\n                  <jats:p>\n                    We gradually rolled\n                    <jats:sc>CodeCompose<\/jats:sc>\n                    out to developers. At the time of this writing, 16K developers have used it with 8% of their code coming directly from\n                    <jats:sc>CodeCompose<\/jats:sc>\n                    .\n                  <\/jats:p>\n                  <jats:p>\n                    To triangulate our numerical findings, we conduct a thematic analysis on the feedback from 70 developers. We find that 91.5% of the feedback is positive, with the most common themes being discovering APIs, dealing with boilerplate code, and accelerating coding. Meta continues to integrate this feedback into\n                    <jats:sc>CodeCompose<\/jats:sc>\n                    .\n                  <\/jats:p>","DOI":"10.1145\/3643774","type":"journal-article","created":{"date-parts":[[2024,7,12]],"date-time":"2024-07-12T10:22:09Z","timestamp":1720779729000},"page":"1066-1085","source":"Crossref","is-referenced-by-count":18,"title":["AI-Assisted Code Authoring at Scale: Fine-Tuning, Deploying, and Mixed Methods Evaluation"],"prefix":"10.1145","volume":"1","author":[{"ORCID":"https:\/\/orcid.org\/0009-0009-1374-7334","authenticated-orcid":false,"given":"Vijayaraghavan","family":"Murali","sequence":"first","affiliation":[{"name":"Meta Platforms, Menlo Park, USA"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-9432-1045","authenticated-orcid":false,"given":"Chandra","family":"Maddila","sequence":"additional","affiliation":[{"name":"Meta Platforms, Menlo Park, USA"}]},{"ORCID":"https:\/\/orcid.org\/0009-0005-8606-3946","authenticated-orcid":false,"given":"Imad","family":"Ahmad","sequence":"additional","affiliation":[{"name":"Meta Platforms, Menlo Park, USA"}]},{"ORCID":"https:\/\/orcid.org\/0009-0002-6955-1970","authenticated-orcid":false,"given":"Michael","family":"Bolin","sequence":"additional","affiliation":[{"name":"Meta Platforms, Menlo Park, USA"}]},{"ORCID":"https:\/\/orcid.org\/0009-0007-1708-8132","authenticated-orcid":false,"given":"Daniel","family":"Cheng","sequence":"additional","affiliation":[{"name":"Meta Platforms, Menlo Park, USA"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-0528-6138","authenticated-orcid":false,"given":"Negar","family":"Ghorbani","sequence":"additional","affiliation":[{"name":"Meta Platforms, Menlo Park, USA"}]},{"ORCID":"https:\/\/orcid.org\/0009-0007-4734-4328","authenticated-orcid":false,"given":"Renuka","family":"Fernandez","sequence":"additional","affiliation":[{"name":"Meta Platforms, Menlo Park, USA"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-1358-4124","authenticated-orcid":false,"given":"Nachiappan","family":"Nagappan","sequence":"additional","affiliation":[{"name":"Meta Platforms, Menlo Park, USA"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-1137-4297","authenticated-orcid":false,"given":"Peter C.","family":"Rigby","sequence":"additional","affiliation":[{"name":"Meta, Menlo Park, USA"},{"name":"Concordia University, Montreal, Canada"}]}],"member":"320","published-online":{"date-parts":[[2024,7,12]]},"reference":[{"key":"e_1_3_1_2_2","volume-title":"How GitHub Copilot helps improve developer productivity","author":"Github","year":"2022","unstructured":"Github 2022. How GitHub Copilot helps improve developer productivity. Github. https:\/\/github.blog\/2022-07-14-research-how-github-copilot-helps-improve-developer-productivity\/"},{"key":"e_1_3_1_3_2","volume-title":"Github Copilot","author":"Github Accessed","year":"2021","unstructured":"Github Accessed 2021. Github Copilot. Github. https:\/\/github.com\/features\/copilot"},{"key":"e_1_3_1_4_2","unstructured":"Accessed 2021. GitHub CoPilot\u2019s economic impact. https:\/\/github.blog\/2023-06-27-the-economic-impact-of-the-ai-powered-developer-lifecycle-and-lessons-from-github-copilot\/"},{"key":"e_1_3_1_5_2","unstructured":"Microsoft Accessed 2021. Microsoft Intellicode. Microsoft. https:\/\/visualstudio.microsoft.com\/services\/intellicode"},{"key":"e_1_3_1_6_2","unstructured":"Google Accessed 2021. ML Enhanced Code Completion. Google. https:\/\/ai.googleblog.com\/2022\/07\/ml-enhanced-code-completion-improves.html"},{"key":"e_1_3_1_7_2","unstructured":"Amazon Accessed 2023. Amazon CodeWhisperer. Amazon. https:\/\/aws.amazon.com\/codewhisperer"},{"key":"e_1_3_1_8_2","unstructured":"Accessed 2023. Flow Javascript. https:\/\/engineering.fb.com\/2014\/11\/18\/web\/flow-a-new-static-type-checker-for-javascript"},{"key":"e_1_3_1_9_2","unstructured":"Microsoft Accessed 2023. Language Server Protocol (LSP). Microsoft. https:\/\/github.com\/microsoft\/language-server-protocol"},{"key":"e_1_3_1_10_2","unstructured":"Jacob Austin Augustus Odena Maxwell Nye Maarten Bosma Henryk Michalewski David Dohan Ellen Jiang Carrie Cai Michael Terry Quoc Le and Charles Sutton. 2021. Program Synthesis with Large Language Models. arXiv:2108.07732 [cs.PL]"},{"key":"e_1_3_1_11_2","doi-asserted-by":"publisher","DOI":"10.1145\/3586030"},{"key":"e_1_3_1_12_2","unstructured":"Mohammad Bavarian Heewoo Jun Nikolas Tezak John Schulman Christine McLeavey Jerry Tworek and Mark Chen. 2022. Efficient Training of Language Models to Fill in the Middle. arXiv:2207.14255 [cs.CL]"},{"key":"e_1_3_1_13_2","unstructured":"Tom B. Brown Benjamin Mann Nick Ryder Melanie Subbiah Jared Kaplan Prafulla Dhariwal Arvind Neelakantan Pranav Shyam Girish Sastry Amanda Askell Sandhini Agarwal Ariel Herbert-Voss Gretchen Krueger Tom Henighan Rewon Child Aditya Ramesh Daniel M. Ziegler Jeffrey Wu Clemens Winter Christopher Hesse Mark Chen Eric Sigler Mateusz Litwin Scott Gray Benjamin Chess Jack Clark Christopher Berner Sam McCandlish Alec Radford Ilya Sutskever and Dario Amodei. 2020. Language Models are Few-Shot Learners. arXiv:2005.14165 [cs.CL]"},{"key":"e_1_3_1_14_2","doi-asserted-by":"crossref","unstructured":"Marcel Bruch Martin Monperrus and Mira Mezini. 2009. Learning from examples to improve code completion systems. In Proceedings of the 7th joint meeting of the European software engineering conference and the ACM SIGSOFT symposium on The foundations of software engineering (ESEC\/FSE \u201909).","DOI":"10.1145\/1595696.1595728"},{"key":"e_1_3_1_15_2","unstructured":"Mark Chen Jerry Tworek Heewoo Jun Qiming Yuan Henrique Ponde de Oliveira Pinto Jared Kaplan Harri Edwards Yuri Burda Nicholas Joseph Greg Brockman Alex Ray Raul Puri Gretchen Krueger Michael Petrov Heidy Khlaaf Girish Sastry Pamela Mishkin Brooke Chan Scott Gray Nick Ryder Mikhail Pavlov Alethea Power Lukasz Kaiser Mohammad Bavarian Clemens Winter Philippe Tillet Felipe Petroski Such Dave Cummings Matthias Plappert Fotios Chantzis Elizabeth Barnes Ariel Herbert-Voss William Hebgen Guss Alex Nichol Alex Paino Nikolas Tezak Jie Tang Igor Babuschkin Suchir Balaji Shantanu Jain William Saunders Christopher Hesse Andrew N. Carr Jan Leike Josh Achiam Vedant Misra Evan Morikawa Alec Radford Matthew Knight Miles Brundage Mira Murati Katie Mayer Peter Welinder Bob McGrew Dario Amodei Sam McCandlish Ilya Sutskever and Wojciech Zaremba. 2021. Evaluating Large Language Models Trained on Code. arXiv:2107.03374 [cs.LG]"},{"key":"e_1_3_1_16_2","volume-title":"Designing and conducting mixed methods research","author":"Creswell John W","year":"2017","unstructured":"John W Creswell and Vicki L Plano Clark. 2017. Designing and conducting mixed methods research. Sage publications."},{"key":"e_1_3_1_17_2","unstructured":"Jacob Devlin Ming-Wei Chang Kenton Lee and Kristina Toutanova. 2019. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. arXiv:1810.04805 [cs.CL]"},{"key":"e_1_3_1_18_2","doi-asserted-by":"crossref","unstructured":"Omer Dunay Daniel Cheng Adam Tait Parth Thakkar Peter C Rigby Andy Chiu Imad Ahmad Arun Ganesan Chandra Maddila Vijayaraghavan Murali Ali Tayyebi and Nachiappan Nagappan. 2024. Multi-line AI-assisted Code Authoring.","DOI":"10.1145\/3663529.3663836"},{"key":"e_1_3_1_19_2","unstructured":"Daniel Fried Armen Aghajanyan Jessy Lin Sida Wang Eric Wallace Freda Shi Ruiqi Zhong Wen tau Yih Luke Zettle-moyer and Mike Lewis. 2023. InCoder: A Generative Model for Code Infilling and Synthesis. arXiv:2204.05999 [cs.SE]"},{"key":"e_1_3_1_20_2","doi-asserted-by":"crossref","unstructured":"Nuno M. Guerreiro Duarte Alves Jonas Waldendorf Barry Haddow Alexandra Birch Pierre Colombo and Andr\u00e9 F. T. Martins. 2023. Hallucinations in Large Multilingual Translation Models. arXiv:2303.16104 [cs.CL]","DOI":"10.1162\/tacl_a_00615"},{"key":"e_1_3_1_21_2","doi-asserted-by":"publisher","unstructured":"Vincent J. Hellendoorn Sebastian Proksch Harald C. Gall and Alberto Bacchelli. 2019. When Code Completion Fails: A Case Study on Real-World Completions. In 2019 IEEE\/ACM 41st International Conference on Software Engineering (ICSE). 960\u2013970. https:\/\/doi.org\/10.1109\/ICSE.2019.00101 10.1109\/ICSE.2019.00101","DOI":"10.1109\/ICSE.2019.00101"},{"key":"e_1_3_1_22_2","doi-asserted-by":"publisher","DOI":"10.1145\/2661829.2661922"},{"key":"e_1_3_1_23_2","unstructured":"Hamel Husain Ho-Hsiang Wu Tiferet Gazit Miltiadis Allamanis and Marc Brockschmidt. 2020. CodeSearchNet Challenge: Evaluating the State of Semantic Code Search. arXiv:1909.09436 [cs.LG]"},{"key":"e_1_3_1_24_2","doi-asserted-by":"publisher","DOI":"10.1145\/3571730"},{"key":"e_1_3_1_25_2","doi-asserted-by":"publisher","unstructured":"Seohyun Kim Jinman Zhao Yuchi Tian and Satish Chandra. 2021. Code Prediction by Feeding Trees to Transformers. In 2021 IEEE\/ACM 43rd International Conference on Software Engineering (ICSE). 150\u2013162. https:\/\/doi.org\/10.1109\/ICSE43902.2021.00026 10.1109\/ICSE43902.2021.00026","DOI":"10.1109\/ICSE43902.2021.00026"},{"key":"e_1_3_1_26_2","unstructured":"Zihao Li. 2023. The Dark Side of ChatGPT: Legal and Ethical Challenges from Stochastic Parrots and Hallucination. arXiv:2304.14347 [cs.CY]"},{"key":"e_1_3_1_27_2","unstructured":"Chao Liu Xuanlin Bao Hongyu Zhang Neng Zhang Haibo Hu Xiaohong Zhang and Meng Yan. 2023. Improving ChatGPT Prompt for Code Generation. arXiv:2305.08360 [cs.SE]"},{"key":"e_1_3_1_28_2","unstructured":"Shuai Lu Daya Guo Shuo Ren Junjie Huang Alexey Svyatkovskiy Ambrosio Blanco Colin Clement Dawn Drain Daxin Jiang Duyu Tang Ge Li Lidong Zhou Linjun Shou Long Zhou Michele Tufano Ming Gong Ming Zhou Nan Duan Neel Sundaresan Shao Kun Deng Shengyu Fu and Shujie Liu. 2021. CodeXGLUE: A Machine Learning Benchmark Dataset for Code Understanding and Generation. arXiv:2102.04664 [cs.SE]"},{"key":"e_1_3_1_29_2","doi-asserted-by":"crossref","unstructured":"Nhan Nguyen and Sarah Nadi. 2022. An empirical evaluation of GitHub copilot\u2019s code suggestions. In Proceedings of the 19th International Conference on Mining Software Repositories (MSR \u201922).","DOI":"10.1145\/3524842.3528470"},{"key":"e_1_3_1_30_2","unstructured":"Erik Nijkamp Bo Pang Hiroaki Hayashi Lifu Tu Huan Wang Yingbo Zhou Silvio Savarese and Caiming Xiong. 2023. CodeGen: An Open Large Language Model for Code with Multi-Turn Program Synthesis. arXiv:2203.13474 [cs.LG]"},{"key":"e_1_3_1_31_2","doi-asserted-by":"publisher","DOI":"10.3115\/1073083.1073135"},{"issue":"1","key":"e_1_3_1_32_2","article-title":"Intelligent Code Completion with Bayesian Networks","volume":"25","author":"Proksch Sebastian","year":"2015","unstructured":"Sebastian Proksch, Johannes Lerch, and Mira Mezini. 2015. Intelligent Code Completion with Bayesian Networks. ACM Transactions on Software Engineering and Methodology (TOSEM) 25, 1, Article 3 (12 2015).","journal-title":"ACM Transactions on Software Engineering and Methodology (TOSEM)"},{"key":"e_1_3_1_33_2","doi-asserted-by":"publisher","unstructured":"R. Robles and M. Lanza. 2008. How Program History Can Improve Code Completion. In 2008 23rd IEEE\/ACM International Conference on Automated Software Engineering. 317\u2013326. https:\/\/doi.org\/10.1109\/ASE.2008.42 10.1109\/ASE.2008.42","DOI":"10.1109\/ASE.2008.42"},{"key":"e_1_3_1_34_2","unstructured":"Baptiste Rozi\u00e8re Jonas Gehring Fabian Gloeckle Sten Sootla Itai Gat Xiaoqing Ellen Tan Yossi Adi Jingyu Liu Tal Remez J\u00e9r\u00e9my Rapin Artyom Kozhevnikov Ivan Evtimov Joanna Bitton Manish Bhatt Cristian Canton Ferrer Aaron Grattafiori Wenhan Xiong Alexandre D\u00e9fossez Jade Copet Faisal Azhar Hugo Touvron Louis Martin Nicolas Usunier Thomas Scialom and Gabriel Synnaeve. 2023. Code Llama: Open Foundation Models for Code. arXiv:2308.12950 [cs.CL]"},{"issue":"8","key":"e_1_3_1_35_2","first-page":"127","article-title":"Thrift: Scalable cross-language services implementation","volume":"5","author":"Slee Mark","year":"2007","unstructured":"Mark Slee, Aditya Agarwal, and Marc Kwiatkowski. 2007. Thrift: Scalable cross-language services implementation. Facebook white paper 5, 8 (2007), 127.","journal-title":"Facebook white paper"},{"key":"e_1_3_1_36_2","doi-asserted-by":"crossref","unstructured":"Priyan Vaithilingam Tianyi Zhang and Elena L. Glassman. 2022. Expectation vs. Experience: Evaluating the Usability of Code Generation Tools Powered by Large Language Models. In Extended Abstracts of the 2022 CHI Conference on Human Factors in Computing Systems (CHI EA \u201922).","DOI":"10.1145\/3491101.3519665"},{"key":"e_1_3_1_37_2","unstructured":"Ashish Vaswani Noam Shazeer Niki Parmar Jakob Uszkoreit Llion Jones Aidan N. Gomez Lukasz Kaiser and Illia Polosukhin. 2017. Attention Is All You Need. arXiv:1706.03762 [cs.CL]"},{"key":"e_1_3_1_38_2","first-page":"220","volume-title":"Design Science Research in Information Systems. Advances in Theory and Practice","author":"Wieringa Roel","year":"2012","unstructured":"Roel Wieringa and Ay\u015fe Morali. 2012. Technical Action Research as a Validation Method in Information Systems Design Science. In Design Science Research in Information Systems. Advances in Theory and Practice, Ken Peffers, Marcus Rothenberger, and Bill Kuechler (Eds.). Springer Berlin Heidelberg, Berlin, Heidelberg, 220\u2013238."},{"key":"e_1_3_1_39_2","doi-asserted-by":"publisher","unstructured":"Wen Zhou Seohyun Kim Vijayaraghavan Murali and Gareth Ari Aye. 2022. Improving Code Autocompletion with Transfer Learning. In 2022 IEEE\/ACM 44th International Conference on Software Engineering: Software Engineering in Practice (ICSE-SEIP). 161\u2013162. https:\/\/doi.org\/10.1145\/3510457.3513061 10.1145\/3510457.3513061","DOI":"10.1145\/3510457.3513061"},{"key":"e_1_3_1_40_2","doi-asserted-by":"publisher","DOI":"10.1145\/3520312.3534864"}],"container-title":["Proceedings of the ACM on Software Engineering"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3643774","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3643774","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,2,4]],"date-time":"2026-02-04T08:04:16Z","timestamp":1770192256000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3643774"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,7,12]]},"references-count":39,"journal-issue":{"issue":"FSE","published-print":{"date-parts":[[2024,7,12]]}},"alternative-id":["10.1145\/3643774"],"URL":"https:\/\/doi.org\/10.1145\/3643774","relation":{},"ISSN":["2994-970X"],"issn-type":[{"value":"2994-970X","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,7,12]]}}}