{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,7,18]],"date-time":"2026-07-18T16:24:46Z","timestamp":1784391886475,"version":"3.55.0"},"reference-count":53,"publisher":"Association for Computing Machinery (ACM)","issue":"6","funder":[{"name":"Project of Science and Technology Research and Development Plan of China Railway Corporation","award":["N2023J044"],"award-info":[{"award-number":["N2023J044"]}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Softw. Eng. Methodol."],"published-print":{"date-parts":[[2025,7,31]]},"abstract":"<jats:p>In software development, the raw requirements proposed by users are frequently incomplete, which impedes the complete implementation of software functionalities. With the emergence of large language models, generating software from user requirements has attracted attention. Recent methods based on the top-down waterfall model employ a questioning approach for requirement completion, attempting to elicit further user requirements. However, because users are constrained by their domain knowledge, requirement completion lacks effective acceptance criteria and fails to fully capture the implicit needs of the user. Moreover, the cumulative errors of the waterfall model can lead to discrepancies between the generated code and user requirements. Agile methodologies reduce the cumulative errors of the waterfall model through lightweight iteration and collaboration with users, but the challenge lies in ensuring semantic consistency between user requirements and the code generated by the agent. To address these challenges, we propose AgileGen, an agile-based generative software development approach built on human-AI teamwork. 
Unlike existing questioning agents, AgileGen adopts a novel collaborative approach that breaks free from the constraints of domain knowledge by starting from the end-user perspective to complete the acceptance criteria. By introducing the Gherkin language, AgileGen attempts, for the first time, to use testable requirement descriptions as a bridge for semantic consistency between requirements and code, aiming to ensure that software products meet actual user requirements by defining user scenarios that include acceptance criteria. Additionally, we refine the human-AI teamwork model so that users participate in the decision-making processes they are good at, which significantly enhances the completeness of software functionality. To ensure semantic consistency between requirements and generated code, we derive consistency factors from Gherkin to drive the subsequent code generation. Finally, to improve the reliability of user scenarios, we introduce a memory pool mechanism that collects user decision-making scenarios and recommends them to new users with similar requirements. 
AgileGen, as a user-friendly interactive system, significantly outperformed existing best methods by 16.4% and garnered higher user satisfaction.<\/jats:p>","DOI":"10.1145\/3702987","type":"journal-article","created":{"date-parts":[[2025,1,22]],"date-time":"2025-01-22T13:26:55Z","timestamp":1737552415000},"page":"1-46","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":19,"title":["Empowering Agile-Based Generative Software Development through Human-AI Teamwork"],"prefix":"10.1145","volume":"34","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-8972-2824","authenticated-orcid":false,"given":"Sai","family":"Zhang","sequence":"first","affiliation":[{"name":"College of Intelligence and Computing, Tianjin University, Tianjin, China and CSIRO\u2019s Data61, Canberra, Australia"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-7663-1421","authenticated-orcid":false,"given":"Zhenchang","family":"Xing","sequence":"additional","affiliation":[{"name":"CSIRO\u2019s Data61, Canberra, Australia and Australian National University, Canberra, Australia"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-1586-7040","authenticated-orcid":false,"given":"Ronghui","family":"Guo","sequence":"additional","affiliation":[{"name":"College of Intelligence and Computing, Tianjin University, Tianjin, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0009-0000-0667-4653","authenticated-orcid":false,"given":"Fangzhou","family":"Xu","sequence":"additional","affiliation":[{"name":"College of Intelligence and Computing, Tianjin University, Tianjin, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0009-0006-8455-7253","authenticated-orcid":false,"given":"Lei","family":"Chen","sequence":"additional","affiliation":[{"name":"College of Intelligence and Computing, Tianjin 
University, Tianjin, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0009-0000-9183-1173","authenticated-orcid":false,"given":"Zhaoyuan","family":"Zhang","sequence":"additional","affiliation":[{"name":"College of Intelligence and Computing, Tianjin University, Tianjin, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-3931-3886","authenticated-orcid":false,"given":"Xiaowang","family":"Zhang","sequence":"additional","affiliation":[{"name":"College of Intelligence and Computing, Tianjin University, Tianjin, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-8158-7453","authenticated-orcid":false,"given":"Zhiyong","family":"Feng","sequence":"additional","affiliation":[{"name":"College of Intelligence and Computing, Tianjin University, Tianjin, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-0081-1703","authenticated-orcid":false,"given":"Zhiqiang","family":"Zhuang","sequence":"additional","affiliation":[{"name":"College of Intelligence and Computing, Tianjin University, Tianjin, China"}],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"320","published-online":{"date-parts":[[2025,7,3]]},"reference":[{"key":"e_1_3_2_2_2","unstructured":"AutoGPT. 2023. AutoGPT. Retrieved from https:\/\/autogpt.net\/"},{"key":"e_1_3_2_3_2","doi-asserted-by":"crossref","unstructured":"Yejin Bang Samuel Cahyawijaya Nayeon Lee Wenliang Dai Dan Su Bryan Wilie Holy Lovenia Ziwei Ji Tiezheng Yu Willy Chung Quyet V. Do Yan Xu and Pascale Fung. 2023. A multitask multilingual multimodal evaluation of chatGPT on reasoning hallucination and interactivity. arXiv:2302.04023. 
Retrieved from https:\/\/arxiv.org\/abs\/2302.04023","DOI":"10.18653\/v1\/2023.ijcnlp-main.45"},{"key":"e_1_3_2_4_2","doi-asserted-by":"publisher","DOI":"10.1109\/MiSE.2017.1"},{"key":"e_1_3_2_5_2","doi-asserted-by":"publisher","DOI":"10.1145\/3540250.3558914"},{"key":"e_1_3_2_6_2","doi-asserted-by":"publisher","DOI":"10.1145\/3540250.3558914"},{"key":"e_1_3_2_7_2","unstructured":"S\u00e9bastien Bubeck Varun Chandrasekaran Ronen Eldan Johannes Gehrke Eric Horvitz Ece Kamar Peter Lee Yin Tat Lee Yuanzhi Li Scott M. Lundberg Harsha Nori Hamid Palangi Marco T\u00falio Ribeiro and Yi Zhang. 2023. Sparks of artificial general intelligence: Early experiments with GPT-4. arXiv:2303.12712. Retrieved from https:\/\/arxiv.org\/abs\/2303.12712"},{"key":"e_1_3_2_8_2","unstructured":"Mark Chen Jerry Tworek Heewoo Jun Qiming Yuan Henrique Ponde de Oliveira Pinto Jared Kaplan Harri Edwards Yuri Burda Nicholas Joseph Greg Brockman et al. 2021. Evaluating large language models trained on code. arXiv:2107.03374. Retrieved from https:\/\/arxiv.org\/abs\/2107.03374"},{"key":"e_1_3_2_9_2","doi-asserted-by":"publisher","DOI":"10.1145\/3638247"},{"key":"e_1_3_2_10_2","unstructured":"GitHub Copilot. 2024. GitHub Copilot Your AI Pair Programmer. Retrieved from https:\/\/github.com\/features\/copilot"},{"key":"e_1_3_2_11_2","unstructured":"Ian Dees Matt Wynne and Aslak Hellesoy. 2013. Cucumber Recipes: Automate Anything with BDD Tools and Techniques. Pragmatic Bookshelf. Retrieved from https:\/\/dl.acm.org\/doi\/book\/10.5555\/2509724"},{"key":"e_1_3_2_12_2","doi-asserted-by":"publisher","DOI":"10.18653\/V1\/P16-1004"},{"key":"e_1_3_2_13_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2020.findings-emnlp.139"},{"key":"e_1_3_2_14_2","unstructured":"Snippet Generator. 2023. Python Snippet Generator. Retrieved from https:\/\/huggingface.co\/spaces\/nullzero-live\/python-project-generator"},{"key":"e_1_3_2_15_2","unstructured":"GitHub 2023. Gpt-Engineer. 
Retrieved from https:\/\/github.com\/gpt-engineer-org\/gpt-engineer"},{"key":"e_1_3_2_16_2","doi-asserted-by":"publisher","DOI":"10.1145\/3546943"},{"key":"e_1_3_2_17_2","unstructured":"Sirui Hong Yizhang Lin Bang Liu Bangbang Liu Binhao Wu Danyang Li Jiaqi Chen Jiayi Zhang Jinlin Wang Li Zhang et al. 2024. Data interpreter: An LLM agent for data science. arXiv:2402.18679. Retrieved from https:\/\/arxiv.org\/abs\/2402.18679"},{"key":"e_1_3_2_18_2","unstructured":"Sirui Hong Xiawu Zheng Jonathan Chen Yuheng Cheng Jinlin Wang Ceyao Zhang Zili Wang Steven Ka Shing Yau Zijuan Lin Liyang Zhou Chenyu Ran Lingfeng Xiao and Chenglin Wu. 2023. MetaGPT: Meta programming for multi-agent collaborative framework. arXiv:2308.00352. Retrieved from https:\/\/arxiv.org\/abs\/2308.00352"},{"key":"e_1_3_2_19_2","doi-asserted-by":"publisher","DOI":"10.1109\/MS.2022.3147692"},{"key":"e_1_3_2_20_2","doi-asserted-by":"publisher","DOI":"10.1201\/9781003129509"},{"key":"e_1_3_2_21_2","volume-title":"Research-Based Web Design & Usability Guidelines","author":"Leavitt Michael","year":"2006","unstructured":"Michael Leavitt and Ben Shneiderman. 2006. Research-Based Web Design & Usability Guidelines. US Department of Health and Human Services."},{"key":"e_1_3_2_22_2","unstructured":"Jia Li Yunfei Zhao Yongmin Li Ge Li and Zhi Jin. 2023. Towards enhancing in-context learning for code generation. arXiv:2303.17780. Retrieved from https:\/\/arxiv.org\/abs\/2303.17780"},{"key":"e_1_3_2_23_2","doi-asserted-by":"publisher","DOI":"10.1145\/3643674"},{"key":"e_1_3_2_24_2","doi-asserted-by":"publisher","DOI":"10.1109\/AERO.2014.6836450"},{"key":"e_1_3_2_25_2","unstructured":"Shuai Lu Daya Guo Shuo Ren Junjie Huang Alexey Svyatkovskiy Ambrosio Blanco Colin B. Clement Dawn Drain Daxin Jiang Duyu Tang et al. 2021. CodeXGLUE: A machine learning benchmark dataset for code understanding and generation. arXiv:2102.04664. 
Retrieved from https:\/\/arxiv.org\/abs\/2102.04664"},{"key":"e_1_3_2_26_2","doi-asserted-by":"publisher","DOI":"10.1145\/3551349.3559544"},{"key":"e_1_3_2_27_2","doi-asserted-by":"publisher","DOI":"10.1016\/J.INFSOF.2009.04.001"},{"key":"e_1_3_2_28_2","doi-asserted-by":"publisher","DOI":"10.3390\/computers9030056"},{"key":"e_1_3_2_29_2","doi-asserted-by":"publisher","DOI":"10.1145\/3586183.3606763"},{"key":"e_1_3_2_30_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-031-22057-9_3"},{"key":"e_1_3_2_31_2","unstructured":"Chen Qian Xin Cong Cheng Yang Weize Chen Yusheng Su Juyuan Xu Zhiyuan Liu and Maosong Sun. 2023. Communicative agents for software development. arXiv:2307.07924. Retrieved from https:\/\/arxiv.org\/abs\/2307.07924"},{"key":"e_1_3_2_32_2","unstructured":"Chen Qian Yufan Dang Jiahao Li Wei Liu Weize Chen Cheng Yang Zhiyuan Liu and Maosong Sun. 2023. Experiential co-learning of software-developing agents. arXiv:2312.17025. Retrieved from https:\/\/arxiv.org\/abs\/2312.17025"},{"key":"e_1_3_2_33_2","unstructured":"Chen Qian Jiahao Li Yufan Dang Wei Liu YiFei Wang Zihao Xie Weize Chen Cheng Yang Yingli Zhang Zhiyuan Liu and Maosong Sun. 2024. Iterative experience refinement of software-developing agents. arXiv:2405.04219. Retrieved from https:\/\/arxiv.org\/abs\/2405.04219"},{"key":"e_1_3_2_34_2","unstructured":"Chen Qian Zihao Xie Yifei Wang Wei Liu Yufan Dang Zhuoyun Du Weize Chen Cheng Yang Zhiyuan Liu and Maosong Sun. 2024. Scaling large-language-model-based multi-agent collaboration. arXiv:2406.07155. Retrieved from https:\/\/arxiv.org\/abs\/2406.07155"},{"key":"e_1_3_2_35_2","doi-asserted-by":"publisher","DOI":"10.1145\/3641399.3641403"},{"key":"e_1_3_2_36_2","unstructured":"Shuo Ren Daya Guo Shuai Lu Long Zhou Shujie Liu Duyu Tang Neel Sundaresan Ming Zhou Ambrosio Blanco and Shuai Ma. 2020. CodeBLEU: A method for automatic evaluation of code synthesis. arXiv:2009.10297. 
Retrieved from https:\/\/arxiv.org\/abs\/2009.10297"},{"key":"e_1_3_2_37_2","doi-asserted-by":"publisher","DOI":"10.1109\/ASE56229.2023.00143"},{"key":"e_1_3_2_38_2","doi-asserted-by":"publisher","DOI":"10.1145\/3617169"},{"key":"e_1_3_2_39_2","doi-asserted-by":"publisher","DOI":"10.13140\/RG.2.1.2815.0245"},{"key":"e_1_3_2_40_2","doi-asserted-by":"publisher","DOI":"10.1145\/3540250.3558965"},{"key":"e_1_3_2_41_2","doi-asserted-by":"publisher","DOI":"10.1007\/S00766-017-0279-5"},{"key":"e_1_3_2_42_2","unstructured":"Daniel Tang Zhenghan Chen Kisub Kim Yewei Song Haoye Tian Saad Ezzini Yongfeng Huang and Jacques Klein Tegawende F. Bissyande. 2024. Collaborative agents for software engineering. arXiv:2402.02172. Retrieved from https:\/\/arxiv.org\/abs\/2402.02172"},{"key":"e_1_3_2_43_2","doi-asserted-by":"publisher","DOI":"10.1109\/FIT57066.2022.00067"},{"key":"e_1_3_2_44_2","doi-asserted-by":"publisher","DOI":"10.1109\/MedAI59581.2023.00044"},{"key":"e_1_3_2_45_2","doi-asserted-by":"publisher","DOI":"10.1145\/3551349.3559549"},{"key":"e_1_3_2_46_2","doi-asserted-by":"publisher","DOI":"10.18653\/V1\/2022.FINDINGS-ACL.2"},{"key":"e_1_3_2_47_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2021.emnlp-main.685"},{"key":"e_1_3_2_48_2","first-page":"24824","volume-title":"Advances in Neural Information Processing Systems","volume":"35","author":"Wei Jason","year":"2022","unstructured":"Jason Wei, Xuezhi Wang, Dale Schuurmans, Maarten Bosma, Brian Ichter, Fei Xia, Ed H. Chi, Quoc V. Le, and Denny Zhou. 2022. Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35 (2022), 24824\u201324837. 
Retrieved from https:\/\/proceedings.neurips.cc\/paper_files\/paper\/2022\/file\/9d5609613524ecf4f15af0f7b31abca4-Paper-Conference.pdf"},{"key":"e_1_3_2_49_2","doi-asserted-by":"publisher","DOI":"10.1145\/3491102.3517582"},{"key":"e_1_3_2_50_2","doi-asserted-by":"publisher","DOI":"10.1145\/3487569"},{"key":"e_1_3_2_51_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICSE48619.2023.00167"},{"key":"e_1_3_2_52_2","doi-asserted-by":"publisher","DOI":"10.18653\/V1\/D18-2002"},{"key":"e_1_3_2_53_2","doi-asserted-by":"publisher","DOI":"10.3390\/math11020332"},{"key":"e_1_3_2_54_2","doi-asserted-by":"crossref","unstructured":"Qinkai Zheng Xiao Xia Xu Zou Yuxiao Dong Shan Wang Yufei Xue Zihan Wang Lei Shen Andi Wang Yang Li Teng Su Zhilin Yang and Jie Tang. 2023. CodeGeeX: A pre-trained model for code generation with multilingual evaluations on humaneval-X. arXiv:2303.17568. Retrieved from https:\/\/arxiv.org\/abs\/2303.17568","DOI":"10.1145\/3580305.3599790"}],"container-title":["ACM Transactions on Software Engineering and 
Methodology"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3702987","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,7,4]],"date-time":"2025-07-04T06:37:46Z","timestamp":1751611066000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3702987"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,7,3]]},"references-count":53,"journal-issue":{"issue":"6","published-print":{"date-parts":[[2025,7,31]]}},"alternative-id":["10.1145\/3702987"],"URL":"https:\/\/doi.org\/10.1145\/3702987","relation":{},"ISSN":["1049-331X","1557-7392"],"issn-type":[{"value":"1049-331X","type":"print"},{"value":"1557-7392","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,7,3]]},"assertion":[{"value":"2024-03-13","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2024-10-05","order":2,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2025-07-03","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}