{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,6,15]],"date-time":"2026-06-15T11:25:14Z","timestamp":1781522714036,"version":"3.54.1"},"reference-count":40,"publisher":"Springer Science and Business Media LLC","issue":"8","license":[{"start":{"date-parts":[[2025,1,13]],"date-time":"2025-01-13T00:00:00Z","timestamp":1736726400000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/www.springernature.com\/gp\/researchers\/text-and-data-mining"},{"start":{"date-parts":[[2025,1,13]],"date-time":"2025-01-13T00:00:00Z","timestamp":1736726400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.springernature.com\/gp\/researchers\/text-and-data-mining"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Front. Comput. Sci."],"published-print":{"date-parts":[[2025,8]]},"DOI":"10.1007\/s11704-024-40415-9","type":"journal-article","created":{"date-parts":[[2025,1,13]],"date-time":"2025-01-13T02:14:04Z","timestamp":1736734444000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":8,"title":["Top Pass: improve code generation by pass@k-maximized code ranking"],"prefix":"10.1007","volume":"19","author":[{"given":"Zhicun","family":"Lyu","sequence":"first","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Xinye","family":"Li","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Zheng","family":"Xie","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Ming","family":"Li","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"297","published-online":{"date-parts":[[2025,1,13]]},"reference":[{"key":"40415_CR1","unstructured":"Li R, Allal L B, Zi Y, Muennighoff N, Kocetkov D, et al. 
StarCoder: may the source be with you! 2023, arXiv preprint arXiv: 2305.06161"},{"key":"40415_CR2","volume-title":"Proceedings of the 11th International Conference on Learning Representations","author":"E Nijkamp","year":"2023","unstructured":"Nijkamp E, Pang B, Hayashi H, Tu L, Wang H, Zhou Y, Savarese S, Xiong C. CodeGen: an open large language model for code with multi-turn program synthesis. In: Proceedings of the 11th International Conference on Learning Representations. 2023"},{"key":"40415_CR3","unstructured":"Rozi\u00e8re B, Gehring J, Gloeckle F, Sootla S, Gat I, Tan X E, Adi Y, Liu J, Sauvestre R, Remez T, Rapin J, Kozhevnikov A, Evtimov I, Bitton J, Bhatt M, Ferrer C C, Grattafiori A, Xiong W, D\u00e9fossez A, Copet J, Azhar F, Touvron H, Martin L, Usunier N, Scialom T, Synnaeve G. Code LLaMa: open foundation models for code. 2024, arXiv preprint arXiv: 2308.12950"},{"issue":"6","key":"40415_CR4","doi-asserted-by":"publisher","first-page":"176214","DOI":"10.1007\/s11704-022-2313-0","volume":"17","author":"Y Hu","year":"2023","unstructured":"Hu Y, Jiang H, Hu Z. Measuring code maintainability with deep neural networks. Frontiers of Computer Science, 2023, 17(6): 176214","journal-title":"Frontiers of Computer Science"},{"key":"40415_CR5","volume-title":"Proceedings of the 11th International Conference on Learning Representations","author":"B Chen","year":"2023","unstructured":"Chen B, Zhang F, Nguyen A, Zan D, Lin Z, Lou J G, Chen W. CodeT: code generation with generated tests. In: Proceedings of the 11th International Conference on Learning Representations. 2023"},{"key":"40415_CR6","first-page":"54769","volume-title":"Proceedings of the 37th Conference on Neural Information Processing Systems","author":"K Zhang","year":"2023","unstructured":"Zhang K, Wang D, Xia J, Wang W Y, Li L. ALGO: synthesizing algorithmic programs with LLM-generated oracle verifiers. In: Proceedings of the 37th Conference on Neural Information Processing Systems. 
2023, 54769\u201354784"},{"key":"40415_CR7","unstructured":"Chen M, Tworek J, Jun H, Yuan Q, de Oliveira Pinto H P, et al. Evaluating large language models trained on code. 2021, arXiv preprint arXiv: 2107.03374"},{"key":"40415_CR8","first-page":"13419","volume-title":"Proceedings of the 36th Conference on Neural Information Processing Systems","author":"J P Inala","year":"2022","unstructured":"Inala J P, Wang C, Yang M, Codas A, Encarnaci\u00f3n M, Lahiri S K, Musuvathi M, Gao J. Fault-aware neural code rankers. In: Proceedings of the 36th Conference on Neural Information Processing Systems. 2022, 13419\u201313432"},{"key":"40415_CR9","first-page":"6000","volume-title":"Proceedings of the 31st International Conference on Neural Information Processing Systems","author":"A Vaswani","year":"2017","unstructured":"Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez A N, Kaiser L, Polosukhin I. Attention is all you need. In: Proceedings of the 31st International Conference on Neural Information Processing Systems. 2017, 6000\u20136010"},{"key":"40415_CR10","volume-title":"Improving language understanding by generative pre-training","author":"A Radford","year":"2018","unstructured":"Radford A, Narasimhan K, Salimans T, Sutskever I. Improving language understanding by generative pre-training. 2018"},{"key":"40415_CR11","volume-title":"GPT-Neo: large scale autoregressive language modeling with mesh-tensorflow","author":"S Black","year":"2021","unstructured":"Black S, Leo G, Wang P, Leahy C, Biderman S. GPT-Neo: large scale autoregressive language modeling with mesh-tensorflow. 2021"},{"key":"40415_CR12","unstructured":"Achiam J, Adler S, Agarwal S, Ahmad L, Akkaya I, et al. GPT-4 technical report. 2024, arXiv preprint arXiv: 2303.08774"},{"key":"40415_CR13","unstructured":"Anil R, Dai A M, Firat O, Johnson M, Lepikhin D, et al. PaLM 2 technical report. 
2023, arXiv preprint arXiv: 2305.10403"},{"issue":"1","key":"40415_CR14","first-page":"240","volume":"24","author":"A Chowdhery","year":"2024","unstructured":"Chowdhery A, Narang S, Devlin J, Bosma M, Mishra G, et al. PaLM: scaling language modeling with pathways. The Journal of Machine Learning Research, 2024, 24(1): 240","journal-title":"The Journal of Machine Learning Research"},{"key":"40415_CR15","doi-asserted-by":"publisher","first-page":"8696","DOI":"10.18653\/v1\/2021.emnlp-main.685","volume-title":"Proceedings of 2021 Conference on Empirical Methods in Natural Language Processing","author":"Y Wang","year":"2021","unstructured":"Wang Y, Wang W, Joty S, Hoi S C H. CodeT5: identifier-aware unified pre-trained encoder-decoder models for code understanding and generation. In: Proceedings of 2021 Conference on Empirical Methods in Natural Language Processing. 2021, 8696\u20138708"},{"issue":"1","key":"40415_CR16","first-page":"140","volume":"21","author":"C Raffel","year":"2020","unstructured":"Raffel C, Shazeer N, Roberts A, Lee K, Narang S, Matena M, Zhou Y, Li W, Liu P J. Exploring the limits of transfer learning with a unified text-to-text transformer. The Journal of Machine Learning Research, 2020, 21(1): 140","journal-title":"The Journal of Machine Learning Research"},{"issue":"6624","key":"40415_CR17","doi-asserted-by":"publisher","first-page":"1092","DOI":"10.1126\/science.abq1158","volume":"378","author":"Y Li","year":"2022","unstructured":"Li Y, Choi D, Chung J, Kushman N, Schrittwieser J, Leblond R, Eccles T, Keeling J, Gimeno F, Dal Lago A, Hubert T, Choy P, de Masson d\u2019Autume C, Babuschkin I, Chen X, Huang P S, Welbl J, Gowal S, Cherepanov A, Molloy J, Mankowitz D J, Sutherland Robson E, Kohli P, de Freitas N, Kavukcuoglu K, Vinyals O. Competition-level code generation with AlphaCode. 
Science, 2022, 378(6624): 1092\u20131097","journal-title":"Science"},{"key":"40415_CR18","volume-title":"Proceedings of the 12th International Conference on Learning Representations","author":"Z Luo","year":"2024","unstructured":"Luo Z, Xu C, Zhao P, Sun Q, Geng X, Hu W, Tao C, Ma J, Lin Q, Jiang D. WizardCoder: Empowering code large language models with Evol-Instruct. In: Proceedings of the 12th International Conference on Learning Representations. 2024"},{"key":"40415_CR19","unstructured":"Gunasekar S, Zhang Y, Aneja J, Mendes C C T, Del Giorno A, Gopi S, Javaheripi M, Kauffmann P, de Rosa G, Saarikivi O, Salim A, Shah S, Behl H S, Wang X, Bubeck S, Eldan R, Kalai A T, Lee Y T, Li Y. Textbooks are all you need. 2023, arXiv preprint arXiv: 2306.11644"},{"key":"40415_CR20","unstructured":"Bi X, Chen D, Chen G, Chen S, Dai D, et al. DeepSeek LLM: scaling open-source language models with longtermism. 2024, arXiv preprint arXiv: 2401.02954"},{"key":"40415_CR21","doi-asserted-by":"publisher","first-page":"5673","DOI":"10.1145\/3580305.3599790","volume-title":"Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining","author":"Q Zheng","year":"2023","unstructured":"Zheng Q, Xia X, Zou X, Dong Y, Wang S, Xue Y, Shen L, Wang Z, Wang A, Li Y, Su T, Yang Z, Tang J. CodeGeeX: a pre-trained model for code generation with multilingual benchmarking on HumanEval-X. In: Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining. 2023, 5673\u20135684"},{"key":"40415_CR22","volume-title":"Proceedings of the 11th International Conference on Learning Representations","author":"D Fried","year":"2023","unstructured":"Fried D, Aghajanyan A, Lin J, Wang S, Wallace E, Shi F, Zhong R, Yih S, Zettlemoyer L, Lewis M. InCoder: a generative model for code infilling and synthesis. In: Proceedings of the 11th International Conference on Learning Representations. 
2023"},{"key":"40415_CR23","volume-title":"Proceedings of the 12th International Conference on Learning Representations","author":"X Chen","year":"2024","unstructured":"Chen X, Lin M, Sch\u00e4rli N, Zhou D. Teaching large language models to self-debug. In: Proceedings of the 12th International Conference on Learning Representations. 2024"},{"key":"40415_CR24","first-page":"943","volume-title":"Proceedings of the 37th International Conference on Neural Information Processing Systems","author":"J Liu","year":"2024","unstructured":"Liu J, Xia C S, Wang Y, Zhang L. Is your code generated by ChatGPT really correct? Rigorous evaluation of large language models for code generation. In: Proceedings of the 37th International Conference on Neural Information Processing Systems. 2024, 943"},{"key":"40415_CR25","doi-asserted-by":"publisher","first-page":"423","DOI":"10.1145\/3597926.3598067","volume-title":"Proceedings of the 32nd ACM SIGSOFT International Symposium on Software Testing and Analysis","author":"Y Deng","year":"2023","unstructured":"Deng Y, Xia C S, Peng H, Yang C, Zhang L. Large language models are zero-shot fuzzers: Fuzzing deep-learning libraries via large language models. In: Proceedings of the 32nd ACM SIGSOFT International Symposium on Software Testing and Analysis. 2023, 423\u2013435"},{"key":"40415_CR26","first-page":"261","volume-title":"Proceedings of the 27th IEEE International Conference on Software Analysis, Evolution and Reengineering","author":"W Wang","year":"2020","unstructured":"Wang W, Li G, Ma B, Xia X, Jin Z. Detecting code clones with graph neural network and flow-augmented abstract syntax tree. In: Proceedings of the 27th IEEE International Conference on Software Analysis, Evolution and Reengineering. 2020, 261\u2013271"},{"key":"40415_CR27","first-page":"483","volume-title":"Proceedings of 2021 IEEE International Conference on Software Maintenance and Evolution","author":"J Gu","year":"2021","unstructured":"Gu J, Chen Z, Monperrus M. 
Multimodal representation for neural code search. In: Proceedings of 2021 IEEE International Conference on Software Maintenance and Evolution. 2021, 483\u2013494"},{"key":"40415_CR28","first-page":"761","volume-title":"Proceedings of the 36th International Conference on Neural Information Processing Systems","author":"S Arakelyan","year":"2024","unstructured":"Arakelyan S, Hakhverdyan A, Allamanis M, Garcia L, Hauser C, Ren X. NS3: neuro-symbolic semantic code search. In: Proceedings of the 36th International Conference on Neural Information Processing Systems. 2024, 761"},{"issue":"5","key":"40415_CR29","doi-asserted-by":"publisher","first-page":"185207","DOI":"10.1007\/s11704-023-2771-z","volume":"18","author":"Z Li","year":"2024","unstructured":"Li Z, Pan M, Pei Y, Zhang T, Wang L, Li X. Empirically revisiting and enhancing automatic classification of bug and non-bug issues. Frontiers of Computer Science, 2024, 18(5): 185207","journal-title":"Frontiers of Computer Science"},{"key":"40415_CR30","first-page":"474","volume-title":"Proceedings of the 37th International Conference on Machine Learning","author":"A Kanade","year":"2020","unstructured":"Kanade A, Maniatis P, Balakrishnan G, Shi K. Learning and evaluating contextual embedding of source code. In: Proceedings of the 37th International Conference on Machine Learning. 2020, 474"},{"key":"40415_CR31","doi-asserted-by":"publisher","first-page":"1536","DOI":"10.18653\/v1\/2020.findings-emnlp.139","volume-title":"Findings of the Association for Computational Linguistics: EMNLP 2020","author":"Z Feng","year":"2020","unstructured":"Feng Z, Guo D, Tang D, Duan N, Feng X, Gong M, Shou L, Qin B, Liu T, Jiang D, Zhou M. CodeBERT: a pre-trained model for programming and natural languages. In: Findings of the Association for Computational Linguistics: EMNLP 2020. 
2020, 1536\u20131547"},{"key":"40415_CR32","volume-title":"Proceedings of the 9th International Conference on Learning Representations","author":"D Guo","year":"2021","unstructured":"Guo D, Ren S, Lu S, Feng Z, Tang D, Liu S, Zhou L, Duan N, Svyatkovskiy A, Fu S, Tufano M, Deng S K, Clement C B, Drain D, Sundaresan N, Yin J, Jiang D, Zhou M. GraphCodeBERT: pre-training code representations with data flow. In: Proceedings of the 9th International Conference on Learning Representations. 2021"},{"key":"40415_CR33","first-page":"7212","volume-title":"Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics","author":"D Guo","year":"2022","unstructured":"Guo D, Lu S, Duan N, Wang Y, Zhou M, Yin J. UniXcoder: unified cross-modal pre-training for code representation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics. 2022, 7212\u20137225"},{"key":"40415_CR34","first-page":"2655","volume-title":"Proceedings of 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies","author":"W Ahmad","year":"2021","unstructured":"Ahmad W, Chakraborty S, Ray B, Chang K W. Unified pre-training for program understanding and generation. In: Proceedings of 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. 2021, 2655\u20132668"},{"key":"40415_CR35","unstructured":"Wang X, Wang Y, Mi F, Zhou P, Wan Y, Liu X, Li L, Wu H, Liu J, Jiang X. SynCoBERT: Syntax-guided multi-modal contrastive pre-training for code representation. 2021, arXiv preprint arXiv: 2108.04556"},{"key":"40415_CR36","volume-title":"Proceedings of the 8th International Conference on Learning Representations","author":"K Clark","year":"2020","unstructured":"Clark K, Luong M T, Le Q V, Manning C D. Electra: pre-training text encoders as discriminators rather than generators. 
In: Proceedings of the 8th International Conference on Learning Representations. 2020"},{"key":"40415_CR37","volume-title":"Proceedings of the 35th Conference on Neural Information Processing Systems","author":"D Hendrycks","year":"2021","unstructured":"Hendrycks D, Basart S, Kadavath S, Mazeika M, Arora A, Guo E, Burns C, Puranik S, He H, Song D, Steinhardt J. Measuring coding challenge competence with APPS. In: Proceedings of the 35th Conference on Neural Information Processing Systems. 2021"},{"key":"40415_CR38","unstructured":"Austin J, Odena A, Nye M, Bosma M, Michalewski H, Dohan D, Jiang E, Cai C, Terry M, Le Q, Sutton C. Program synthesis with large language models. 2021, arXiv preprint arXiv: 2108.07732"},{"key":"40415_CR39","volume-title":"ChatGPT: optimizing language models for dialogue","author":"OpenAI","year":"2022","unstructured":"OpenAI. ChatGPT: optimizing language models for dialogue. 2022"},{"key":"40415_CR40","first-page":"41832","volume-title":"Proceedings of the 40th International Conference on Machine Learning","author":"T Zhang","year":"2023","unstructured":"Zhang T, Yu T, Hashimoto T B, Lewis M, Yih W T, Fried D, Wang S I. Coder reviewer reranking for code generation. In: Proceedings of the 40th International Conference on Machine Learning. 
2023, 41832\u201341846"}],"container-title":["Frontiers of Computer Science"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s11704-024-40415-9.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s11704-024-40415-9\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s11704-024-40415-9.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,1,13]],"date-time":"2025-01-13T02:14:28Z","timestamp":1736734468000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s11704-024-40415-9"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,1,13]]},"references-count":40,"journal-issue":{"issue":"8","published-print":{"date-parts":[[2025,8]]}},"alternative-id":["40415"],"URL":"https:\/\/doi.org\/10.1007\/s11704-024-40415-9","relation":{},"ISSN":["2095-2228","2095-2236"],"issn-type":[{"value":"2095-2228","type":"print"},{"value":"2095-2236","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,1,13]]},"assertion":[{"value":"28 April 2024","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"25 July 2024","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"13 January 2025","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"Competing interests Ming Li is an Editorial Board member of the journal and a co-author of this article. To minimize bias, they were excluded from all editorial decision-making related to the acceptance of this article for publication. 
The remaining authors declare no conflict of interest.","order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Ethics"}}],"article-number":"198341"}}