{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,8,2]],"date-time":"2025-08-02T14:29:25Z","timestamp":1754144965572,"version":"3.41.2"},"reference-count":60,"publisher":"Association for Computing Machinery (ACM)","issue":"PLDI","funder":[{"DOI":"10.13039\/501100001459","name":"Ministry of Education - Singapore","doi-asserted-by":"publisher","award":["MOE-T2EP20124-0007"],"award-info":[{"award-number":["MOE-T2EP20124-0007"]}],"id":[{"id":"10.13039\/501100001459","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["Proc. ACM Program. Lang."],"published-print":{"date-parts":[[2025,6,10]]},"abstract":"<jats:p>Translating software between programming languages is a challenging task, for which automated techniques have been elusive and hard to scale up to larger programs. A key difficulty in cross-language translation is that one has to re-express the intended behavior of the source program into idiomatic constructs of a different target language. This task needs abstracting away from the source language-specific details, while keeping the overall functionality the same. In this work, we propose a novel and systematic approach for making such translation amenable to automation based on a framework we call program skeletons. A program skeleton retains the high-level structure of the source program by abstracting away and effectively summarizing lower-level concrete code fragments, which can be mechanically translated to the target programming language. A skeleton, by design, permits many different ways of filling in the concrete implementation for fragments, which can work in conjunction with existing data-driven code synthesizers. 
Most importantly, skeletons can conceptually enable sound decomposition, i.e., if each individual fragment is correctly translated, taken together with the mechanically translated skeleton, the final translated program is deemed to be correct as a whole. We present a prototype system called SKEL embodying the idea of skeleton-based translation from Python to JavaScript. Our results show promising scalability compared to prior works. For 9 real-world Python programs, some with more than about 1k lines of code, 95% of their code fragments can be automatically translated, while about 5% require manual effort. All the final translations are correct with respect to whole-program test suites.<\/jats:p>","DOI":"10.1145\/3729287","type":"journal-article","created":{"date-parts":[[2025,6,13]],"date-time":"2025-06-13T16:02:27Z","timestamp":1749830547000},"page":"920-944","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":0,"title":["Program Skeletons for Automated Program Translation"],"prefix":"10.1145","volume":"9","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-1444-0237","authenticated-orcid":false,"given":"Bo","family":"Wang","sequence":"first","affiliation":[{"name":"National University of Singapore, Singapore, Singapore"}]},{"ORCID":"https:\/\/orcid.org\/0009-0001-6180-9060","authenticated-orcid":false,"given":"Tianyu","family":"Li","sequence":"additional","affiliation":[{"name":"National University of Singapore, Singapore, Singapore"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-2513-1704","authenticated-orcid":false,"given":"Ruishi","family":"Li","sequence":"additional","affiliation":[{"name":"National University of Singapore, Singapore, Singapore"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-7610-0660","authenticated-orcid":false,"given":"Umang","family":"Mathur","sequence":"additional","affiliation":[{"name":"National University of Singapore, Singapore, 
Singapore"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-1875-8675","authenticated-orcid":false,"given":"Prateek","family":"Saxena","sequence":"additional","affiliation":[{"name":"National University of Singapore, Singapore, Singapore"}]}],"member":"320","published-online":{"date-parts":[[2025,6,13]]},"reference":[{"key":"e_1_2_2_1_1","doi-asserted-by":"publisher","unstructured":"2002. Chapter 15 - Microsoft Says JUMP\u2014Java User Migration Path. In C# For Java Programmers Brian Bagnall Philip Chen Stephen Goldberg Jeremy Fairdoth and Harold Cabrera (Eds.). isbn:978-1-931836-54-8 https:\/\/doi.org\/10.1016\/B978-193183654-8\/50019-0 10.1016\/B978-193183654-8\/50019-0","DOI":"10.1016\/B978-193183654-8\/50019-0"},{"key":"e_1_2_2_2_1","doi-asserted-by":"publisher","unstructured":"2024. Lost in Translation: A Study of Bugs Introduced by Large Language Models while Translating Code. https:\/\/doi.org\/10.1145\/3597503.3639226 arXiv:2308.03109 [cs] 10.1145\/3597503.3639226","DOI":"10.1145\/3597503.3639226"},{"key":"e_1_2_2_3_1","unstructured":"2025. SKEL: Program Skeletons for Automated Program Translation (GitHub Repository). https:\/\/github.com\/lty12b9b0a1\/SKEL"},{"key":"e_1_2_2_4_1","doi-asserted-by":"publisher","DOI":"10.1145\/2914770.2837628"},{"key":"e_1_2_2_5_1","doi-asserted-by":"publisher","unstructured":"Mar\u00eda Alpuente Daniel Pardo and Alicia Villanueva. 2015. Automatic inference of specifications in the K framework. arXiv preprint arXiv:1512.06941 https:\/\/doi.org\/10.4204\/EPTCS.200.1 10.4204\/EPTCS.200.1","DOI":"10.4204\/EPTCS.200.1"},{"key":"e_1_2_2_6_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-21668-3_10"},{"key":"e_1_2_2_7_1","unstructured":"Anthropic. [n. d.]. Introducing Claude 3.5 Sonnet. 
https:\/\/www.anthropic.com\/news\/claude-3-5-sonnet"},{"key":"e_1_2_2_8_1","doi-asserted-by":"publisher","DOI":"10.1109\/DSN.2018.00074"},{"key":"e_1_2_2_9_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-17524-9_1"},{"key":"e_1_2_2_10_1","doi-asserted-by":"publisher","DOI":"10.1145\/1480881.1480917"},{"key":"e_1_2_2_11_1","doi-asserted-by":"publisher","DOI":"10.1145\/3689735"},{"key":"e_1_2_2_12_1","unstructured":"Xinyun Chen Chang Liu and Dawn Song. 2018. Tree-to-tree Neural Networks for Program Translation. arxiv:1802.03691."},{"key":"e_1_2_2_13_1","first-page":"1","article-title":"Palm: Scaling language modeling with pathways","volume":"24","author":"Chowdhery Aakanksha","year":"2023","unstructured":"Aakanksha Chowdhery, Sharan Narang, Jacob Devlin, Maarten Bosma, Gaurav Mishra, Adam Roberts, Paul Barham, Hyung Won Chung, Charles Sutton, and Sebastian Gehrmann. 2023. Palm: Scaling language modeling with pathways. Journal of Machine Learning Research, 24, 240 (2023), 1\u2013113.","journal-title":"Journal of Machine Learning Research"},{"key":"e_1_2_2_14_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICSM.2013.85"},{"key":"e_1_2_2_15_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.scico.2006.04.002"},{"key":"e_1_2_2_16_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-35873-9_10"},{"key":"e_1_2_2_17_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICECCS.2019.00031"},{"key":"e_1_2_2_18_1","unstructured":"J\u00e9r\u00f4me Dohrau. 2022. Automatic Inference of Permission Specifications. Ph. D. Dissertation. ETH Zurich. https:\/\/pm.inf.ethz.ch\/publications\/Dohrau2022.pdf"},{"key":"e_1_2_2_19_1","doi-asserted-by":"publisher","DOI":"10.1145\/3660791"},{"key":"e_1_2_2_20_1","doi-asserted-by":"publisher","unstructured":"Hasan Ferit Eniser Hanliang Zhang Cristina David Meng Wang Brandon Paulsen Joey Dodds and Daniel Kroening. 2024. Towards Translating Real-World Code with LLMs: A Study of Translating to Rust. 
arXiv preprint arXiv:2405.11514 https:\/\/doi.org\/10.48550\/arXiv.2405.11514 10.48550\/arXiv.2405.11514","DOI":"10.48550\/arXiv.2405.11514"},{"key":"e_1_2_2_21_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.scico.2007.01.015"},{"key":"e_1_2_2_22_1","unstructured":"gotranspile. [n. d.]. cxgo: C to Go Translators. https:\/\/github.com\/gotranspile\/cxgo"},{"key":"e_1_2_2_23_1","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.2410.24117"},{"volume-title":"c2rust: Migrate C code to Rust","year":"2024","key":"e_1_2_2_24_1","unstructured":"Immunant. [n. d.]. c2rust: Migrate C code to Rust. https:\/\/github.com\/immunant\/c2rust Accessed: Nov 1, 2024"},{"key":"e_1_2_2_25_1","unstructured":"Anna Irrera. 2017. Banks scramble to fix old systems as IT \u2019cowboys\u2019 ride into sunset. https:\/\/www.reuters.com\/article\/us-usa-banks-cobol-idUSKBN17C0D8"},{"key":"e_1_2_2_26_1","doi-asserted-by":"publisher","DOI":"10.1145\/2908080.2908117"},{"key":"e_1_2_2_27_1","volume-title":"Unsupervised Translation of Programming Languages. CoRR, abs\/2006.03511","author":"Lachaux Marie-Anne","year":"2020","unstructured":"Marie-Anne Lachaux, Baptiste Rozi\u00e8re, Lowik Chanussot, and Guillaume Lample. 2020. Unsupervised Translation of Programming Languages. CoRR, abs\/2006.03511 (2020), arXiv:2006.03511. arxiv:2006.03511"},{"key":"e_1_2_2_28_1","unstructured":"Marie-Anne Lachaux Baptiste Roziere Lowik Chanussot and Guillaume Lample. 2020. Unsupervised Translation of Programming Languages. arxiv:2006.03511."},{"key":"e_1_2_2_29_1","doi-asserted-by":"publisher","DOI":"10.1145\/3314221.3314634"},{"key":"e_1_2_2_30_1","doi-asserted-by":"publisher","unstructured":"Nelson F. Liu Kevin Lin John Hewitt Ashwin Paranjape Michele Bevilacqua Fabio Petroni and Percy Liang. 2023. Lost in the Middle: How Language Models Use Long Contexts. https:\/\/doi.org\/10.48550\/arXiv.2307.03172 arxiv:2307.03172. 
10.48550\/arXiv.2307.03172","DOI":"10.48550\/arXiv.2307.03172"},{"key":"e_1_2_2_31_1","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.2402.19173"},{"key":"e_1_2_2_32_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICSE55347.2025.00129"},{"key":"e_1_2_2_33_1","doi-asserted-by":"publisher","DOI":"10.1007\/3-540-10235-3"},{"key":"e_1_2_2_34_1","doi-asserted-by":"publisher","DOI":"10.1145\/2556782"},{"key":"e_1_2_2_35_1","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.2203.13474"},{"key":"e_1_2_2_36_1","doi-asserted-by":"publisher","DOI":"10.1145\/2908080.2908099"},{"key":"e_1_2_2_37_1","doi-asserted-by":"publisher","unstructured":"Jialing Pan Adrien Sad\u00e9 Jin Kim Eric Soriano Guillem Sole and Sylvain Flamant. 2023. SteloCoder: a Decoder-Only LLM for Multi-Language to Python Code Translation. arXiv preprint arXiv:2310.15539 https:\/\/doi.org\/10.48550\/arXiv.2310.15539 10.48550\/arXiv.2310.15539","DOI":"10.48550\/arXiv.2310.15539"},{"key":"e_1_2_2_38_1","doi-asserted-by":"publisher","DOI":"10.1145\/3597503.3639226"},{"key":"e_1_2_2_39_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICSE.2012.6227137"},{"key":"e_1_2_2_40_1","unstructured":"Terence Parr. 2024. StringTemplate. https:\/\/www.stringtemplate.org\/"},{"key":"e_1_2_2_41_1","doi-asserted-by":"publisher","DOI":"10.1145\/2858965.2814310"},{"volume-title":"Transcrypt: Python to JavaScript compiler. https:\/\/github.com\/qquick\/Transcrypt","year":"2022","key":"e_1_2_2_42_1","unstructured":"QualityQuick. 2022. Transcrypt: Python to JavaScript compiler. https:\/\/github.com\/qquick\/Transcrypt"},{"key":"e_1_2_2_43_1","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.2308.12950"},{"key":"e_1_2_2_44_1","doi-asserted-by":"publisher","unstructured":"Baptiste Roziere Jie M Zhang Francois Charton Mark Harman Gabriel Synnaeve and Guillaume Lample. 2021. Leveraging automated unit tests for unsupervised code translation. 
arXiv preprint arXiv:2110.06773 https:\/\/doi.org\/10.48550\/arXiv.2110.06773 10.48550\/arXiv.2110.06773","DOI":"10.48550\/arXiv.2110.06773"},{"key":"e_1_2_2_45_1","doi-asserted-by":"publisher","DOI":"10.1145\/1375581.1375599"},{"key":"e_1_2_2_46_1","doi-asserted-by":"publisher","DOI":"10.1145\/3656413"},{"key":"e_1_2_2_47_1","volume-title":"USENIX Security Symposium. 379\u2013394","author":"Tan Lin","year":"2008","unstructured":"Lin Tan, Xiaolan Zhang, Xiao Ma, Weiwei Xiong, and Yuanyuan Zhou. 2008. AutoISES: Automatically Inferring Security Specification and Detecting Violations.. In USENIX Security Symposium. 379\u2013394. https:\/\/www.usenix.org\/legacy\/events\/sec08\/tech\/full_papers\/tan_l\/tan_l.pdf"},{"key":"e_1_2_2_48_1","doi-asserted-by":"publisher","DOI":"10.1109\/52.895180"},{"key":"e_1_2_2_49_1","doi-asserted-by":"publisher","unstructured":"Vasudev Vikram Caroline Lemieux Joshua Sunshine and Rohan Padhye. 2023. Can large language models write good property-based tests? arXiv preprint arXiv:2307.04346 https:\/\/doi.org\/10.48550\/arXiv.2307.04346 10.48550\/arXiv.2307.04346","DOI":"10.48550\/arXiv.2307.04346"},{"key":"e_1_2_2_50_1","doi-asserted-by":"publisher","DOI":"10.1145\/3586034"},{"key":"e_1_2_2_51_1","doi-asserted-by":"publisher","DOI":"10.1145\/3611643.3616322"},{"key":"e_1_2_2_52_1","doi-asserted-by":"publisher","unstructured":"Bo Wang Tianyu Li Ruishi Li Umang Mathur and Prateek Saxena. 2025. Program Skeletons for Automated Program Translation (Artifact). https:\/\/doi.org\/10.5281\/zenodo.14994890 10.5281\/zenodo.14994890","DOI":"10.5281\/zenodo.14994890"},{"key":"e_1_2_2_53_1","doi-asserted-by":"publisher","unstructured":"Bo Wang Tianyu Li Ruishi Li Umang Mathur and Prateek Saxena. 2025. Program Skeletons for Automated Program Translation (Technical Report). 
arXiv preprint arXiv:2504.07483 https:\/\/doi.org\/10.48550\/arXiv.2504.07483 10.48550\/arXiv.2504.07483","DOI":"10.48550\/arXiv.2504.07483"},{"key":"e_1_2_2_54_1","doi-asserted-by":"publisher","DOI":"10.1145\/3397481.3450656"},{"key":"e_1_2_2_55_1","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.2404.18852"},{"key":"e_1_2_2_56_1","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.2407.07472"},{"key":"e_1_2_2_57_1","doi-asserted-by":"publisher","DOI":"10.1145\/3368089.3409716"},{"key":"e_1_2_2_58_1","doi-asserted-by":"publisher","unstructured":"Xing Zhang Jiaheng Wen Fangkai Yang Pu Zhao Yu Kang Junhao Wang Maoquan Wang Yufan Huang Elsie Nallipogu Qingwei Lin Yingnong Dang Saravan Rajmohan Dongmei Zhang and Qi Zhang. 2025. Skeleton-Guided-Translation: A Benchmarking Framework for Code Repository Translation with Fine-Grained Quality Evaluation. https:\/\/doi.org\/10.48550\/arXiv.2501.16050 arxiv:2501.16050. 10.48550\/arXiv.2501.16050","DOI":"10.48550\/arXiv.2501.16050"},{"key":"e_1_2_2_59_1","doi-asserted-by":"publisher","unstructured":"Zibin Zheng Kaiwen Ning Yanlin Wang Jingwen Zhang Dewu Zheng Mingxi Ye and Jiachi Chen. 2023. A survey of large language models for code: Evolution benchmarking and future trends. arXiv preprint arXiv:2311.10372 https:\/\/doi.org\/10.48550\/arXiv.2311.10372 10.48550\/arXiv.2311.10372","DOI":"10.48550\/arXiv.2311.10372"},{"key":"e_1_2_2_60_1","doi-asserted-by":"publisher","unstructured":"Qihao Zhu Daya Guo Zhihong Shao Dejian Yang Peiyi Wang Runxin Xu Y Wu Yukun Li Huazuo Gao and Shirong Ma. 2024. Deepseek-coder-v2: Breaking the barrier of closed-source models in code intelligence. 
arXiv preprint arXiv:2406.11931 https:\/\/doi.org\/10.48550\/arXiv.2406.11931 10.48550\/arXiv.2406.11931","DOI":"10.48550\/arXiv.2406.11931"}],"container-title":["Proceedings of the ACM on Programming Languages"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3729287","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,7,16]],"date-time":"2025-07-16T06:06:06Z","timestamp":1752645966000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3729287"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,6,10]]},"references-count":60,"journal-issue":{"issue":"PLDI","published-print":{"date-parts":[[2025,6,10]]}},"alternative-id":["10.1145\/3729287"],"URL":"https:\/\/doi.org\/10.1145\/3729287","relation":{},"ISSN":["2475-1421"],"issn-type":[{"type":"electronic","value":"2475-1421"}],"subject":[],"published":{"date-parts":[[2025,6,10]]},"assertion":[{"value":"2024-11-15","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2025-03-06","order":2,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2025-06-13","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}