{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,26]],"date-time":"2026-03-26T06:53:41Z","timestamp":1774508021193,"version":"3.50.1"},"reference-count":35,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2023,12,6]],"date-time":"2023-12-06T00:00:00Z","timestamp":1701820800000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2023,12,6]],"date-time":"2023-12-06T00:00:00Z","timestamp":1701820800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/501100000781","name":"European Research Council","doi-asserted-by":"publisher","award":["741278"],"award-info":[{"award-number":["741278"]}],"id":[{"id":"10.13039\/501100000781","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Empir Software Eng"],"published-print":{"date-parts":[[2024,1]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Source-to-source code translation automatically translates a program from one programming language to another. The existing research on code translation evaluates the effectiveness of their approaches by using either syntactic similarities (e.g., BLEU score), or test execution results. The former does not consider semantics, the latter considers semantics but falls short on the problem of insufficient data and tests. In this paper, we propose <jats:bold>MBTA<\/jats:bold> (<jats:bold>M<\/jats:bold>utation-<jats:bold>b<\/jats:bold>ased Code <jats:bold>T<\/jats:bold>ranslation <jats:bold>A<\/jats:bold>nalysis), a novel application of mutation analysis for code translation assessment. We also introduce <jats:bold>MTS<\/jats:bold> (<jats:bold>M<\/jats:bold>utation-based <jats:bold>T<\/jats:bold>ranslation <jats:bold>S<\/jats:bold>core), a measure to compute the level of trustworthiness of a translator. If a mutant of an input program shows different test execution results from its translated version, the mutant is killed and a translation bug is revealed. Fewer killed mutants indicate better code translation. MBTA is novel in the sense that mutants are compared to their translated counterparts, and not to their original program\u2019s translation. We conduct a proof-of-concept case study with 612 Java-Python program pairs and 75,082 mutants on the code translators TransCoder and j2py to evaluate the feasibility of MBTA. The results reveal that TransCoder and j2py fail to translate 70.44% and 70.64% of the mutants, respectively, i.e., more than two-thirds of all mutants are incorrectly translated by these translators. By analysing the MTS results more closely, we were able to reveal translation bugs not captured by the conventional comparison between the original and translated programs.<\/jats:p>","DOI":"10.1007\/s10664-023-10385-w","type":"journal-article","created":{"date-parts":[[2023,12,6]],"date-time":"2023-12-06T12:02:31Z","timestamp":1701864151000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":7,"title":["Mutation analysis for evaluating code translation"],"prefix":"10.1007","volume":"29","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-5361-2973","authenticated-orcid":false,"given":"Giovani","family":"Guizzo","sequence":"first","affiliation":[]},{"given":"Jie M.","family":"Zhang","sequence":"additional","affiliation":[]},{"given":"Federica","family":"Sarro","sequence":"additional","affiliation":[]},{"given":"Christoph","family":"Treude","sequence":"additional","affiliation":[]},{"given":"Mark","family":"Harman","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2023,12,6]]},"reference":[{"key":"10385_CR1","doi-asserted-by":"publisher","unstructured":"Aggarwal K, Salameh M, Hindle A (2015) Using machine translation for converting Python 2 to Python 3 code. PeerJ PrePrints 3:e1459v1. https:\/\/doi.org\/10.7287\/peerj.preprints.1459v1","DOI":"10.7287\/peerj.preprints.1459v1"},{"issue":"5","key":"10385_CR2","doi-asserted-by":"publisher","first-page":"507","DOI":"10.1109\/TSE.2014.2372785","volume":"41","author":"ET Barr","year":"2015","unstructured":"Barr ET, Harman M, McMinn P, Shahbaz M, Yoo S (2015) The oracle problem in software testing: a survey. IEEE Trans Softw Eng 41(5):507\u2013525. https:\/\/doi.org\/10.1109\/TSE.2014.2372785","journal-title":"IEEE Trans Softw Eng"},{"key":"10385_CR3","doi-asserted-by":"crossref","unstructured":"Chen Q, Zhou M (2018) A neural framework for retrieval and summarization of source code. In: 2018 33rd IEEE\/ACM international conference on automated software engineering (ASE), IEEE, pp 826\u2013831","DOI":"10.1145\/3238147.3240471"},{"issue":"1","key":"10385_CR4","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1145\/3143561","volume":"51","author":"TY Chen","year":"2018","unstructured":"Chen TY, Kuo FC, Liu H, Poon PL, Towey D, Tse T, Zhou ZQ (2018) Metamorphic testing: a review of challenges and opportunities. ACM Computing Surveys (CSUR) 51(1):1\u201327","journal-title":"ACM Computing Surveys (CSUR)"},{"key":"10385_CR5","unstructured":"Chen X, Liu C, Song D (2018b) Tree-to-tree neural networks for program translation. arXiv:1802.03691"},{"key":"10385_CR6","unstructured":"Guizzo G, Sarro F, Krinke J, Vergilio SR (2020) Sentinel: a hyper-heuristic for the generation of mutant reduction strategies. IEEE Transactions on Software Engineering"},{"key":"10385_CR7","doi-asserted-by":"crossref","unstructured":"Hort M, Zhang JM, Sarro F, Harman M (2021) Fairea: a model behaviour mutation approach to benchmarking bias mitigation methods. In: Proceedings of the 29th ACM joint meeting on European software engineering conference and symposium on the foundations of software engineering, pp 994\u20131006","DOI":"10.1145\/3468264.3468565"},{"key":"10385_CR8","doi-asserted-by":"crossref","unstructured":"Hu X, Li G, Xia X, Lo D, Jin Z (2018) Deep code comment generation. In: 2018 IEEE\/ACM 26th international conference on program comprehension (ICPC), IEEE, pp 200\u201320010","DOI":"10.1145\/3196321.3196334"},{"issue":"5","key":"10385_CR9","doi-asserted-by":"publisher","first-page":"649","DOI":"10.1109\/tse.2010.62","volume":"37","author":"Y Jia","year":"2011","unstructured":"Jia Y, Harman M (2011) An analysis and survey of the development of mutation testing. IEEE Trans Softw Eng 37(5):649\u2013678. https:\/\/doi.org\/10.1109\/tse.2010.62","journal-title":"IEEE Trans Softw Eng"},{"key":"10385_CR10","doi-asserted-by":"publisher","unstructured":"Karaivanov S, Raychev V, Vechev M (2014) Phrase-based statistical translation of programming languages. In: Proceedings of the 2014 ACM international symposium on new ideas, new paradigms, and reflections on programming & software, association for computing machinery, New York, NY, USA, Onward! 2014, pp 173\u2013184. https:\/\/doi.org\/10.1145\/2661136.2661148","DOI":"10.1145\/2661136.2661148"},{"issue":"2","key":"10385_CR11","doi-asserted-by":"publisher","first-page":"97","DOI":"10.1002\/stvr.308","volume":"15","author":"YS Ma","year":"2005","unstructured":"Ma YS, Offutt J, Kwon YR (2005) Mujava: an automated class mutation system: Research articles. Softw Test Verif Reliab 15(2):97\u2013133","journal-title":"Softw Test Verif Reliab"},{"key":"10385_CR12","unstructured":"Melhase T (2022) java2python. https:\/\/github.com\/natural\/java2python. Accessed 13 Nov 2023"},{"key":"10385_CR13","doi-asserted-by":"publisher","unstructured":"Nguyen AT, Nguyen TT, Nguyen TN (2013) Lexical statistical machine translation for language migration. In: Proceedings of the 2013 9th joint meeting on foundations of software engineering, association for computing machinery, New York, NY, USA, ESEC\/FSE 2013, pp 651\u2013654. https:\/\/doi.org\/10.1145\/2491411.2494584","DOI":"10.1145\/2491411.2494584"},{"key":"10385_CR14","doi-asserted-by":"crossref","unstructured":"Offutt AJ, Untch RH (2001) Mutation testing for the new century, Springer, chap Mutation 2000: Uniting the Orthogonal, pp 34\u201344","DOI":"10.1007\/978-1-4757-5939-6_7"},{"key":"10385_CR15","unstructured":"Offutt J (2014) Mujava home page. https:\/\/cs.gmu.edu\/~offutt\/mujava\/, Accessed 05 June 2023"},{"key":"10385_CR16","doi-asserted-by":"publisher","unstructured":"Papadakis M, Shin D, Yoo S, Bae DH (2018) Are mutation scores correlated with real fault detection? A large scale empirical study on the relationship between mutants and real faults. In: Proceedings of the 40th international conference on software engineering, association for computing machinery, New York, NY, USA, ICSE \u201918, pp 537\u2013548. https:\/\/doi.org\/10.1145\/3180155.3180183","DOI":"10.1145\/3180155.3180183"},{"key":"10385_CR17","doi-asserted-by":"publisher","unstructured":"Papadakis M, Kintis M, Zhang J, Jia Y, Traon YL, Harman M (2019) Chapter six - mutation testing advances: an analysis and survey. In: Memon AM (ed) Advances in computers, vol 112, Elsevier, pp 275\u2013378. https:\/\/doi.org\/10.1016\/bs.adcom.2018.03.015","DOI":"10.1016\/bs.adcom.2018.03.015"},{"key":"10385_CR18","doi-asserted-by":"crossref","unstructured":"Papineni K, Roukos S, Ward T, Zhu WJ (2002) Bleu: a method for automatic evaluation of machine translation. In: Proceedings of the 40th annual meeting of the association for computational linguistics, pp 311\u2013318","DOI":"10.3115\/1073083.1073135"},{"key":"10385_CR19","unstructured":"Roziere B, Lachaux MA, Chanussot L, Lample G (2020) Unsupervised translation of programming languages. Adv Neural Info Process Syst 33"},{"key":"10385_CR20","unstructured":"Rozi\u00e8re B, Lachaux M, Szafraniec M, Lample G (2021a) DOBF: a deobfuscation pre-training objective for programming languages. CoRR arXiv:2102.07492, https:\/\/arxiv.org\/abs\/2102.07492"},{"key":"10385_CR21","unstructured":"Rozi\u00e8re B, Zhang JM, Charton F, Harman M, Synnaeve G, Lample G (2021b) Leveraging automated unit tests for unsupervised code translation. CoRR arXiv:2110.06773, https:\/\/arxiv.org\/abs\/2110.06773"},{"issue":"1","key":"10385_CR22","doi-asserted-by":"publisher","first-page":"72","DOI":"10.2307\/1422689","volume":"15","author":"C Spearman","year":"1904","unstructured":"Spearman C (1904) The proof and measurement of association between two things. Am J Psychol 15(1):72\u2013101. https:\/\/doi.org\/10.2307\/1422689","journal-title":"Am J Psychol"},{"key":"10385_CR23","doi-asserted-by":"crossref","unstructured":"Sun Z, Zhang JM, Harman M, Papadakis M, Zhang L (2020a) Automatic testing and improvement of machine translation. In: Proceedings of the ACM\/IEEE 42nd international conference on software engineering, pp 974\u2013985","DOI":"10.1145\/3377811.3380420"},{"key":"10385_CR24","first-page":"8984","volume":"34","author":"Z Sun","year":"2020","unstructured":"Sun Z, Zhu Q, Xiong Y, Sun Y, Mou L, Zhang L (2020) Treegen: a tree-based transformer architecture for code generation. Proc AAAI Conf Artif Intel 34:8984\u20138991","journal-title":"Proc AAAI Conf Artif Intel"},{"key":"10385_CR25","doi-asserted-by":"crossref","unstructured":"Sun Z, Zhang JM, Xiong Y, Harman M, Papadakis M, Zhang L (2022) Improving machine translation systems via isotopic replacement. In: Proceedings of the 44th international conference on software engineering, pp 1181\u20131192","DOI":"10.1145\/3510003.3510206"},{"issue":"6","key":"10385_CR26","doi-asserted-by":"publisher","first-page":"111","DOI":"10.1109\/52.895180","volume":"17","author":"AA Terekhov","year":"2000","unstructured":"Terekhov AA, Verhoef C (2000) The realities of language conversions. IEEE Softw 17(6):111\u2013124","journal-title":"IEEE Softw"},{"issue":"8","key":"10385_CR27","doi-asserted-by":"publisher","first-page":"1207","DOI":"10.1109\/32.7629","volume":"14","author":"RC Waters","year":"1988","unstructured":"Waters RC (1988) Program translation via abstraction and reimplementation. IEEE Trans Softw Eng 14(8):1207\u20131228. https:\/\/doi.org\/10.1109\/32.7629","journal-title":"IEEE Trans Softw Eng"},{"key":"10385_CR28","doi-asserted-by":"publisher","unstructured":"Yasumatsu K, Doi N (1995) Spice: a system for translating smalltalk programs into a c environment. IEEE Trans Softw Eng 21(11):902-912. https:\/\/doi.org\/10.1109\/32.473219","DOI":"10.1109\/32.473219"},{"key":"10385_CR29","doi-asserted-by":"crossref","unstructured":"Zhang J, Chen J, Hao D, Xiong Y, Xie B, Zhang L, Mei H (2014a) Search-based inference of polynomial metamorphic relations. In: Proceedings of the 29th ACM\/IEEE international conference on automated software engineering, pp 701\u2013712","DOI":"10.1145\/2642937.2642994"},{"key":"10385_CR30","doi-asserted-by":"crossref","unstructured":"Zhang J, Zhu M, Hao D, Zhang L (2014b) An empirical study on the scalability of selective mutation testing. In: 2014 IEEE 25th international symposium on software reliability engineering, IEEE, pp 277\u2013287","DOI":"10.1109\/ISSRE.2014.27"},{"key":"10385_CR31","doi-asserted-by":"crossref","unstructured":"Zhang J, Lou Y, Zhang L, Hao D, Zhang L, Mei H (2016a) Isomorphic regression testing: executing uncovered branches without test augmentation. In: Proceedings of the 2016 24th ACM SIGSOFT international symposium on foundations of software engineering, pp 883\u2013894","DOI":"10.1145\/2950290.2950313"},{"key":"10385_CR32","doi-asserted-by":"crossref","unstructured":"Zhang J, Wang Z, Zhang L, Hao D, Zang L, Cheng S, Zhang L (2016b) Predictive mutation testing. In: Proceedings of the 25th international symposium on software testing and analysis, pp 342\u2013353","DOI":"10.1145\/2931037.2931038"},{"key":"10385_CR33","doi-asserted-by":"publisher","unstructured":"Zhang J, Hao D, Zhang L, Zhang L (2018) To detect abnormal program behaviours via mutation deduction. In: 2018 IEEE international conference on software testing, verification and validation workshops (ICSTW), pp 11\u201317. https:\/\/doi.org\/10.1109\/ICSTW.2018.00022","DOI":"10.1109\/ICSTW.2018.00022"},{"key":"10385_CR34","unstructured":"Zhang JM, Harman M, Ma L, Liu Y (2020) Machine learning testing: survey, landscapes and horizons. IEEE Trans Softw Eng"},{"key":"10385_CR35","doi-asserted-by":"crossref","unstructured":"Zhang JM, Harman M, Guedj B, Barr ET, Shawe-Taylor J (2023) Model validation using mutated training labels: an exploratory study. Neurocomputing 539:126116","DOI":"10.1016\/j.neucom.2023.02.042"}],"container-title":["Empirical Software Engineering"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s10664-023-10385-w.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s10664-023-10385-w\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s10664-023-10385-w.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,3,27]],"date-time":"2024-03-27T13:29:44Z","timestamp":1711546184000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s10664-023-10385-w"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,12,6]]},"references-count":35,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2024,1]]}},"alternative-id":["10385"],"URL":"https:\/\/doi.org\/10.1007\/s10664-023-10385-w","relation":{},"ISSN":["1382-3256","1573-7616"],"issn-type":[{"value":"1382-3256","type":"print"},{"value":"1573-7616","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,12,6]]},"assertion":[{"value":"14 August 2023","order":1,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"6 December 2023","order":2,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"The authors declared that they have no conflict of interest.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Conflicts of interest"}}],"article-number":"19"}}