{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,18]],"date-time":"2026-05-18T16:06:10Z","timestamp":1779120370055,"version":"3.51.4"},"reference-count":39,"publisher":"Oxford University Press (OUP)","issue":"5","license":[{"start":{"date-parts":[[2026,1,21]],"date-time":"2026-01-21T00:00:00Z","timestamp":1768953600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/academic.oup.com\/pages\/standard-publication-reuse-rights"}],"funder":[{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["62202146"],"award-info":[{"award-number":["62202146"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]},{"name":"Young and Middle-aged Scientific and Technological Innovation Team Plan in Higher Education Institutions of Hubei Province, China","award":["T2023007"],"award-info":[{"award-number":["T2023007"]}]},{"name":"College Students' Innovative Entrepreneurial Training Plan Program","award":["202410500008"],"award-info":[{"award-number":["202410500008"]}]},{"name":"China NSF","award":["62202146"],"award-info":[{"award-number":["62202146"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2026,5,16]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:p>Code clones are similar code fragments at the syntactic or semantic level, commonly seen in software development. Excessive cloning harms maintainability and may introduce persistent bugs. We study cross-language code clone detection at the semantic level. Most existing clone detection approaches target single-language environments and focus mainly on syntactic similarity. However, complex software systems are often developed using multiple programming languages, resulting in semantically similar cross-language code clones. 
These clones pose challenges beyond the capabilities of current detection tools. In this paper, we propose a novel flow-enhanced graph attention network approach, called FEGAT, to effectively detect cross-language code clones at the semantic level. First, we design a flow-enhanced code graph by augmenting the abstract syntax tree with control-flow and data-flow edges. Second, we input this code graph into the pre-trained model CodeBERT to learn the initial flow-enhanced node representation with semantic information. Third, we design FEGAT to learn flow-enhanced graph representation of cross-language codes from their semantic information and detect clones by computing the similarity score. Finally, we conduct experiments on the AtCoder and CodeChef datasets to evaluate the performance of FEGAT in terms of precision, recall, and F1-score. The experimental results demonstrate that FEGAT outperforms existing cross-language code clone detection tools.<\/jats:p>","DOI":"10.1093\/comjnl\/bxaf146","type":"journal-article","created":{"date-parts":[[2025,12,23]],"date-time":"2025-12-23T12:44:16Z","timestamp":1766493856000},"page":"800-815","source":"Crossref","is-referenced-by-count":0,"title":["Cross-language code clone detection via flow-enhanced graph attention network"],"prefix":"10.1093","volume":"69","author":[{"given":"Mengyao","family":"Hu","sequence":"first","affiliation":[{"name":"School of Computer Science and Artificial Intelligence, Hubei University of Technology, No. 28 Nanli Road, Hongshan District, Wuhan 430068","place":["China"]}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Jia","family":"Yang","sequence":"additional","affiliation":[{"name":"School of Computer Science and Artificial Intelligence, Hubei University of Technology, No. 28 Nanli Road, Hongshan District, Wuhan 430068","place":["China"]},{"name":"Hubei Provincial Key Laboratory of Green Intelligent Computing Power Network, Hubei University of Technology, No. 
28 Nanli Road, Hongshan District, Wuhan 430068","place":["China"]}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Weiqi","family":"Zhou","sequence":"additional","affiliation":[{"name":"School of Computer Science and Artificial Intelligence, Hubei University of Technology, No. 28 Nanli Road, Hongshan District, Wuhan 430068","place":["China"]}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"286","published-online":{"date-parts":[[2026,1,21]]},"reference":[{"key":"2026051811065358200_ref1","first-page":"64","article-title":"A survey on software clone detection research","volume":"541","author":"Roy","year":"2007","journal-title":"Queen\u2019s Sch Comput TR"},{"key":"2026051811065358200_ref2","first-page":"109","article-title":"A language independent approach for detecting duplicated code","volume-title":"Proceedings of the IEEE International Conference on Software Maintenance (ICSM\u201999), Oxford, UK, 30 August-3 September","author":"Ducasse","year":"1999"},{"key":"2026051811065358200_ref3","doi-asserted-by":"crossref","first-page":"18","DOI":"10.1109\/CSMR-WCRE.2014.6747168","article-title":"The vision of software clone management: Past, present, and future (keynote paper)","volume-title":"2014 Software Evolution Week - IEEE Conference on Software Maintenance, Reengineering, and Reverse Engineering (CSMR-WCRE), Antwerp, Belgium, 3-6 February","author":"Roy","year":"2014"},{"key":"2026051811065358200_ref4","doi-asserted-by":"publisher","first-page":"654","DOI":"10.1109\/TSE.2002.1019480","article-title":"CCFinder: a multilinguistic token-based code clone detection system for large scale source code","volume":"28","author":"Kamiya","year":"2002","journal-title":"IEEE Trans Softw Eng"},{"key":"2026051811065358200_ref5","first-page":"96","article-title":"DECKARD: scalable and accurate tree-based detection of code clones","volume-title":"29th International Conference on Software Engineering (ICSE\u201907), Minneapolis, MN, 20-26 
May","author":"Jiang","year":"2007"},{"key":"2026051811065358200_ref6","first-page":"70","article-title":"Neural detection of semantic code clones via tree-based convolution","volume-title":"2019 IEEE\/ACM 27th International Conference on Program Comprehension (ICPC), Montreal, QC, Canada, 25-26 May","author":"Yu","year":"2019"},{"key":"2026051811065358200_ref7","first-page":"261","article-title":"Detecting code clones with graph neural network and flow-augmented abstract syntax tree","volume-title":"2020 IEEE 27th International Conference on Software Analysis, Evolution and Reengineering (SANER), London, ON, Canada, 18-21 February","author":"Wang","year":"2020"},{"key":"2026051811065358200_ref8","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1186\/s40411-017-0035-z","article-title":"On multi-language software development, cross-language links and accompanying tools: A survey of professional software developers","volume":"5","author":"Mayer","year":"2017","journal-title":"J Softw Eng Res Dev"},{"key":"2026051811065358200_ref9","first-page":"494","article-title":"Semantic based cross-language clone related bug detection","volume-title":"2021 2nd International Seminar on Artificial Intelligence, Networking and Information Technology (AINIT), Shanghai, China, 15-17 October","author":"Chen","year":"2021"},{"key":"2026051811065358200_ref10","first-page":"1026","article-title":"CLCDSA: cross language code clone detection using syntactical features and API documentation","volume-title":"2019 34th IEEE\/ACM International Conference on Automated Software Engineering (ASE), San Diego, CA, USA, 11-15 November","author":"Nafi","year":"2019"},{"key":"2026051811065358200_ref11","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1109\/ICCA62237.2024.10927826","article-title":"Advanced cross-language clone detection using modified AST and graph neural network","volume-title":"2024 International Conference on Computer and Applications (ICCA), Cairo, Egypt, 17-19 
December","author":"Swilam","year":"2024"},{"key":"2026051811065358200_ref12","doi-asserted-by":"publisher","first-page":"4846","DOI":"10.1109\/TSE.2023.3311796","article-title":"Improving cross-language code clone detection via code representation learning and graph neural networks","volume":"49","author":"Mehrotra","year":"2023","journal-title":"IEEE Trans. Softw. Eng."},{"key":"2026051811065358200_ref13","first-page":"476","article-title":"Towards a big data curated benchmark of inter-project code clones","volume-title":"2014 IEEE International Conference on Software Maintenance and Evolution, Victoria, BC, Canada, 29 September-3 October","author":"Svajlenko","year":"2014"},{"key":"2026051811065358200_ref14","doi-asserted-by":"publisher","first-page":"86121","DOI":"10.1109\/ACCESS.2019.2918202","article-title":"A systematic review on code clone detection","volume":"7","author":"Ain","year":"2019","journal-title":"IEEE Access"},{"key":"2026051811065358200_ref15","first-page":"1","article-title":"Using compilation\/decompilation to enhance clone detection","volume-title":"2017 IEEE 11th International Workshop on Software Clones (IWSC), Klagenfurt, Austria, 21 February","author":"Ragkhitwetsagul","year":"2017"},{"key":"2026051811065358200_ref16","doi-asserted-by":"crossref","DOI":"10.1109\/COMPSAC.2017.104","article-title":"Detecting java code clones with multi-granularities based on bytecode","volume-title":"2017 IEEE 41st Annual Computer Software and Applications Conference (COMPSAC), Turin, Italy, 4-8 July","author":"Yu","year":"2017"},{"key":"2026051811065358200_ref17","first-page":"1157","article-title":"SourcererCC: scaling code clone detection to big-code","volume-title":"2016 IEEE\/ACM 38th International Conference on Software Engineering (ICSE), Austin, TX, USA, 14-22 May","author":"Sajnani","year":"2016"},{"key":"2026051811065358200_ref18","first-page":"286","article-title":"Structural function based code clone detection using a new hybrid 
technique","volume-title":"2018 IEEE 42nd Annual Computer Software and Applications Conference (COMPSAC), Tokyo, Japan, 23-27 July","author":"Yang","year":"2018"},{"key":"2026051811065358200_ref19","first-page":"27","article-title":"Fast and flexible large-scale clone detection with CloneWorks","volume-title":"2017 IEEE\/ACM 39th International Conference on Software Engineering Companion (ICSE-C), Buenos Aires, Argentina, 20-28 May","author":"Svajlenko","year":"2017"},{"key":"2026051811065358200_ref20","first-page":"59","article-title":"Code clone detection based on order and content of control statements","volume-title":"2016 2nd International Conference on Contemporary Computing and Informatics (IC3I), Greater Noida, India, 14-17 December","author":"Sudhamani","year":"2016"},{"key":"2026051811065358200_ref21","first-page":"1","article-title":"Rearranging the order of program statements for code clone detection","volume-title":"2017 IEEE 11th International Workshop on Software Clones (IWSC), Klagenfurt, Austria, 21 February","author":"Sabi","year":"2017"},{"key":"2026051811065358200_ref22","first-page":"321","article-title":"Scalable detection of semantic clones","volume-title":"2008 ACM\/IEEE 30th International Conference on Software Engineering, Leipzig, Germany, 10-18 May","author":"Gabel","year":"2008"},{"key":"2026051811065358200_ref23","first-page":"249","article-title":"CCLearner: A deep learning-based clone detection approach","volume-title":"2017 IEEE International Conference on Software Maintenance and Evolution (ICSME), Shanghai, China, 17-22 September","author":"Li","year":"2017"},{"key":"2026051811065358200_ref24","first-page":"783","article-title":"A novel neural source code representation based on abstract syntax tree","volume-title":"2019 IEEE\/ACM 41st International Conference on Software Engineering (ICSE), Montreal, QC, Canada, 25-31 
May","author":"Zhang","year":"2019"},{"key":"2026051811065358200_ref25","first-page":"512","article-title":"LICCA: a tool for cross-language clone detection","volume-title":"2018 IEEE 25th International Conference on Software Analysis, Evolution and Reengineering (SANER), Campobasso, Italy, 20-23 March","author":"Vislavski","year":"2018"},{"key":"2026051811065358200_ref26","first-page":"696","article-title":"Mining revision histories to detect cross-language clones without intermediates","volume-title":"2016 31st IEEE\/ACM International Conference on Automated Software Engineering (ASE), Singapore, 3-7 September","author":"Cheng","year":"2016"},{"key":"2026051811065358200_ref27","first-page":"413","article-title":"C4: contrastive cross-language code clone detection","volume-title":"2022 IEEE\/ACM 30th International Conference on Program Comprehension (ICPC), Pittsburgh, PA, USA, 16-17 May","author":"Tao","year":"2022"},{"key":"2026051811065358200_ref28","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1109\/ICCA59364.2023.10401783","article-title":"Cross-language code clone detection using abstract syntax tree and graph neural network","volume-title":"2023 International Conference on Computer and Applications (ICCA), Cairo, Egypt, 28-30 November","author":"Swilam","year":"2023"},{"key":"2026051811065358200_ref29","first-page":"277","article-title":"Survey of pre-trained models for natural language processing","volume-title":"2021 International Conference on Electronic Communications, Internet of Things and Big Data (ICEIB), Yilan County, Taiwan, 10-12 December","author":"Peng","year":"2021"},{"key":"2026051811065358200_ref30","first-page":"505","article-title":"Applying CodeBERT for automated program repair of Java simple bugs","volume-title":"2021 IEEE\/ACM 18th International Conference on Mining Software Repositories (MSR). 
Madrid, Spain, 17-19 May","author":"Mashhadi","year":"2021"},{"key":"2026051811065358200_ref31","first-page":"935","article-title":"Assemble foundation models for automatic code summarization","volume-title":"2022 IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER), Honolulu, HI, USA, 15-18 March","author":"Gu","year":"2022"},{"key":"2026051811065358200_ref32","first-page":"39","article-title":"CodeBERT for code clone detection: a replication study","volume-title":"2022 IEEE 16th International Workshop on Software Clones (IWSC), Limassol, Cyprus, 2 October","author":"Arshad","year":"2022"},{"key":"2026051811065358200_ref33","first-page":"4171","article-title":"BERT: pre-training of deep bidirectional transformers for language understanding","volume-title":"2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT), Minneapolis, Minnesota, June","author":"Devlin","year":"2019"},{"key":"2026051811065358200_ref34","doi-asserted-by":"crossref","DOI":"10.18653\/v1\/2020.findings-emnlp.139","article-title":"CodeBERT: a pre-trained model for programming and natural languages","author":"Feng","year":"2020"},{"key":"2026051811065358200_ref35","article-title":"Graph attention networks","author":"Velickovic","year":"2017"},{"key":"2026051811065358200_ref36","first-page":"3844","article-title":"Convolutional neural networks on graphs with fast localized spectral filtering","volume-title":"30th International Conference on Neural Information Processing Systems (NIPS\u201916), Barcelona, Spain, 5-10 December","author":"Defferrard","year":"2016"},{"key":"2026051811065358200_ref37","article-title":"Efficient estimation of word representations in vector space","author":"Mikolov","year":"2013"},{"key":"2026051811065358200_ref38","first-page":"518","article-title":"Cross-language clone detection by learning over abstract syntax trees","volume-title":"2019 IEEE\/ACM 16th 
International Conference on Mining Software Repositories (MSR), Montreal, QC, Canada, 25-31 May","author":"Perez","year":"2019"},{"key":"2026051811065358200_ref39","first-page":"87","article-title":"Deep learning code fragments for code clone detection","volume-title":"2016 31st IEEE\/ACM International Conference on Automated Software Engineering (ASE), Singapore, 3-7 September","author":"White","year":"2016"}],"container-title":["The Computer Journal"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/comjnl\/article-pdf\/69\/5\/800\/66513980\/bxaf146.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/comjnl\/article-pdf\/69\/5\/800\/66513980\/bxaf146.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,5,18]],"date-time":"2026-05-18T15:07:06Z","timestamp":1779116826000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/comjnl\/article\/69\/5\/800\/8435492"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2026,1,21]]},"references-count":39,"journal-issue":{"issue":"5","published-online":{"date-parts":[[2026,1,21]]},"published-print":{"date-parts":[[2026,5,16]]}},"URL":"https:\/\/doi.org\/10.1093\/comjnl\/bxaf146","relation":{},"ISSN":["0010-4620","1460-2067"],"issn-type":[{"value":"0010-4620","type":"print"},{"value":"1460-2067","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2026,5]]},"published":{"date-parts":[[2026,1,21]]}}}