{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,30]],"date-time":"2026-01-30T04:58:41Z","timestamp":1769749121706,"version":"3.49.0"},"reference-count":39,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2021,3,5]],"date-time":"2021-03-05T00:00:00Z","timestamp":1614902400000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2021,3,5]],"date-time":"2021-03-05T00:00:00Z","timestamp":1614902400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["U1836211"],"award-info":[{"award-number":["U1836211"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Cybersecur"],"published-print":{"date-parts":[[2021,12]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Decompilation aims to analyze and transform low-level program language (PL) codes such as binary code or assembly code to obtain an equivalent high-level PL. Decompilation plays a vital role in the cyberspace security fields such as software vulnerability discovery and analysis, malicious code detection and analysis, and software engineering fields such as source code analysis, optimization, and cross-language cross-operating system migration. Unfortunately, the existing decompilers mainly rely on experts to write rules, which leads to bottlenecks such as low scalability, development difficulties, and long cycles. The generated high-level PL codes often violate the code writing specifications. Further, their readability is still relatively low. 
The problems mentioned above hinder the efficiency of advanced applications (e.g., vulnerability discovery) based on decompiled high-level PL codes. In this paper, we propose a decompilation approach based on the attention-based neural machine translation (NMT) mechanism, which converts low-level PL into high-level PL while acquiring legibility and keeping functionally similar. To compensate for the information asymmetry between the low-level and high-level PL, a translation method based on basic operations of low-level PL is designed. This method improves the generalization of the NMT model and captures the translation rules between PLs more accurately and efficiently. Besides, we implement a neural decompilation framework called Neutron. The evaluation of two practical applications shows that Neutron\u2019s average program accuracy is 96.96%, which is better than the traditional NMT model.<\/jats:p>","DOI":"10.1186\/s42400-021-00070-0","type":"journal-article","created":{"date-parts":[[2021,3,5]],"date-time":"2021-03-05T00:06:12Z","timestamp":1614902772000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":20,"title":["Neutron: an attention-based neural decompiler"],"prefix":"10.1186","volume":"4","author":[{"given":"Ruigang","family":"Liang","sequence":"first","affiliation":[]},{"given":"Ying","family":"Cao","sequence":"additional","affiliation":[]},{"given":"Peiwei","family":"Hu","sequence":"additional","affiliation":[]},{"given":"Kai","family":"Chen","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2021,3,5]]},"reference":[{"key":"70_CR1","unstructured":"Allamanis, M, Tarlow D, Gordon A, Wei Y (2015) Bimodal modelling of source code and natural language In: International Conference on Machine Learning, 2123\u20132132."},{"key":"70_CR2","unstructured":"Avast Retargetable Decompiler IDA Plugin (2020). 
https:\/\/doi.org\/blog.fpmurphy.com\/2017\/12\/avast-retargetable-decompiler-ida-plugin.html."},{"key":"70_CR3","unstructured":"Brumley, D, Lee J, Schwartz E, Woo M (2013) Native x86 decompilation using semantics-preserving structural analysis and iterative control-flow structuring In: Presented as Part of the 22nd {USENIX} Security Symposium ({USENIX} Security 13), 353\u2013368."},{"key":"70_CR4","unstructured":"cfile (2020). https:\/\/doi.org\/github.com\/cogu\/cfile."},{"key":"70_CR5","unstructured":"\u010eurfina, L, K\u0159oustek J, Zemek P, Kol\u00e1\u0159 D, Hru\u0161ka T, Masa\u0159\u00edk K, Meduna A (2011) Design of an automatically generated retargetable decompiler In: Proceedings of the 2nd International Conference on Circuits, Systems, Communications & Computers, 199\u2013204."},{"key":"70_CR6","doi-asserted-by":"crossref","unstructured":"\u010eurfina, L, K\u0159oustek J, Zemek P (2013) Psybot malware: A step-by-step decompilation case study In: 2013 20th Working Conference on Reverse Engineering (WCRE), 449\u2013456. IEEE.","DOI":"10.1109\/WCRE.2013.6671321"},{"key":"70_CR7","unstructured":"Fu, C, Chen H, Liu H, Chen X, Tian Y, Koushanfar F, Zhao J (2019) Coda: An end-to-end neural program decompiler In: Advances in Neural Information Processing Systems, 3703\u20133714."},{"key":"70_CR8","unstructured":"Ghidra (2020). https:\/\/doi.org\/ghidra-sre.org."},{"key":"70_CR9","doi-asserted-by":"crossref","unstructured":"Heo, K, Oh H, Yi K (2017) Machine-learning-guided selectively unsound static analysis In: 2017 IEEE\/ACM 39th International Conference on Software Engineering (ICSE), 519\u2013529. IEEE.","DOI":"10.1109\/ICSE.2017.54"},{"key":"70_CR10","unstructured":"Hex-Rays (2020). 
https:\/\/doi.org\/.hex-rays.com\/products\/decompiler\/."},{"issue":"8","key":"70_CR11","doi-asserted-by":"publisher","first-page":"1735","DOI":"10.1162\/neco.1997.9.8.1735","volume":"9","author":"S Hochreiter","year":"1997","unstructured":"Hochreiter, S, Schmidhuber J (1997) Long short-term memory. Neural computation 9(8):1735\u20131780.","journal-title":"Neural computation"},{"key":"70_CR12","unstructured":"Karel the Robot (1995). https:\/\/doi.org\/.cs.mtsu.edu\/~untch\/karel\/."},{"key":"70_CR13","doi-asserted-by":"crossref","unstructured":"Katz, DS, Ruchti J, Schulte E (2018) Using recurrent neural networks for decompilation In: 2018 IEEE 25th International Conference on Software Analysis, Evolution and Reengineering (SANER), 346\u2013356. IEEE.","DOI":"10.1109\/SANER.2018.8330222"},{"key":"70_CR14","unstructured":"Katz, O, Olshaker Y, Goldberg Y, Yahav E (2019) Towards neural decompilation. CoRR abs\/1905.08325. https:\/\/doi.org\/arxiv.org\/abs\/1905.08325."},{"key":"70_CR15","unstructured":"K\u0159oustek, J, Matula P, Zemek P (2017) RetDec: An Open-Source Machine-Code Decompiler. December 2017, technick\u00e1 spr\u00e1va, prezentovan\u00e9 na konferenc\u00ed Botconf."},{"key":"70_CR16","unstructured":"Levy, D, Wolf L (2017) Learning to align the source code to the compiled object code In: International Conference on Machine Learning, 2043\u20132051."},{"key":"70_CR17","doi-asserted-by":"crossref","unstructured":"Li, Z, Zou D, Xu S, Ou X, Jin H, Wang S, Deng Z, Zhong Y (2018) Vuldeepecker: A deep learning-based system for vulnerability detection. 
arXiv preprint arXiv:1801.01681 abs\/1801.01681.","DOI":"10.14722\/ndss.2018.23158"},{"key":"70_CR18","doi-asserted-by":"crossref","unstructured":"Liu, Z, Wang S (2020) How far we have come: testing decompilation correctness of c decompilers In: Proceedings of the 29th ACM SIGSOFT International Symposium on Software Testing and Analysis, 475\u2013487.","DOI":"10.1145\/3395363.3397370"},{"key":"70_CR19","doi-asserted-by":"publisher","first-page":"287","DOI":"10.18653\/v1\/P17-2045","volume-title":"Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)","author":"P Loyola","year":"2017","unstructured":"Loyola, P, Marrese-Taylor E, Matsuo Y (2017) A Neural Architecture for Generating Natural Language Descriptions from Source Code Changes In: Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), 287\u2013292.. Association for Computational Linguistics, Vancouver. https:\/\/doi.org\/10.18653\/v1\/P17-2045. https:\/\/www.aclweb.org\/anthology\/P17-2045."},{"key":"70_CR20","doi-asserted-by":"publisher","first-page":"1412","DOI":"10.18653\/v1\/D15-1166","volume-title":"Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing","author":"M Luong","year":"2015","unstructured":"Luong, M, Pham H, Manning CD (2015) Effective Approaches to Attention-based Neural Machine Translation In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, 1412\u20131421.. Association for Computational Linguistics, Lisbon. https:\/\/doi.org\/10.18653\/v1\/D15-1166. https:\/\/www.aclweb.org\/anthology\/D15-1166."},{"key":"70_CR21","unstructured":"Math c++ library (2020). 
https:\/\/doi.org\/.cplusplus.com\/reference\/cmath\/."},{"issue":"5","key":"70_CR22","doi-asserted-by":"publisher","first-page":"1212","DOI":"10.1109\/72.410363","volume":"6","author":"B Pearlmutter","year":"1995","unstructured":"Pearlmutter, B (1995) Gradient calculations for dynamic recurrent neural networks: A survey. IEEE Transactions on Neural networks 6(5):1212\u20131228.","journal-title":"IEEE Transactions on Neural networks"},{"key":"70_CR23","unstructured":"Peng, F, Deng Z, Zhang X, Xu D, Lin Z, Su Z (2014) X-force: Force-executing binary programs for security applications In: 23rd {USENIX} Security Symposium ({USENIX} Security 14), 829\u2013844."},{"key":"70_CR24","unstructured":"Reddy, D (1977) Speech understanding systems: report of a steering committee. Artificial Intelligence 9(3):307\u2013316. https:\/\/doi.org\/10.1016\/0004-3702(77)90026-1, http:\/\/www.sciencedirect.com\/science\/article\/pii\/0004370277900261."},{"issue":"1","key":"70_CR25","doi-asserted-by":"publisher","first-page":"305","DOI":"10.1145\/2490301.2451150","volume":"41","author":"E Schkufza","year":"2013","unstructured":"Schkufza, E, Sharma R, Aiken A (2013) Stochastic superoptimization. ACM SIGARCH Computer Architecture News 41(1):305\u2013316.","journal-title":"ACM SIGARCH Computer Architecture News"},{"issue":"1","key":"70_CR26","first-page":"p1","volume":"2","author":"T Shi","year":"2018","unstructured":"Shi, T, Keneshloo Y, Ramakrishnan N, Reddy C (2018) Neural abstractive text summarization with sequence-to-sequence models. 
arXiv preprint arXiv:1812.02303 2(1):p1\u2013p37.","journal-title":"arXiv preprint arXiv:1812.02303"},{"key":"70_CR27","doi-asserted-by":"crossref","unstructured":"Shoshitaishvili, Y, Wang R, Salls C, Stephens N, Polino M, Dutcher A, Grosen J, Feng S, Hauser C, Kruegel C, Vigna G (2016) SoK: (State of) The Art of War: Offensive Techniques in Binary Analysis In: IEEE Symposium on Security and Privacy.","DOI":"10.1109\/SP.2016.17"},{"key":"70_CR28","unstructured":"Sutskever, I, Vinyals O, Le Q (2014) Sequence to sequence learning with neural networks In: Advances in Neural Information Processing Systems, 3104\u20133112."},{"key":"70_CR29","unstructured":"tensor, 2tensor (2020). https:\/\/doi.org\/github.com\/tensorflow\/tensor2tensor."},{"key":"70_CR30","doi-asserted-by":"crossref","unstructured":"Wang, Y, Zhang C, Xiang X, Zhao Z, Li W, Gong X, Liu B, Chen K, Zou W (2018) Revery: From proof-of-concept to exploitable In: Proceedings of the 2018 ACM SIGSAC Conference on Computer and Communications Security, 1914\u20131927.","DOI":"10.1145\/3243734.3243847"},{"key":"70_CR31","unstructured":"Warren, HS (2012) Hacker\u2019s Delight, 2nd ed. Addison-Wesley Professional."},{"key":"70_CR32","doi-asserted-by":"publisher","unstructured":"Wu, Y, Schuster M, Chen Z, Le Q, Norouzi M, Macherey W, Krikun M, Cao Y, Gao Q, Macherey K, et al. (2016) Google\u2019s neural machine translation system: Bridging the gap between human and machine translation. arXiv preprint arXiv:1609.08144:11\u201320. https:\/\/doi.org\/10.1145\/3234150.","DOI":"10.1145\/3234150"},{"key":"70_CR33","doi-asserted-by":"crossref","unstructured":"Yadegari, B, Johannesmeyer B, Whitely B, Debray S (2015) A generic approach to automatic deobfuscation of executable code In: 2015 IEEE Symposium on Security and Privacy, 674\u2013691. 
IEEE.","DOI":"10.1109\/SP.2015.47"},{"key":"70_CR34","doi-asserted-by":"crossref","unstructured":"Yakdan, K, Dechand S, Gerhards-Padilla E, Smith M (2016) Helping johnny to analyze malware: A usability-optimized decompiler and malware analysis user study In: 2016 IEEE Symposium on Security and Privacy (SP), 158\u2013177. IEEE.","DOI":"10.1109\/SP.2016.18"},{"key":"70_CR35","doi-asserted-by":"crossref","unstructured":"Yakdan, K, Eschweiler S, Gerhards-Padilla E, Smith M (2015) No more gotos: Decompilation using pattern-independent control-flow structuring and semantic-preserving transformations In: NDSS.","DOI":"10.14722\/ndss.2015.23185"},{"key":"70_CR36","doi-asserted-by":"crossref","unstructured":"Yang, X, Chen Y, Eide E, Regehr J (2011) Finding and understanding bugs in c compilers In: Proceedings of the 32nd ACM SIGPLAN Conference on Programming Language Design and Implementation, 283\u2013294.","DOI":"10.1145\/1993316.1993532"},{"key":"70_CR37","doi-asserted-by":"crossref","unstructured":"Yang, Z, He X, Gao J, Deng L, Smola A (2016) Stacked attention networks for image question answering In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 21\u201329.","DOI":"10.1109\/CVPR.2016.10"},{"key":"70_CR38","doi-asserted-by":"crossref","unstructured":"You, W, Zong P, Chen K, Wang X, Liao X, Bian P, Liang B (2017) Semfuzz: Semantics-based automatic generation of proof-of-concept exploits In: Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security, 2139\u20132154.","DOI":"10.1145\/3133956.3134085"},{"key":"70_CR39","unstructured":"Zong, P, Lv T, Wang D, Deng Z, Liang R, Chen K (2020) Fuzzguard: Filtering out unreachable inputs in directed grey-box fuzzing through deep learning In: 29th {USENIX} Security Symposium ({USENIX} Security 20), 
2255\u20132269."}],"container-title":["Cybersecurity"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/link.springer.com\/content\/pdf\/10.1186\/s42400-021-00070-0.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/link.springer.com\/article\/10.1186\/s42400-021-00070-0\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/link.springer.com\/content\/pdf\/10.1186\/s42400-021-00070-0.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2021,3,5]],"date-time":"2021-03-05T00:09:06Z","timestamp":1614902946000},"score":1,"resource":{"primary":{"URL":"https:\/\/cybersecurity.springeropen.com\/articles\/10.1186\/s42400-021-00070-0"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,3,5]]},"references-count":39,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2021,12]]}},"alternative-id":["70"],"URL":"https:\/\/doi.org\/10.1186\/s42400-021-00070-0","relation":{},"ISSN":["2523-3246"],"issn-type":[{"value":"2523-3246","type":"electronic"}],"subject":[],"published":{"date-parts":[[2021,3,5]]},"assertion":[{"value":"16 November 2020","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"3 January 2021","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"5 March 2021","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"We confirm that none of the authors have any competing interests in the manuscript.","order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Competing interests"}}],"article-number":"5"}}