{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,6,13]],"date-time":"2026-06-13T05:38:27Z","timestamp":1781329107840,"version":"3.54.1"},"reference-count":42,"publisher":"Association for Computing Machinery (ACM)","issue":"3","content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Softw. Eng. Methodol."],"published-print":{"date-parts":[[2026,3,31]]},"abstract":"<jats:p>\n                    Binary reverse engineering is pivotal in the realm of cybersecurity, enabling critical applications such as malware analysis, legacy code hardening, and vulnerability detection. However, the challenge of recovering structural information from binaries, especially stripped ones, persists due to the significant loss of variable boundaries, types, names, and dataflow information during compilation. In this article, we introduce\n                    <jats:italic toggle=\"yes\">Hy<\/jats:italic>\n                    brid\n                    <jats:italic toggle=\"yes\">RE<\/jats:italic>\n                    asoning for\n                    <jats:italic toggle=\"yes\">S<\/jats:italic>\n                    tructure Recovery (\n                    <jats:monospace>HyRES<\/jats:monospace>\n                    ), an innovative hybrid reasoning technique that energizes static analysis, Large Language Model (LLM), and heuristic methods to recover data structures from stripped binaries. It analyzes the structure layout and proficiently infer its semantics via LLM, and utilizes semantics to perform semantic-enhanced structure aggregation, which overcomes the need for complete dataflow.\n                    <jats:monospace>HyRES<\/jats:monospace>\n                    outperforms State-of-the-Art (SOTA) solutions in terms of structure pointer identification and layout recovery. Specifically,\n                    <jats:monospace>HyRES<\/jats:monospace>\n                    achieves 65.1% higher recall and 33.4% higher accuracy than the SOTA, while also being 64.2% faster than existing SOTA solutions. Comprehensive experiments demonstrate\n                    <jats:monospace>HyRES<\/jats:monospace>\n                    \u2019s superior performance and practical utility in real-world reverse engineering tasks, marking a significant advancement in binary analysis.\n                  <\/jats:p>","DOI":"10.1145\/3736719","type":"journal-article","created":{"date-parts":[[2025,5,22]],"date-time":"2025-05-22T22:16:53Z","timestamp":1747952213000},"page":"1-28","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":2,"title":["<tt>HyRES<\/tt>\n                    : Recovering Data Structures in Binaries via Semantic Enhanced Hybrid Reasoning"],"prefix":"10.1145","volume":"35","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-1020-9006","authenticated-orcid":false,"given":"Zihan","family":"Sha","sequence":"first","affiliation":[{"name":"Key Laboratory of Cyberspace Security, Ministry of Education, Zhengzhou, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-2797-1355","authenticated-orcid":false,"given":"Hui","family":"Shu","sequence":"additional","affiliation":[{"name":"Key Laboratory of Cyberspace Security, Ministry of Education, Zhengzhou, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-0536-5039","authenticated-orcid":false,"given":"Hao","family":"Wang","sequence":"additional","affiliation":[{"name":"Tsinghua University, Beijing, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0009-0003-2318-9061","authenticated-orcid":false,"given":"Zeyu","family":"Gao","sequence":"additional","affiliation":[{"name":"Tsinghua University, Beijing, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0009-0009-3326-6378","authenticated-orcid":false,"given":"Yang","family":"Lan","sequence":"additional","affiliation":[{"name":"Key Laboratory of Cyberspace Security, Ministry of Education, Zhengzhou, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-7894-8828","authenticated-orcid":false,"given":"Chao","family":"Zhang","sequence":"additional","affiliation":[{"name":"Tsinghua University, Beijing, China"}],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"320","published-online":{"date-parts":[[2026,2,13]]},"reference":[{"key":"e_1_3_2_2_2","unstructured":"Ghidra. 2019. Ghidra Software Reverse Engineering Framework. Retrieved August 4 2019 from http:\/\/ghidra.net\/"},{"key":"e_1_3_2_3_2","doi-asserted-by":"publisher","DOI":"10.1109\/CGO57630.2024.10444788"},{"key":"e_1_3_2_4_2","unstructured":"Jinze Bai Shuai Bai Yunfei Chu Zeyu Cui Kai Dang Xiaodong Deng Yang Fan Wenbin Ge Yu Han Fei Huang et al. 2023. Qwen Technical Report. arXiv:2309.16609. Retrieved from https:\/\/arxiv.org\/abs\/2309.16609"},{"key":"e_1_3_2_5_2","first-page":"1877","article-title":"Language models are few-shot learners","volume":"33","author":"Brown Tom","year":"2020","unstructured":"Tom Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared D. Kaplan, Prafulla Dhariwal, Arvind Neelakantan, Pranav Shyam, Girish Sastry, Amanda Askell, et al. 2020. Language models are few-shot learners. In Proceedings of the Advances in Neural Information Processing Systems, Vol. 33, 1877\u20131901.","journal-title":"Proceedings of the Advances in Neural Information Processing Systems"},{"key":"e_1_3_2_6_2","unstructured":"David Brumley and James Newsome. 2006. Alias Analysis for Assembly. Technical Report CMUCS-06-180. Carnegie Mellon University School of Computer Science."},{"key":"e_1_3_2_7_2","unstructured":"S\u00e9bastien Bubeck Varun Chandrasekaran Ronen Eldan Johannes Gehrke Eric Horvitz Ece Kamar Peter Lee Yin Tat Lee Yuanzhi Li Scott Lundberg et al. 2023. Sparks of Artificial General Intelligence: Early Experiments with GPT-4. Retrieved from https:\/\/www.microsoft.com\/en-us\/research\/publication\/sparks-of-artificial-general-intelligence-early-experiments-with-gpt-4\/"},{"key":"e_1_3_2_8_2","doi-asserted-by":"publisher","DOI":"10.1145\/1653662.1653729"},{"key":"e_1_3_2_9_2","doi-asserted-by":"publisher","DOI":"10.1109\/SANER53432.2022.00025"},{"key":"e_1_3_2_10_2","unstructured":"Qibin Chen Jeremy Lacomis Edward J. Schwartz Claire Le Goues Graham Neubig and Bogdan Vasilescu. 2022. Augmenting decompiler output with learned variable names and types. In Proceedings of the 31st USENIX Security Symposium. Retrieved from https:\/\/www.usenix.org\/conference\/usenixsecurity22\/presentation\/chen-qibin"},{"key":"e_1_3_2_11_2","doi-asserted-by":"publisher","DOI":"10.14722\/ndss.2020.24311"},{"key":"e_1_3_2_12_2","doi-asserted-by":"publisher","DOI":"10.1145\/3274694.3274739"},{"key":"e_1_3_2_13_2","doi-asserted-by":"publisher","DOI":"10.5555\/2831143.2831191"},{"key":"e_1_3_2_14_2","doi-asserted-by":"publisher","DOI":"10.5555\/3489212.3489273"},{"key":"e_1_3_2_15_2","doi-asserted-by":"publisher","DOI":"10.1109\/SP.2016.30"},{"key":"e_1_3_2_16_2","doi-asserted-by":"publisher","DOI":"10.1145\/3238147.3240480"},{"key":"e_1_3_2_17_2","unstructured":"Hex-Rays. 2023. IDA Pro Disassembler and Debugger. Retrieved May 10 2023 from https:\/\/hex-rays.com\/ida-pro\/"},{"key":"e_1_3_2_18_2","first-page":"422","volume-title":"Computer Aided Verification","author":"Jordan Herbert","year":"2016","unstructured":"Herbert Jordan, Bernhard Scholz, and Pavle Suboti\u0107. 2016. Souffl\u00e9: On synthesis of program analyzers. In Computer Aided Verification. Swarat Chaudhuri and Azadeh Farzan (Eds.), Springer International Publishing, Cham, 422\u2013430."},{"key":"e_1_3_2_19_2","doi-asserted-by":"publisher","DOI":"10.1109\/ASE.2017.8115648"},{"key":"e_1_3_2_20_2","doi-asserted-by":"publisher","DOI":"10.1109\/CGO.2004.1281665"},{"key":"e_1_3_2_21_2","doi-asserted-by":"publisher","unstructured":"JongHyup Lee Thanassis Avgerinos and David Brumley. 2011. TIE: Principled reverse engineering of types in binary programs. In Network and Distributed System Security Symposium. DOI: 10.20935\/AcadMatSci6230","DOI":"10.20935\/AcadMatSci6230"},{"key":"e_1_3_2_22_2","unstructured":"Zehan Li Xin Zhang Yanzhao Zhang Dingkun Long Pengjun Xie and Meishan Zhang. 2023. Towards general text embeddings with multi-stage contrastive learning. arXiv:2308.03281. Retrieved from https:\/\/arxiv.org\/abs\/2308.03281"},{"key":"e_1_3_2_23_2","doi-asserted-by":"publisher","DOI":"10.5555\/2788959.2788964"},{"key":"e_1_3_2_24_2","doi-asserted-by":"publisher","DOI":"10.1145\/1706299.1706351"},{"key":"e_1_3_2_25_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-81-322-2268-2_59"},{"key":"e_1_3_2_26_2","doi-asserted-by":"publisher","DOI":"10.1109\/SP40001.2021.00012"},{"key":"e_1_3_2_27_2","unstructured":"Alec Radford Karthik Narasimhan Tim Salimans and Ilya Sutskever. 2018. Improving Language Understanding by Generative Pre-Training. OpenAI. Retrieved from https:\/\/openai.com\/index\/language-unsupervised\/"},{"issue":"8","key":"e_1_3_2_28_2","first-page":"9","article-title":"Language models are unsupervised multitask learners","volume":"1","author":"Radford Alec","year":"2019","unstructured":"Alec Radford, Jeffrey Wu, Rewon Child, David Luan, Dario Amodei, and Ilya Sutskever. 2019. Language models are unsupervised multitask learners. OpenAI Blog 1, 8 (2019), 9.","journal-title":"OpenAI Blog"},{"key":"e_1_3_2_29_2","doi-asserted-by":"publisher","DOI":"10.1109\/eStream61684.2024.10542617"},{"key":"e_1_3_2_30_2","doi-asserted-by":"publisher","DOI":"10.1145\/2420950.2420962"},{"key":"e_1_3_2_31_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-031-09484-2_6"},{"key":"e_1_3_2_32_2","doi-asserted-by":"publisher","DOI":"10.1145\/3243734.3243793"},{"key":"e_1_3_2_33_2","doi-asserted-by":"publisher","DOI":"10.5555\/2534766.2534797"},{"key":"e_1_3_2_34_2","doi-asserted-by":"publisher","DOI":"10.1109\/SP.2016.17"},{"key":"e_1_3_2_35_2","volume-title":"Proceedings of the Network and Distributed System Security Symposium","author":"Slowinska Asia","year":"2011","unstructured":"Asia Slowinska, Traian Stancescu, and Herbert Bos. 2011. Howard: A dynamic excavator for reverse engineering data structures. In Proceedings of the Network and Distributed System Security Symposium. Retrieved from https:\/\/api.semanticscholar.org\/CorpusID:43281"},{"key":"e_1_3_2_36_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-24206-9_14"},{"key":"e_1_3_2_37_2","unstructured":"Hugo Touvron Thibaut Lavril Gautier Izacard Xavier Martinet Marie-Anne Lachaux Timoth\u00e9e Lacroix Baptiste Rozi\u00e8re Naman Goyal Eric Hambro Faisal Azhar et al. 2023. Llama: Open and efficient foundation language models. arXiv:2302.13971. Retrieved from https:\/\/arxiv.org\/abs\/2302.13971"},{"key":"e_1_3_2_38_2","doi-asserted-by":"publisher","DOI":"10.1109\/SecDev.2017.14"},{"key":"e_1_3_2_39_2","doi-asserted-by":"publisher","DOI":"10.1145\/3533767.3534367"},{"key":"e_1_3_2_40_2","doi-asserted-by":"publisher","DOI":"10.1145\/3658644.3670340"},{"key":"e_1_3_2_41_2","doi-asserted-by":"publisher","DOI":"10.1109\/SP40000.2020.00035"},{"key":"e_1_3_2_42_2","doi-asserted-by":"publisher","DOI":"10.1109\/ASE51524.2021.9678910"},{"key":"e_1_3_2_43_2","doi-asserted-by":"publisher","DOI":"10.1109\/SP40001.2021.00051"}],"container-title":["ACM Transactions on Software Engineering and Methodology"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3736719","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,2,13]],"date-time":"2026-02-13T14:35:31Z","timestamp":1770993331000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3736719"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2026,2,13]]},"references-count":42,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2026,3,31]]}},"alternative-id":["10.1145\/3736719"],"URL":"https:\/\/doi.org\/10.1145\/3736719","relation":{},"ISSN":["1049-331X","1557-7392"],"issn-type":[{"value":"1049-331X","type":"print"},{"value":"1557-7392","type":"electronic"}],"subject":[],"published":{"date-parts":[[2026,2,13]]},"assertion":[{"value":"2024-10-13","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2025-05-18","order":2,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2026-02-13","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}