{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,24]],"date-time":"2026-02-24T17:44:25Z","timestamp":1771955065008,"version":"3.50.1"},"publisher-location":"New York, NY, USA","reference-count":41,"publisher":"ACM","license":[{"start":{"date-parts":[[2022,10,26]],"date-time":"2022-10-26T00:00:00Z","timestamp":1666742400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2022,10,26]]},"DOI":"10.1145\/3545948.3545956","type":"proceedings-article","created":{"date-parts":[[2022,10,17]],"date-time":"2022-10-17T11:21:49Z","timestamp":1666005709000},"page":"350-363","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":15,"title":["BinProv: Binary Code Provenance Identification without Disassembly"],"prefix":"10.1145","author":[{"given":"Xu","family":"He","sequence":"first","affiliation":[{"name":"George Mason University, United States"}]},{"given":"Shu","family":"Wang","sequence":"additional","affiliation":[{"name":"George Mason University, United States of America"}]},{"given":"Yunlong","family":"Xing","sequence":"additional","affiliation":[{"name":"George Mason University, United States of America"}]},{"given":"Pengbin","family":"Feng","sequence":"additional","affiliation":[{"name":"George Mason University, United States of America"}]},{"given":"Haining","family":"Wang","sequence":"additional","affiliation":[{"name":"Virginia Tech, United States of America"}]},{"given":"Qi","family":"Li","sequence":"additional","affiliation":[{"name":"Tsinghua University, China"}]},{"given":"Songqing","family":"Chen","sequence":"additional","affiliation":[{"name":"George Mason University, United States of 
America"}]},{"given":"Kun","family":"Sun","sequence":"additional","affiliation":[{"name":"George Mason University, United States of America"}]}],"member":"320","published-online":{"date-parts":[[2022,10,26]]},"reference":[{"key":"e_1_3_2_1_1_1","volume-title":"25th USENIX Security Symposium (USENIX Security 16)","author":"Andriesse Dennis","year":"2016","unstructured":"Dennis Andriesse, Xi Chen, Victor Van Der\u00a0Veen, Asia Slowinska, and Herbert Bos. 2016. An in-depth analysis of disassembly on full-scale x86\/x64 binaries. In 25th USENIX Security Symposium (USENIX Security 16). USENIX Association, USA, 583\u2013600."},{"key":"e_1_3_2_1_2_1","volume-title":"23rd USENIX Security Symposium (USENIX Security 14)","author":"Bao Tiffany","year":"2014","unstructured":"Tiffany Bao, Jonathan Burket, Maverick Woo, Rafael Turner, and David Brumley. 2014. BYTEWEIGHT: Learning to recognize functions in binary code. In 23rd USENIX Security Symposium (USENIX Security 14). USENIX Association, USA, 845\u2013860."},
{"key":"e_1_3_2_1_3_1","volume-title":"USA","author":"Brown Tom","year":"2020","unstructured":"Tom Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared\u00a0D Kaplan, Prafulla Dhariwal, Arvind Neelakantan, Pranav Shyam, Girish Sastry, Amanda Askell, Sandhini Agarwal, Ariel Herbert-Voss, Gretchen Krueger, Tom Henighan, Rewon Child, Aditya Ramesh, Daniel Ziegler, Jeffrey Wu, Clemens Winter, Chris Hesse, Mark Chen, Eric Sigler, Mateusz Litwin, Scott Gray, Benjamin Chess, Jack Clark, Christopher Berner, Sam McCandlish, Alec Radford, Ilya Sutskever, and Dario Amodei. 2020. Language Models are Few-Shot Learners. In Advances in Neural Information Processing Systems, Vol.\u00a033. Curran Associates, Inc., USA, 1877\u20131901."},{"key":"e_1_3_2_1_4_1","unstructured":"Clang Team. 2020. clang - the Clang C C++ and Objective-C compiler. https:\/\/clang.llvm.org\/docs\/CommandGuide\/clang.html."},{"key":"e_1_3_2_1_5_1","volume-title":"12th USENIX Workshop on Cyber Security Experimentation and Test (CSET 19)","author":"Darki Ahmad","year":"2019","unstructured":"Ahmad Darki, Michalis Faloutsos, Nael Abu-Ghazaleh, Manu Sridharan, 2019. IDAPro for IoT Malware analysis?. In 12th USENIX Workshop on Cyber Security Experimentation and Test (CSET 19). USENIX Association, Santa Clara, CA, 15."},
{"key":"e_1_3_2_1_6_1","volume-title":"Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv e-prints abs\/1810.04805","author":"Devlin Jacob","year":"2018","unstructured":"Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2018. Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv e-prints abs\/1810.04805 (2018), 1\u201322."},{"key":"e_1_3_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.14722\/ndss.2016.23185"},{"key":"e_1_3_2_1_8_1","unstructured":"Facebook AI. 2020. RoBERTa implemented in Fairseq. https:\/\/github.com\/pytorch\/fairseq\/blob\/main\/examples\/roberta\/README.md."},{"key":"e_1_3_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1145\/2976749.2978370"},{"key":"e_1_3_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1145\/3238147.3240480"},{"key":"e_1_3_2_1_11_1","unstructured":"GCC team. 2018. Options That Control Optimization. https:\/\/gcc.gnu.org\/onlinedocs\/gcc\/Optimize-Options.html."},{"key":"e_1_3_2_1_12_1","unstructured":"Hex Rays. 2008. IDA Pro. https:\/\/www.hex-rays.com\/ida-pro\/."},
{"key":"e_1_3_2_1_13_1","unstructured":"Igor Pavlov. 2021. 7z format. https:\/\/www.7-zip.org\/7z.html."},{"key":"e_1_3_2_1_14_1","volume-title":"Vestige: Identifying Binary Code Provenance for Vulnerability Detection. In Applied Cryptography and Network Security (ACNS","author":"Ji Yuede","year":"2021","unstructured":"Yuede Ji, Lei Cui, and H.\u00a0Howie Huang. 2021. Vestige: Identifying Binary Code Provenance for Vulnerability Detection. In Applied Cryptography and Network Security (ACNS 2021). Springer International Publishing, Cham, 287\u2013310."},{"key":"e_1_3_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1145\/3395363.3397377"},{"key":"e_1_3_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.1109\/SPRO.2015.10"},{"key":"e_1_3_2_1_17_1","first-page":"1","article-title":"Revisiting Binary Code Similarity Analysis using Interpretable Feature Engineering and Lessons Learned","volume":"1","author":"Kim Dongkwan","year":"2022","unstructured":"Dongkwan Kim, Eunsoo Kim, Sang\u00a0Kil Cha, Sooel Son, and Yongdae Kim. 2022. Revisiting Binary Code Similarity Analysis using Interpretable Feature Engineering and Lessons Learned. IEEE Transactions on Software Engineering 1, 23 (2022), 1\u201323.","journal-title":"IEEE Transactions on Software Engineering"},
{"key":"e_1_3_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1145\/948109.948149"},{"key":"e_1_3_2_1_19_1","volume-title":"Roberta: A robustly optimized bert pretraining approach. arXiv e-prints abs\/1907.11692","author":"Liu Yinhan","year":"2019","unstructured":"Yinhan Liu, Myle Ott, Naman Goyal, Jingfei Du, Mandar Joshi, Danqi Chen, Omer Levy, Mike Lewis, Luke Zettlemoyer, and Veselin Stoyanov. 2019. Roberta: A robustly optimized bert pretraining approach. arXiv e-prints abs\/1907.11692 (2019), 1\u201313."},{"key":"e_1_3_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.14722\/bar.2019.23020"},{"key":"e_1_3_2_1_21_1","unstructured":"MazeGen. 2017. X86 Opcode and Instruction Reference. http:\/\/ref.x86asm.net\/coder64.html."},{"key":"e_1_3_2_1_22_1","volume-title":"Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781","author":"Mikolov Tomas","year":"2013","unstructured":"Tomas Mikolov, Kai Chen, Greg Corrado, and Jeffrey Dean. 2013. Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013), 1\u201312."},{"key":"e_1_3_2_1_23_1","volume-title":"Advances in neural information processing systems (NIPS). 
Curran Associates","author":"Mikolov Tomas","unstructured":"Tomas Mikolov, Ilya Sutskever, Kai Chen, Greg\u00a0S Corrado, and Jeff Dean. 2013. Distributed representations of words and phrases and their compositionality. In Advances in neural information processing systems (NIPS). Curran Associates, Inc., USA, 3111\u20133119."},{"key":"e_1_3_2_1_24_1","unstructured":"National Security Agency. 2019. Ghidra. https:\/\/ghidra-sre.org\/."},{"key":"e_1_3_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.14722\/bar.2020.23001"},{"key":"e_1_3_2_1_26_1","volume-title":"fairseq: A fast, extensible toolkit for sequence modeling. arXiv preprint arXiv:1904.01038","author":"Ott Myle","year":"2019","unstructured":"Myle Ott, Sergey Edunov, Alexei Baevski, Angela Fan, Sam Gross, Nathan Ng, David Grangier, and Michael Auli. 2019. fairseq: A fast, extensible toolkit for sequence modeling. arXiv preprint arXiv:1904.01038 (2019), 1\u20136."},{"key":"e_1_3_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.14722\/ndss.2021.23112"},{"key":"e_1_3_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.3115\/v1\/D14-1162"},
{"key":"e_1_3_2_1_29_1","volume-title":"Deep contextualized word representations. arXiv preprint arXiv:1802.05365","author":"Peters E","year":"2018","unstructured":"Matthew\u00a0E Peters, Mark Neumann, Mohit Iyyer, Matt Gardner, Christopher Clark, Kenton Lee, and Luke Zettlemoyer. 2018. Deep contextualized word representations. arXiv preprint arXiv:1802.05365 (2018), 1\u201315."},{"key":"e_1_3_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.diin.2015.05.015"},{"key":"e_1_3_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.1145\/3453483.3454035"},{"key":"e_1_3_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.1145\/2001420.2001433"},{"key":"e_1_3_2_1_33_1","doi-asserted-by":"publisher","DOI":"10.1145\/1806672.1806678"},{"key":"e_1_3_2_1_34_1","volume-title":"Proceedings of the 23rd national conference on Artificial intelligence (AAAI\u201908)","author":"Rosenblum E","year":"2008","unstructured":"Nathan\u00a0E Rosenblum, Xiaojin Zhu, Barton\u00a0P Miller, and Karen Hunt. 2008. Learning to Analyze Binary Computer Code.. In Proceedings of the 23rd national conference on Artificial intelligence (AAAI\u201908). AAAI Press, Chicago, IL, USA, 798\u2013804."},
{"key":"e_1_3_2_1_35_1","volume-title":"DisCo: Combining Disassemblers for Improved Performance. In 24th International Symposium on Research in Attacks, Intrusions and Defenses (RAID\u201921)","author":"Shaila Sri","year":"2021","unstructured":"Sri Shaila, Ahmad Darki, Michalis Faloutsos, Nael Abu-Ghazaleh, and Manu Sridharan. 2021. DisCo: Combining Disassemblers for Improved Performance. In 24th International Symposium on Research in Attacks, Intrusions and Defenses (RAID\u201921). Association for Computing Machinery, New York, NY, USA, 148\u2013161."},{"key":"e_1_3_2_1_36_1","doi-asserted-by":"publisher","DOI":"10.1109\/SP.2016.17"},{"key":"e_1_3_2_1_37_1","unstructured":"The Algorithms. 2021. Set of algorithms implemented in C. https:\/\/thealgorithms.github.io\/c."},{"key":"e_1_3_2_1_38_1","unstructured":"UCSB. 2016. Angr. http:\/\/angr.io\/."},{"key":"e_1_3_2_1_39_1","doi-asserted-by":"publisher","DOI":"10.5555\/3295222.3295349"},{"key":"e_1_3_2_1_40_1","unstructured":"Wikipedia contributors. 2021. Executable and Linkable Format \u2014 Wikipedia. https:\/\/en.wikipedia.org\/w\/index.php?title=Executable_and_Linkable_Format&oldid=1047842416."},
{"key":"e_1_3_2_1_41_1","doi-asserted-by":"publisher","DOI":"10.1145\/3133956.3134018"}],"event":{"name":"RAID 2022: 25th International Symposium on Research in Attacks, Intrusions and Defenses","location":"Limassol Cyprus","acronym":"RAID 2022"},"container-title":["Proceedings of the 25th International Symposium on Research in Attacks, Intrusions and Defenses"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3545948.3545956","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3545948.3545956","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T19:30:27Z","timestamp":1750188627000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3545948.3545956"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,10,26]]},"references-count":41,"alternative-id":["10.1145\/3545948.3545956","10.1145\/3545948"],"URL":"https:\/\/doi.org\/10.1145\/3545948.3545956","relation":{},"subject":[],"published":{"date-parts":[[2022,10,26]]},"assertion":[{"value":"2022-10-26","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}