{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,18]],"date-time":"2026-01-18T00:42:55Z","timestamp":1768696975174,"version":"3.49.0"},"reference-count":44,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2023,3,17]],"date-time":"2023-03-17T00:00:00Z","timestamp":1679011200000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2023,3,17]],"date-time":"2023-03-17T00:00:00Z","timestamp":1679011200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["U1936122"],"award-info":[{"award-number":["U1936122"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Int J Comput Intell Syst"],"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Binary code similarity detection (BCSD) is a task of detecting similarity of binary functions which are not available to the corresponding source code. It has been widely utilized to facilitate various kinds of crucial security analysis in software engineering. Because of the complexity of the program compilation process, identifying binary code similarity presents tough challenges. The most sensible binary similarity detector relies on a robust vector representation of binary code. However, few BCSD approaches are suitable to form vector representations for analyzing similarities between binaries, which may not only diverge in semantics but also in structures. And the existing solutions which only depend on hands-on feature engineering to form feature vectors, fail to take into consideration the relationships between instructions. To resolve these problems, we propose a novel and unified approach called DeepDual-SD that aims to combine the dual attributes (semantic and structural attribute). More specifically, DeepDual-SD consists of two branches, in which one text-based feature representation is driven by semantic attribute learning to exploit instruction semantics, another graph-based feature representation for structural attribute learning to investigate structural differences. Meanwhile deep embedding (DE) technology is utilized to map this information into low-dimensional vector representation. In addition, to get together the dual attributes, a fusion mechanism based on gate architecture is designed for learning to pay proper attention between the two attribute-aware embeddings. Experimental verifications are conducted on Openssl and Debian datasets for several tasks, including cross-compiler, cross-architecture and cross-version scenarios. The results demonstrate that our method outperforms the state-of-the-art BCSD methods in different scenarios in terms of detection accuracy.<\/jats:p>","DOI":"10.1007\/s44196-023-00206-9","type":"journal-article","created":{"date-parts":[[2023,3,27]],"date-time":"2023-03-27T00:50:23Z","timestamp":1679878223000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":9,"title":["DeepDual-SD: Deep Dual Attribute-Aware Embedding for Binary Code Similarity Detection"],"prefix":"10.1007","volume":"16","author":[{"given":"Jiabao","family":"Guo","sequence":"first","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0003-1326-2889","authenticated-orcid":false,"given":"Bo","family":"Zhao","sequence":"additional","affiliation":[]},{"given":"Hui","family":"Liu","sequence":"additional","affiliation":[]},{"given":"Dongdong","family":"Leng","sequence":"additional","affiliation":[]},{"given":"Yang","family":"An","sequence":"additional","affiliation":[]},{"given":"Gangli","family":"Shu","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2023,3,17]]},"reference":[{"key":"206_CR1","unstructured":"Haq, I.U., Caballero, J.: A survey of binary code similarity (2019) arXiv:1909.11424"},{"key":"206_CR2","doi-asserted-by":"crossref","unstructured":"Luo, L., Ming, J., Wu, D., Liu, P., Zhu, S.: Semantics-based obfuscation-resilient binary code similarity comparison with applications to software and algorithm plagiarism detection. In: Proceedings of the 22nd ACM SIGSOFT international symposium on foundations of software engineering, pp. 389\u2013400 (2014)","DOI":"10.1145\/2635868.2635900"},{"key":"206_CR3","doi-asserted-by":"crossref","unstructured":"S\u00e6bj\u00f8rnsen, A., Willcock, J., Panas, T., Quinlan, D.J., Su, Z.: Detecting code clones in binary executables. In: Proceedings of the Eighteenth International Symposium on Software Testing and Analysis, pp. 117\u2013128 (2009)","DOI":"10.1145\/1572272.1572287"},{"key":"206_CR4","doi-asserted-by":"crossref","unstructured":"Chen, K., Liu, P., Zhang, Y.: Achieving accuracy and scalability simultaneously in detecting application clones on android markets. In: Proceedings of 36th ACM International Conference on Software Engineering, ICSE, pp. 175\u2013186 (2014)","DOI":"10.1145\/2568225.2568286"},{"key":"206_CR5","doi-asserted-by":"crossref","unstructured":"Zhang, F., Wu, D., Liu, P., Zhu, S.: Program logic based software plagiarism detection. In: 25th IEEE International Symposium on Software Reliability Engineering, ISSRE, pp. 66\u201377 (2014)","DOI":"10.1109\/ISSRE.2014.18"},{"key":"206_CR6","doi-asserted-by":"crossref","unstructured":"Gao, J., Yang, X., Fu, Y., Jiang, Y., Sun, J.: Vulseeker: a semantic learning based vulnerability seeker for cross-platform binary. In: Proceedings of the 33rd ACM\/IEEE International Conference on Automated Software Engineering, ASE, pp. 896\u2013899 (2018)","DOI":"10.1145\/3238147.3240480"},{"key":"206_CR7","doi-asserted-by":"crossref","unstructured":"Shirani, P., Collard, L., Agba, B.L., Lebel, B., Debbabi, M., Wang, L., Hanna, A.: BINARM: scalable and efficient detection of vulnerabilities in firmware images of intelligent electronic devices. In: Detection of Intrusions and Malware, and Vulnerability Assessment-15th International Conference, DIMVA, vol. 10885, pp. 114\u2013138. Springer (2018)","DOI":"10.1007\/978-3-319-93411-2_6"},{"key":"206_CR8","doi-asserted-by":"crossref","unstructured":"Cesare, S., Xiang, Y., Zhou, W.: Control flow-based malware variant detection. In: IEEE Transactions on Dependable & Secure Computing, pp. 307\u2013317 (2014)","DOI":"10.1109\/TDSC.2013.40"},{"key":"206_CR9","doi-asserted-by":"publisher","first-page":"98","DOI":"10.1145\/2766959","volume":"34","author":"S Bell","year":"2015","unstructured":"Bell, S., Bala, K.: Learning visual similarity for product design with convolutional neural networks. ACM Trans. Graph. 34, 98\u201319810 (2015)","journal-title":"ACM Trans. Graph."},{"key":"206_CR10","doi-asserted-by":"crossref","unstructured":"Hu, X., Chiueh, T., Shin, K.G.: Large-scale malware indexing using function-call graphs. In: Proceedings of the ACM Conference on Computer and Communications Security,CCS, pp. 611\u2013620 (2009)","DOI":"10.1145\/1653662.1653736"},{"key":"206_CR11","unstructured":"Jang, J., Woo, M., Brumley, D.: Towards automatic software lineage inference. In: Proceedings of the 22th USENIX Security Symposium, pp. 81\u201396 (2013)"},{"key":"206_CR12","doi-asserted-by":"crossref","unstructured":"Farhadi, M.R., Fung, B.C.M., Charland, P., Debbabi, M.: Binclone: Detecting code clones in malware. In: Proceedings of the 18th IEEE International Conference on Software Security and Reliability, SERE, pp. 78\u201387 (2014)","DOI":"10.1109\/SERE.2014.21"},{"key":"206_CR13","doi-asserted-by":"crossref","unstructured":"Brumley, D., Poosankam, P., Song, D.X., Zheng, J.: Automatic patch-based exploit generation is possible: Techniques and implications. In: IEEE Symposium on Security and Privacy (S &P), pp. 143\u2013157. IEEE Computer Society (2008)","DOI":"10.1109\/SP.2008.17"},{"key":"206_CR14","doi-asserted-by":"crossref","unstructured":"Xu, Z., Chen, B., Chandramohan, M., Liu, Y., Song, F.: SPAIN: security patch analysis for binaries towards understanding the pain and pills. In: Proceedings of the 39th IEEE\/ACM International Conference on Software Engineering, ICSE, pp. 462\u2013472 (2017)","DOI":"10.1109\/ICSE.2017.49"},{"key":"206_CR15","doi-asserted-by":"crossref","unstructured":"Li, Y., Xu, W., Tang, Y., Mi, X., Wang, B.: Semhunt: Identifying vulnerability type with double validation in binary code. In: The 29th International Conference on Software Engineering and Knowledge Engineering, pp. 491\u2013494 (2017)","DOI":"10.18293\/SEKE2017-117"},{"key":"206_CR16","unstructured":"Flake, H.: Structural comparison of executable objects. In: Detection of Intrusions and Malware & Vulnerability Assessment, GI SIG SIDAR Workshop, DIMVA, vol. P-46, pp. 161\u2013173 (2004)"},{"key":"206_CR17","doi-asserted-by":"publisher","first-page":"8","DOI":"10.1145\/3175492","volume":"21","author":"S Alrabaee","year":"2018","unstructured":"Alrabaee, S., Shirani, P., Wang, L., Debbabi, M.: FOSSIL: a resilient and efficient system for identifying FOSS functions in malware binaries. ACM Trans. Priv. Secur. 21, 8\u20131834 (2018)","journal-title":"ACM Trans. Priv. Secur."},{"key":"206_CR18","doi-asserted-by":"crossref","unstructured":"Gao, D., Reiter, M.K., Song, D.X.: Binhunt: Automatically finding semantic differences in binary programs. In: Information and Communications Security, 10th International Conference,ICICS, pp. 238\u2013255 (2008)","DOI":"10.1007\/978-3-540-88625-9_16"},{"key":"206_CR19","doi-asserted-by":"crossref","unstructured":"Ming, J., Pan, M., Gao, D.: ibinhunt: binary hunting with inter-procedural control flow. In: Proceedings of 15th International Conference on Information Security and Cryptology ICISC, vol. 7839, pp. 92\u2013109. Springer (2012)","DOI":"10.1007\/978-3-642-37682-5_8"},{"key":"206_CR20","doi-asserted-by":"crossref","unstructured":"Zuo, F., Li, X., Young, P., Luo, L., Zeng, Q., Zhang, Z.: Neural machine translation inspired binary code similarity comparison beyond function pairs. In: Proceedings of 26th Annual Network and Distributed System Security Symposium, NDSS (2019)","DOI":"10.14722\/ndss.2019.23492"},{"key":"206_CR21","doi-asserted-by":"crossref","unstructured":"Massarelli, L., Luna, G.A.D., Petroni, F., Baldoni, R., Querzoni, L.: SAFE: self-attentive function embeddings for binary similarity. In: Proceedings of 16th International Conference on Detection of Intrusions and Malware, and Vulnerability Assessment, DIMVA, vol. 11543, pp. 309\u2013329 (2019)","DOI":"10.1007\/978-3-030-22038-9_15"},{"key":"206_CR22","doi-asserted-by":"publisher","first-page":"61","DOI":"10.1016\/j.diin.2015.01.011","volume":"12","author":"S Alrabaee","year":"2015","unstructured":"Alrabaee, S., Shirani, P., Wang, L., Debbabi, M.: Sigma: a semantic integrated graph matching approach for identifying reused functions in binary code. Digit. Investig. 12, 61\u201371 (2015)","journal-title":"Digit. Investig."},{"key":"206_CR23","doi-asserted-by":"crossref","unstructured":"Eschweiler, S., Yakdan, K., Gerhards-Padilla, E.: Discovre: efficient cross-architecture identification of bugs in binary code. In: Proceedings of 23rd Annual Network and Distributed System Security Symposium, NDSS (2016)","DOI":"10.14722\/ndss.2016.23185"},{"key":"206_CR24","doi-asserted-by":"crossref","unstructured":"Chandramohan, M., Xue, Y., Xu, Z., Liu, Y., Cho, C.Y., Tan, H.B.K.: Bingo: cross-architecture cross-os binary search. In: Proceedings of the 24th ACM International Symposium on Foundations of Software Engineering, pp. 678\u2013689 (2016)","DOI":"10.1145\/2950290.2950350"},{"key":"206_CR25","doi-asserted-by":"crossref","unstructured":"Xu, X., Liu, C., Feng, Q., Yin, H., Song, L., Song, D.: Neural network-based graph embedding for cross-platform binary code similarity detection. In: Proceedings of the ACM Conference on Computer and Communications Security, CCS, pp. 363\u2013376 (2017)","DOI":"10.1145\/3133956.3134018"},{"key":"206_CR26","doi-asserted-by":"crossref","unstructured":"Chen, D., Yuan, Z., Chen, B., Zheng, N.: Similarity learning with spatial constraints for person re-identification. In: 2016 IEEE Conference on Computer Vision and Pattern Recognition, CVPR, pp. 1268\u20131277","DOI":"10.1109\/CVPR.2016.142"},{"key":"206_CR27","doi-asserted-by":"publisher","first-page":"3911","DOI":"10.1109\/TIP.2020.2965275","volume":"29","author":"X Gao","year":"2020","unstructured":"Gao, X., Mu, T., Goulermas, J.Y., Thiyagalingam, J., Wang, M.: An interpretable deep architecture for similarity learning built upon hierarchical concepts. IEEE Trans. Image. Process. 29, 3911\u20133926 (2020)","journal-title":"IEEE Trans. Image. Process."},{"key":"206_CR28","doi-asserted-by":"crossref","unstructured":"Ou, M., Cui, P., Pei, J., Zhang, Z., Zhu, W.: Asymmetric transitivity preserving graph embedding. In: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 1105\u20131114 (2016)","DOI":"10.1145\/2939672.2939751"},{"key":"206_CR29","doi-asserted-by":"crossref","unstructured":"Heimann, M., Shen, H., Safavi, T., Koutra, D.: Regal: Representation learning-based graph alignment. In: Proceedings of the 27th ACM International Conference on Information and Knowledge Management, CIKM, pp. 117\u2013126 (2018)","DOI":"10.1145\/3269206.3271788"},{"key":"206_CR30","doi-asserted-by":"crossref","unstructured":"Ding, S.H.H., Fung, B.C.M., Charland, P.: Asm2vec: Boosting static representation robustness for binary clone search against code obfuscation and compiler optimization. In: Proceedings of IEEE Symposium on Security and Privacy, SP, pp. 472\u2013489 (2019)","DOI":"10.1109\/SP.2019.00003"},{"key":"206_CR31","doi-asserted-by":"publisher","unstructured":"Hin, D., Kan, A., Chen, H., Babar, M.A.: Linevd: statement-level vulnerability detection using graph neural networks (2022). arXiv:2203.05181 [cs]. https:\/\/doi.org\/10.48550\/arXiv.2203.05181","DOI":"10.48550\/arXiv.2203.05181"},{"key":"206_CR32","doi-asserted-by":"publisher","unstructured":"Neysiani, B.S., Morteza Babamir, S.: Automatic duplicate bug report detection using information retrieval-based versus machine learning-based approaches. In: 2020 6th International Conference on Web Research (ICWR), pp. 288\u2013293 (2020). https:\/\/doi.org\/10.1109\/ICWR49608.2020.9122288","DOI":"10.1109\/ICWR49608.2020.9122288"},{"key":"206_CR33","unstructured":"Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations (2019). arXiv:1909.11942"},{"key":"206_CR34","unstructured":"Devlin, J., Chang, M., Lee, K., Toutanova, K.: Bert: pre-training of deep bidirectional transformers for language understanding. In: Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL-HLT, pp. 4171\u20134186 (2019)"},{"key":"206_CR35","doi-asserted-by":"publisher","first-page":"1735","DOI":"10.1162\/neco.1997.9.8.1735","volume":"9","author":"S Hochreiter","year":"1997","unstructured":"Hochreiter, S., Schmidhuber, J.: Long short-term memory. Neural. Comput. 9, 1735\u20131780 (1997)","journal-title":"Neural. Comput."},{"key":"206_CR36","unstructured":"Hex-Rays: Ida pro disassembler and debugger. In: Retrieved from https:\/\/www.hex-rays.com\/products\/ida\/index.shtml (2015)"},{"key":"206_CR37","first-page":"2702","volume":"48","author":"H Dai","year":"2016","unstructured":"Dai, H., Dai, B., Song, L.: Discriminative embeddings of latent variable models for structured data. Proceedings of the 33nd International Conference on Machine Learning, ICML 48, 2702\u20132711 (2016)","journal-title":"Proceedings of the 33nd International Conference on Machine Learning, ICML"},{"key":"206_CR38","doi-asserted-by":"crossref","unstructured":"Cho, K., van Merrienboer, B., G\u00fcl\u00e7ehre, \u00c7., Bahdanau, D., Bougares, F., Schwenk, H., Bengio, Y.: Learning phrase representations using RNN encoder-decoder for statistical machine translation. In: Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing, EMNLP, pp. 1724\u20131734. ACL","DOI":"10.3115\/v1\/D14-1179"},{"key":"206_CR39","unstructured":"Openssl. Retrieved from https:\/\/www.openssl.org\/ (2020)"},{"key":"206_CR40","unstructured":"Debian. Retrieved from https:\/\/www.debian.org\/ (2020)"},{"key":"206_CR41","unstructured":"Chollet, F.: Keras. In: Retrieved from https:\/\/keras.io\/ (2015)"},{"key":"206_CR42","unstructured":"Abadi, M., Barham, P., Chen, J., Chen, Z., Davis, A., Dean, J., Devin, M., Ghemawat, S., Irving, G., Isard, M., Kudlur, M., Levenberg, J., Monga, R., Moore, S., Murray, D.G., Steiner, B., Tucker, P.A., Vasudevan, V., Warden, P., Wicke, M., Yu, Y., Zheng, X.: Tensorflow: A system for large-scale machine learning. In: 12th USENIX Symposium on Operating Systems Design and Implementation, pp. 265\u2013283 (2016)"},{"key":"206_CR43","unstructured":"Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. In: 3rd International Conference on Learning Representations, ICLR (2015)"},{"issue":"1","key":"206_CR44","doi-asserted-by":"publisher","first-page":"5","DOI":"10.1145\/963770.963772","volume":"22","author":"JL Herlocker","year":"2004","unstructured":"Herlocker, J.L., Konstan, J.A., Terveen, L.G., Riedl, J.: Evaluating collaborative filtering recommender systems. ACM Trans. Inf. Syst. 22(1), 5\u201353 (2004)","journal-title":"ACM Trans. Inf. Syst."}],"container-title":["International Journal of Computational Intelligence Systems"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s44196-023-00206-9.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s44196-023-00206-9\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s44196-023-00206-9.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,3,27]],"date-time":"2023-03-27T01:05:48Z","timestamp":1679879148000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s44196-023-00206-9"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,3,17]]},"references-count":44,"journal-issue":{"issue":"1","published-online":{"date-parts":[[2023,12]]}},"alternative-id":["206"],"URL":"https:\/\/doi.org\/10.1007\/s44196-023-00206-9","relation":{},"ISSN":["1875-6883"],"issn-type":[{"value":"1875-6883","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,3,17]]},"assertion":[{"value":"22 September 2022","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"21 February 2023","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"17 March 2023","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"Considered no such competing interests exist so, therefore, not applicable.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Conflict of interest"}},{"value":"The research does not relate to personal privacy.","order":3,"name":"Ethics","group":{"name":"EthicsHeading","label":"Ethical Approval and Consent to Participate"}},{"value":"All authors approved the final manuscript.","order":4,"name":"Ethics","group":{"name":"EthicsHeading","label":"Consent to Publication"}}],"article-number":"35"}}