{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,17]],"date-time":"2026-02-17T11:50:33Z","timestamp":1771329033568,"version":"3.50.1"},"reference-count":69,"publisher":"Association for Computing Machinery (ACM)","issue":"OOPSLA","license":[{"start":{"date-parts":[[2020,11,13]],"date-time":"2020-11-13T00:00:00Z","timestamp":1605225600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100000923","name":"Australian Research Council","doi-asserted-by":"crossref","award":["DP200101328"],"award-info":[{"award-number":["DP200101328"]}],"id":[{"id":"10.13039\/501100000923","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["Proc. ACM Program. Lang."],"published-print":{"date-parts":[[2020,11,13]]},"abstract":"<jats:p>Code embedding, as an emerging paradigm for source code analysis, has attracted much attention over the past few years. It aims to represent code semantics through distributed vector representations, which can be used to support a variety of program analysis tasks (e.g., code summarization and semantic labeling). However, existing code embedding approaches are intraprocedural and alias-unaware, and they ignore the asymmetric transitivity of directed graphs abstracted from source code; thus they remain ineffective at preserving the structural information of code.<\/jats:p><jats:p>This paper presents Flow2Vec, a new code embedding approach that precisely preserves interprocedural program dependence (a.k.a. value-flows). By approximating the high-order proximity, i.e., the asymmetric transitivity of value-flows, Flow2Vec embeds control-flows and alias-aware data-flows of a program in a low-dimensional vector space. 
Our value-flow embedding is formulated as matrix multiplication to preserve context-sensitive transitivity through CFL reachability by filtering out infeasible value-flow paths. We have evaluated Flow2Vec using 32 popular open-source projects. Results from our experiments show that Flow2Vec successfully boosts the performance of two recent code embedding approaches, code2vec and code2seq, for two client applications, i.e., code classification and code summarization. For code classification, Flow2Vec improves code2vec with an average increase of 21.2%, 20.1% and 20.7% in precision, recall and F1, respectively. For code summarization, Flow2Vec outperforms code2seq by an average of 13.2%, 18.8% and 16.0% in precision, recall and F1, respectively.<\/jats:p>","DOI":"10.1145\/3428301","type":"journal-article","created":{"date-parts":[[2020,11,24]],"date-time":"2020-11-24T23:36:06Z","timestamp":1606260966000},"page":"1-27","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":84,"title":["Flow2Vec: value-flow-based precise code embedding"],"prefix":"10.1145","volume":"4","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-9510-6574","authenticated-orcid":false,"given":"Yulei","family":"Sui","sequence":"first","affiliation":[{"name":"University of Technology Sydney, Australia"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Xiao","family":"Cheng","sequence":"additional","affiliation":[{"name":"Beijing University of Posts and Telecommunications, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Guanqin","family":"Zhang","sequence":"additional","affiliation":[{"name":"University of Technology Sydney, Australia"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Haoyu","family":"Wang","sequence":"additional","affiliation":[{"name":"Beijing University of Posts and Telecommunications, 
China"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2020,11,13]]},"reference":[{"key":"e_1_2_2_1_1","doi-asserted-by":"publisher","DOI":"10.1145\/1985793.1985898"},{"key":"e_1_2_2_2_1","doi-asserted-by":"crossref","unstructured":"Miltiadis Allamanis Earl T Barr Christian Bird and Charles Sutton. 2015. Suggesting Accurate Method and Class Names. In FSE '15. 38\u201349.","DOI":"10.1145\/2786805.2786849"},{"key":"e_1_2_2_3_1","unstructured":"Miltiadis Allamanis Marc Brockschmidt and Mahmoud Khademi. 2018. Learning to represent programs with graphs. (2018)."},{"key":"e_1_2_2_4_1","volume-title":"ICML '16","author":"Allamanis Miltiadis","year":"2016"},{"key":"e_1_2_2_5_1","unstructured":"Uri Alon Shaked Brody Omer Levy and Eran Yahav. 2019a. code2seq: Generating sequences from structured representations of code. ICLR '19 (2019)."},{"key":"e_1_2_2_6_1","doi-asserted-by":"publisher","DOI":"10.1145\/3192366.3192412"},{"key":"e_1_2_2_7_1","doi-asserted-by":"crossref","unstructured":"Uri Alon Meital Zilberstein Omer Levy and Eran Yahav. 2019b. code2vec: Learning distributed representations of code. ACM POPL 3 (2019) 40.","DOI":"10.1145\/3290353"},{"key":"e_1_2_2_9_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-662-53413-7_5"},{"key":"e_1_2_2_10_1","volume-title":"Flow-Sensitive Type-Based Heap Cloning. 
In ECOOP '20","author":"Barbar Mohamad","year":"2020"},{"key":"e_1_2_2_11_1","doi-asserted-by":"crossref","unstructured":"Mikhail Belkin and Partha Niyogi. 2002. Laplacian eigenmaps and spectral techniques for embedding and clustering. In NeurIPS '02. 585\u2013591.","DOI":"10.7551\/mitpress\/1120.003.0080"},{"key":"e_1_2_2_12_1","volume-title":"Alice Shoshana Jakobovits, and Torsten Hoefler","author":"Ben-Nun Tal","year":"2018"},{"key":"e_1_2_2_13_1","doi-asserted-by":"publisher","DOI":"10.1145\/268946.268966"},{"key":"e_1_2_2_14_1","doi-asserted-by":"publisher","DOI":"10.1109\/TKDE.2018.2807452"},{"key":"e_1_2_2_15_1","doi-asserted-by":"publisher","DOI":"10.1109\/METRICS.2005.28"},{"key":"e_1_2_2_16_1","unstructured":"Xinyun Chen Chang Liu and Dawn Song. 2018. Tree-to-tree neural networks for program translation. In NeurIPS '18. 2547\u20132557."},{"key":"e_1_2_2_17_1","doi-asserted-by":"publisher","DOI":"10.1145\/99583.99594"},{"key":"e_1_2_2_18_1","doi-asserted-by":"publisher","DOI":"10.1007\/3-540-61053-7_66"},{"key":"e_1_2_2_19_1","doi-asserted-by":"crossref","unstructured":"Peng Cui Xiao Wang Jian Pei and Wenwu Zhu. 2018. A survey on network embedding. TKDE 31 5 (2018) 833\u2013852.","DOI":"10.1109\/TKDE.2018.2849727"},{"key":"e_1_2_2_20_1","volume-title":"Bart De Moor, and Joos Vandewalle","author":"Lathauwer Lieven De","year":"2000"},{"key":"e_1_2_2_21_1","doi-asserted-by":"crossref","unstructured":"Jeanne Ferrante Karl J Ottenstein and Joe D Warren. 1987. The program dependence graph and its use in optimization. ACM TOPLAS 9 3 (1987) 319\u2013349.","DOI":"10.1145\/24039.24041"},
{"key":"e_1_2_2_22_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.jss.2007.03.004"},{"key":"e_1_2_2_23_1","doi-asserted-by":"publisher","DOI":"10.1109\/32.83912"},{"key":"e_1_2_2_24_1","doi-asserted-by":"publisher","DOI":"10.1145\/2939672.2939754"},{"key":"e_1_2_2_25_1","doi-asserted-by":"publisher","DOI":"10.1145\/1250734.1250767"},{"key":"e_1_2_2_26_1","doi-asserted-by":"publisher","DOI":"10.1109\/CGO.2011.5764696"},{"key":"e_1_2_2_27_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICSE.2012.6227135"},{"key":"e_1_2_2_28_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.laa.2009.03.003"},{"key":"e_1_2_2_29_1","doi-asserted-by":"publisher","DOI":"10.1145\/3196321.3196334"},{"key":"e_1_2_2_30_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/P16-1195"},{"key":"e_1_2_2_31_1","doi-asserted-by":"crossref","unstructured":"Leo Katz. 1953. A new status index derived from sociometric analysis. Psychometrika 18 1 (1953) 39\u201343.","DOI":"10.1007\/BF02289026"},{"key":"e_1_2_2_32_1","volume-title":"Kingma and Jimmy Ba","author":"Diederik","year":"2015"},{"key":"e_1_2_2_33_1","doi-asserted-by":"publisher","DOI":"10.1145\/996841.996867"},{"key":"e_1_2_2_34_1","doi-asserted-by":"publisher","DOI":"10.1109\/MSR.2019.00013"},{"key":"e_1_2_2_35_1","doi-asserted-by":"publisher","DOI":"10.1145\/567532.567555"},{"key":"e_1_2_2_36_1","doi-asserted-by":"publisher","DOI":"10.5555\/977395.977673"},{"key":"e_1_2_2_37_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-32304-2_3"},{"key":"e_1_2_2_38_1","doi-asserted-by":"crossref","unstructured":"Ondrej Lhot\u00e1k and Kwok-Chiang Andrew Chung. 2011. Points-To Analysis with Efficient Strong Updates. In POPL '11. 3\u201316.","DOI":"10.1145\/1925844.1926389"},
{"key":"e_1_2_2_39_1","doi-asserted-by":"publisher","DOI":"10.1145\/2025113.2025160"},{"key":"e_1_2_2_40_1","doi-asserted-by":"publisher","DOI":"10.14722\/ndss.2018.23158"},{"key":"e_1_2_2_41_1","doi-asserted-by":"publisher","DOI":"10.1145\/3219819.3220034"},{"key":"e_1_2_2_42_1","volume-title":"FSE '03 28","author":"Benjamin Livshits V","year":"2003"},{"key":"e_1_2_2_43_1","doi-asserted-by":"crossref","unstructured":"M T Luong H Pham and C D Manning. 2015. Effective approaches to attention-based neural machine translation. arXiv preprint arXiv:1508.04025 (2015).","DOI":"10.18653\/v1\/D15-1166"},{"key":"e_1_2_2_44_1","volume-title":"ICML '14","author":"Maddison Chris","year":"2014"},{"key":"e_1_2_2_45_1","volume-title":"NeurIPS '13","author":"Mikolov Tomas","year":"2013"},{"key":"e_1_2_2_46_1","doi-asserted-by":"publisher","DOI":"10.1145\/2939672.2939751"},{"key":"e_1_2_2_47_1","doi-asserted-by":"publisher","DOI":"10.1145\/2623330.2623732"},{"key":"e_1_2_2_48_1","doi-asserted-by":"publisher","DOI":"10.1145\/3276517"},{"key":"e_1_2_2_49_1","doi-asserted-by":"publisher","DOI":"10.1145\/2594291.2594321"},{"key":"e_1_2_2_50_1","first-page":"11","article-title":"Program analysis via graph reachability","volume":"40","author":"Reps Thomas","year":"1998","journal-title":"IST"},{"key":"e_1_2_2_51_1","doi-asserted-by":"publisher","DOI":"10.1109\/WPC.2003.1199195"},{"key":"e_1_2_2_52_1","doi-asserted-by":"publisher","DOI":"10.1145\/2884781.2884877"},{"key":"e_1_2_2_53_1","doi-asserted-by":"crossref","unstructured":"Prithviraj Sen Galileo Namata Mustafa Bilgic Lise Getoor Brian Galligher and Tina Eliassi-Rad. 2008. Collective classification in network data. AI magazine 29 3 (2008) 93.","DOI":"10.1609\/aimag.v29i3.2157"},
{"key":"e_1_2_2_54_1","doi-asserted-by":"publisher","DOI":"10.1145\/3192366.3192418"},{"key":"e_1_2_2_55_1","doi-asserted-by":"publisher","DOI":"10.1145\/1869459.1869474"},{"key":"e_1_2_2_56_1","volume-title":"Vacha Dave, Yin Zhang, and Lili Qiu.","author":"Song Han Hee","year":"2009"},{"key":"e_1_2_2_57_1","doi-asserted-by":"crossref","unstructured":"Manu Sridharan and Rastislav Bod\u00edk. 2006. Refinement-based context-sensitive points-to analysis for Java. PLDI 41 6 (2006) 387\u2013400.","DOI":"10.1145\/1133255.1134027"},{"key":"e_1_2_2_58_1","doi-asserted-by":"publisher","DOI":"10.1145\/2892208.2892235"},{"key":"e_1_2_2_59_1","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v33i01.3301265"},{"key":"e_1_2_2_60_1","volume-title":"IJCNLP.","author":"Tai Kai Sheng"},{"key":"e_1_2_2_61_1","doi-asserted-by":"publisher","DOI":"10.1145\/2736277.2741093"},{"key":"e_1_2_2_62_1","doi-asserted-by":"publisher","DOI":"10.1145\/775047.775141"},{"key":"e_1_2_2_63_1","volume-title":"AAAI '17","author":"Wang Xiao","year":"2017"},{"key":"e_1_2_2_64_1","volume-title":"AAAI '14","author":"Wang Zhen","year":"2014"},{"key":"e_1_2_2_65_1","volume-title":"ICSE '81","author":"Weiser Mark","year":"1981"},{"key":"e_1_2_2_66_1","volume-title":"ICML '15","author":"Xu Kelvin","year":"2015"},{"key":"e_1_2_2_67_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICSE.2019.00086"},{"key":"e_1_2_2_68_1","doi-asserted-by":"publisher","DOI":"10.1145\/3219819.3219969"},{"key":"e_1_2_2_69_1","doi-asserted-by":"publisher","DOI":"10.1145\/3236024.3236068"},{"key":"e_1_2_2_70_1","volume-title":"NeurIPS '19","author":"Zhou 
Yaqin","year":"2019"}],"container-title":["Proceedings of the ACM on Programming Languages"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3428301","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3428301","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T22:02:58Z","timestamp":1750197778000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3428301"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,11,13]]},"references-count":69,"journal-issue":{"issue":"OOPSLA","published-print":{"date-parts":[[2020,11,13]]}},"alternative-id":["10.1145\/3428301"],"URL":"https:\/\/doi.org\/10.1145\/3428301","relation":{},"ISSN":["2475-1421"],"issn-type":[{"value":"2475-1421","type":"electronic"}],"subject":[],"published":{"date-parts":[[2020,11,13]]},"assertion":[{"value":"2020-11-13","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}