{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,6,4]],"date-time":"2026-06-04T15:22:09Z","timestamp":1780586529495,"version":"3.54.1"},"publisher-location":"New York, NY, USA","reference-count":59,"publisher":"ACM","license":[{"start":{"date-parts":[[2022,5,23]],"date-time":"2022-05-23T00:00:00Z","timestamp":1653264000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2022,5,23]]},"DOI":"10.1145\/3524842.3527947","type":"proceedings-article","created":{"date-parts":[[2022,10,18]],"date-time":"2022-10-18T00:08:36Z","timestamp":1666051716000},"page":"511-523","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":9,"title":["Senatus"],"prefix":"10.1145","author":[{"given":"Fran","family":"Silavong","sequence":"first","affiliation":[{"name":"JPMorgan Chase, London, United Kingdom"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Sean","family":"Moran","sequence":"additional","affiliation":[{"name":"JPMorgan Chase, London, United Kingdom"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Antonios","family":"Georgiadis","sequence":"additional","affiliation":[{"name":"JPMorgan Chase, London, United Kingdom"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Rohan","family":"Saphal","sequence":"additional","affiliation":[{"name":"JPMorgan Chase, Glasgow, United Kingdom"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Robert","family":"Otter","sequence":"additional","affiliation":[{"name":"JPMorgan Chase, London, United 
Kingdom"}],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"320","published-online":{"date-parts":[[2022,10,17]]},"reference":[{"key":"e_1_3_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.1145\/3212695"},{"key":"e_1_3_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.1145\/3290353"},{"key":"e_1_3_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1145\/1327452.1327494"},{"key":"e_1_3_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1007\/s10664-010-9144-6"},{"key":"e_1_3_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICSM.2015.7332498"},{"key":"e_1_3_2_1_6_1","unstructured":"Avishkar Bhoopchand Tim Rockt\u00e4schel Earl Barr and Sebastian Riedel. 2016. Learning Python Code Suggestion with a Sparse Pointer Network. arXiv:1611.08307 [cs.NE]"},{"key":"e_1_3_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1109\/SEQUEN.1997.666900"},{"key":"e_1_3_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1145\/276698.276781"},{"key":"e_1_3_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1145\/1148170.1148284"},{"key":"e_1_3_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1145\/3338906.3340458"},{"key":"e_1_3_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1145\/509907.509965"},{"key":"e_1_3_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-00593-0_26"},{"key":"e_1_3_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICPC.2011.26"},{"key":"e_1_3_2_1_14_1","volume-title":"Proceedings of the Twenty-Fourth ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems","author":"Cormode Graham","unstructured":"Graham Cormode and S. Muthukrishnan. 2005. Space Efficient Mining of Multi-graph Streams. In Proceedings of the Twenty-Fourth ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems (Baltimore, Maryland) (PODS '05). 
Association for Computing Machinery, New York, NY, USA, 271--282."},{"key":"e_1_3_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1145\/1242572.1242610"},{"key":"e_1_3_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.1145\/997817.997857"},{"key":"e_1_3_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1145\/3422392.3422462"},{"key":"e_1_3_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1145\/355744.355745"},{"key":"e_1_3_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1145\/3180155.3180167"},{"key":"e_1_3_2_1_20_1","volume-title":"CodeSearchNet Challenge: Evaluating the State of Semantic Code Search. CoRR abs\/1909.09436","author":"Husain Hamel","year":"2019","unstructured":"Hamel Husain, Ho-Hsiang Wu, Tiferet Gazit, Miltiadis Allamanis, and Marc Brockschmidt. 2019. CodeSearchNet Challenge: Evaluating the State of Semantic Code Search. CoRR abs\/1909.09436 (2019)."},{"key":"e_1_3_2_1_21_1","volume-title":"CodeSearchNet Challenge: Evaluating the State of Semantic Code Search. (June","author":"Husain Hamel","year":"2020","unstructured":"Hamel Husain, Ho-Hsiang Wu, Tiferet Gazit, Miltiadis Allamanis, and Marc Brockschmidt. 2020. CodeSearchNet Challenge: Evaluating the State of Semantic Code Search. 
(June 2020)."},{"key":"e_1_3_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.1145\/276698.276876"},{"key":"e_1_3_2_1_23_1","volume-title":"Improved Consistent Sampling, Weighted Minhash and L1 Sketching (ICDM '10)","author":"Ioffe Sergey","unstructured":"Sergey Ioffe. 2010. Improved Consistent Sampling, Weighted Minhash and L1 Sketching (ICDM '10). IEEE Computer Society, USA, 246--255."},{"key":"e_1_3_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICSE.2007.30"},{"key":"e_1_3_2_1_25_1","volume-title":"Research, and Development Department","author":"Jones K.S.","year":"1975","unstructured":"K.S. Jones, C.J. Van Rijsbergen, British Library. Research, and Development Department. 1975. Report on the Need for and Provision of an 'ideal' Information Retrieval Test Collection. University Computer Laboratory. https:\/\/books.google.co.uk\/books?id=cuGnSgAACAAJ"},{"key":"e_1_3_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.1109\/TSE.2002.1019480"},{"key":"e_1_3_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.1145\/3180155.3180187"},{"key":"e_1_3_2_1_28_1","doi-asserted-by":"crossref","unstructured":"Ken Krugler. 2013. Krugle Code Search Architecture. In Finding Source Code on the Web for Remix and Reuse.","DOI":"10.1007\/978-1-4614-6596-6_6"},{"key":"e_1_3_2_1_29_1","volume-title":"Maximal Divergence Sequential Autoencoder for Binary Software Vulnerability Detection. 
In 7th International Conference on Learning Representations, ICLR 2019","author":"Le Tue","year":"2019","unstructured":"Tue Le, Tuan Nguyen, Trung Le, Dinh Q. Phung, Paul Montague, Olivier Y. de Vel, and Lizhen Qu. 2019. Maximal Divergence Sequential Autoencoder for Binary Software Vulnerability Detection. In 7th International Conference on Learning Representations, ICLR 2019, New Orleans, LA, USA, May 6--9, 2019. OpenReview.net. https:\/\/openreview.net\/forum?id=ByloIiCqYQ"},{"key":"e_1_3_2_1_30_1","volume-title":"Maximal Divergence Sequential Autoencoder for Binary Software Vulnerability Detection. In 7th International Conference on Learning Representations, ICLR 2019","author":"Le Tue","year":"2019","unstructured":"Tue Le, Tuan Nguyen, Trung Le, Dinh Q. Phung, Paul Montague, Olivier Y. de Vel, and Lizhen Qu. 2019. Maximal Divergence Sequential Autoencoder for Binary Software Vulnerability Detection. In 7th International Conference on Learning Representations, ICLR 2019, New Orleans, LA, USA, May 6--9, 2019. OpenReview.net. 
https:\/\/openreview.net\/forum?id=ByloIiCqYQ"},{"key":"e_1_3_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.1145\/1321631.1321726"},{"key":"e_1_3_2_1_32_1","unstructured":"Hongyu Li Seohyun Kim and Satish Chandra. 2019. Neural Code Search Evaluation Dataset. arXiv:1908.09804 [cs.SE]"},{"key":"e_1_3_2_1_33_1","volume-title":"Neural Code Search Evaluation Dataset. CoRR abs\/1908.09804","author":"Li Hongyu","year":"2019","unstructured":"Hongyu Li, Seohyun Kim, and Satish Chandra. 2019. Neural Code Search Evaluation Dataset. CoRR abs\/1908.09804 (2019)."},{"key":"e_1_3_2_1_34_1","doi-asserted-by":"publisher","DOI":"10.24963\/ijcai.2018\/578"},{"key":"e_1_3_2_1_35_1","doi-asserted-by":"publisher","DOI":"10.1145\/1978542.1978566"},{"key":"e_1_3_2_1_36_1","doi-asserted-by":"publisher","DOI":"10.1145\/1772690.1772759"},{"key":"e_1_3_2_1_37_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICSME46990.2020.00021"},{"key":"e_1_3_2_1_38_1","unstructured":"Chao Liu Xin Xia David Lo Cuiyun Gao Xiaohu Yang and John Grundy. 2020. Opportunities and Challenges in Code Search Tools. arXiv:2011.02297 [cs.SE]"},{"key":"e_1_3_2_1_39_1","doi-asserted-by":"publisher","DOI":"10.1145\/3360578"},{"key":"e_1_3_2_1_40_1","doi-asserted-by":"publisher","DOI":"10.1186\/s13059-016-0997-x"},{"key":"e_1_3_2_1_41_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICSE.2007.37"},{"key":"e_1_3_2_1_42_1","unstructured":"S. Petrovic. 2013. 
Real-time event detection in massive streams."},{"key":"e_1_3_2_1_43_1","doi-asserted-by":"publisher","DOI":"10.1145\/3276517"},{"key":"e_1_3_2_1_44_1","unstructured":"Ruchir Puri David Kung Geert Janssen Wei Zhang Giacomo Domeniconi Vladmir Zolotov Julian Dolby Jie Chen Mihir Choudhury Lindsey Decker Veronika Thost Luca Buratti Saurabh Pujar and Ulrich Finkler. 2021. Project CodeNet: A Large-Scale AI for Code Dataset for Learning a Diversity of Coding Tasks."},{"key":"e_1_3_2_1_45_1","volume-title":"Mining of massive datasets","author":"Rajaraman Anand","unstructured":"Anand Rajaraman and Jeffrey David Ullman. 2012. Mining of massive datasets. Cambridge University Press, Cambridge. http:\/\/www.amazon.de\/Mining-Massive-Datasets-Anand-Rajaraman\/dp\/1107015359\/ref=sr_1_1?ie=UTF8&qid=1350890245&sr=8-1"},{"key":"e_1_3_2_1_46_1","unstructured":"Huzefa Rangwala and Zeehasham Rasheed. 2013. MC-MinH: Metagenome Clustering using Minwise based Hashing. 
In SDM."},{"key":"e_1_3_2_1_47_1","doi-asserted-by":"publisher","DOI":"10.14778\/3236187.3236214"},{"key":"e_1_3_2_1_48_1","volume-title":"Proceedings of the 2018 26th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering (Lake Buena Vista, FL, USA) (ESEC\/FSE","author":"Saini Vaibhav","year":"2018","unstructured":"Vaibhav Saini, Farima Farmahinifarahani, Yadong Lu, Pierre Baldi, and Cristina V. Lopes. 2018. Oreo: Detection of Clones in the Twilight Zone. In Proceedings of the 2018 26th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering (Lake Buena Vista, FL, USA) (ESEC\/FSE 2018). Association for Computing Machinery, New York, NY, USA, 354--365."},{"key":"e_1_3_2_1_49_1","doi-asserted-by":"publisher","DOI":"10.1145\/2884781.2884877"},{"key":"e_1_3_2_1_50_1","doi-asserted-by":"publisher","DOI":"10.1145\/2736277.2741285"},{"key":"e_1_3_2_1_51_1","volume-title":"Shao Kun Deng, and Neel Sundaresan","author":"Tufano Michele","year":"2020","unstructured":"Michele Tufano, Dawn Drain, Alexey Svyatkovskiy, Shao Kun Deng, and Neel Sundaresan. 2020. Unit Test Case Generation with Transformers. 
arXiv:2009.05617 [cs.SE]"},{"key":"e_1_3_2_1_52_1","doi-asserted-by":"publisher","DOI":"10.1109\/APSEC.2002.1183002"},{"key":"e_1_3_2_1_53_1","volume-title":"2016 31st IEEE\/ACM International Conference on Automated Software Engineering (ASE). 87--98","author":"White M.","unstructured":"M. White, M. Tufano, C. Vendome, and D. Poshyvanyk. 2016. Deep learning code fragments for code clone detection. In 2016 31st IEEE\/ACM International Conference on Automated Software Engineering (ASE). 87--98."},{"key":"e_1_3_2_1_54_1","doi-asserted-by":"publisher","DOI":"10.1109\/SANER48275.2020.9054840"},{"key":"e_1_3_2_1_55_1","doi-asserted-by":"publisher","DOI":"10.1145\/3234944.3234968"},{"key":"e_1_3_2_1_56_1","doi-asserted-by":"publisher","DOI":"10.1145\/3106237.3117771"},{"key":"e_1_3_2_1_57_1","first-page":"12","article-title":"LSH Ensemble","volume":"9","author":"Zhu Erkang","year":"2016","unstructured":"Erkang Zhu, Fatemeh Nargesian, Ken Q. Pu, and Ren\u00e9e J. Miller. 2016. LSH Ensemble: Internet-Scale Domain Search. Proc. VLDB Endow. 9, 12 (Aug. 2016), 1185--1196.","journal-title":"Internet-Scale Domain Search. Proc. VLDB Endow."},{"key":"e_1_3_2_1_58_1","doi-asserted-by":"publisher","DOI":"10.1145\/290941.291014"},{"key":"e_1_3_2_1_59_1","volume-title":"International Conference on Learning Representations (ICLR).","author":"Z\u00fcgner Daniel","year":"2021","unstructured":"Daniel Z\u00fcgner, Tobias Kirschstein, Michele Catasta, Jure Leskovec, and Stephan G\u00fcnnemann. 2021. Language-Agnostic Representation Learning of Source Code from Structure and Context. 
In International Conference on Learning Representations (ICLR)."}],"event":{"name":"MSR '22: 19th International Conference on Mining Software Repositories","location":"Pittsburgh Pennsylvania","acronym":"MSR '22","sponsor":["SIGSOFT ACM Special Interest Group on Software Engineering","IEEE CS"]},"container-title":["Proceedings of the 19th International Conference on Mining Software Repositories"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3524842.3527947","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3524842.3527947","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T18:09:54Z","timestamp":1750183794000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3524842.3527947"}},"subtitle":["a fast and accurate code-to-code recommendation engine"],"short-title":[],"issued":{"date-parts":[[2022,5,23]]},"references-count":59,"alternative-id":["10.1145\/3524842.3527947","10.1145\/3524842"],"URL":"https:\/\/doi.org\/10.1145\/3524842.3527947","relation":{},"subject":[],"published":{"date-parts":[[2022,5,23]]},"assertion":[{"value":"2022-10-17","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}