{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,9]],"date-time":"2026-01-09T00:46:30Z","timestamp":1767919590574,"version":"3.49.0"},"reference-count":55,"publisher":"Association for Computing Machinery (ACM)","issue":"1","license":[{"start":{"date-parts":[[2021,6,2]],"date-time":"2021-06-02T00:00:00Z","timestamp":1622592000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["SIGOPS Oper. Syst. Rev."],"published-print":{"date-parts":[[2021,6,2]]},"abstract":"<jats:p>Graph mining workloads aim to extract structural properties of a graph by exploring its subgraph structures. PEREGRINE is a general-purpose graph mining system that provides a generic runtime to efficiently explore subgraph structures of interest and perform various graph mining analyses. It takes a 'pattern-aware' approach by incorporating a pattern-based programming model along with efficient pattern matching strategies. The programming model enables easier expression of complex graph mining use cases and enables PEREGRINE to extract the semantics of patterns. By analyzing the patterns, PEREGRINE generates efficient exploration plans which it uses to guide its subgraph exploration.<\/jats:p>\n          <jats:p>In this paper, we present an in-depth view of the patternanalysis techniques powering the matching engine of PEREGRINE. Beyond the theoretical foundations from prior research, we expose opportunities based on how the exploration plans are evaluated, and develop key techniques for computation reuse, enumeration depth reduction, and branch elimination. Our experiments show the importance of patternawareness for scalable and performant graph mining where the presented new techniques speed up the performance by up to two orders of magnitude on top of the benefits achieved from the prior theoretical foundations that generate the initial exploration plans.<\/jats:p>","DOI":"10.1145\/3469379.3469381","type":"journal-article","created":{"date-parts":[[2021,6,6]],"date-time":"2021-06-06T12:43:56Z","timestamp":1622983436000},"page":"1-10","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":11,"title":["A Deeper Dive into Pattern-Aware Subgraph Exploration with PEREGRINE"],"prefix":"10.1145","volume":"55","author":[{"given":"Kasra","family":"Jamshidi","sequence":"first","affiliation":[{"name":"Simon Fraser University, BC, Canada"}]},{"given":"Keval","family":"Vora","sequence":"additional","affiliation":[{"name":"Simon Fraser University, BC, Canada"}]}],"member":"320","published-online":{"date-parts":[[2021,6,6]]},"reference":[{"key":"e_1_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.1109\/SC.2016.60"},{"key":"e_1_2_1_2_1","first-page":"1","volume-title":"Nick Duffield. Efficient Graphlet Counting for Large Networks. In IEEE International Conference on Data Mining (ICDM '15)","author":"Ahmed Nesreen K.","year":"2015"},{"key":"e_1_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1086\/386272"},{"key":"e_1_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1145\/2723372.2746478"},{"key":"e_1_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1145\/2882903.2915236"},{"key":"e_1_2_1_6_1","first-page":"858","volume-title":"What Is Frequent in a Single Graph? In Advances in Knowledge Discovery and Data Mining: 12th Pacific-Asia Confer- ence","author":"Bringmann Bjorn","year":"2008"},{"key":"e_1_2_1_7_1","first-page":"1","volume-title":"James Cheng. G-Miner: An Efficient Taskoriented Graph Mining System. In Proceedings of the European Conference on Computer Systems (EuroSys '18)","author":"Chen Hongzhi","year":"2018"},{"key":"e_1_2_1_8_1","first-page":"229","volume-title":"Quality of Service, 2008. IWQoS 2008. 16th International Workshop on","author":"Cheng Xu","year":"2008"},{"key":"e_1_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1145\/2324796.2324831"},{"key":"e_1_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1145\/3178876.3186125"},{"key":"e_1_2_1_11_1","first-page":"918","volume-title":"Julian Shun. Low-Latency Graph Streaming Using Compressed Purely-Functional Trees. In Proceedings of the ACM SIGPLAN Conference on Programming Language De- sign and Implementation (PLDI '19)","author":"Dhulipala Laxman","year":"2019"},{"key":"e_1_2_1_12_1","first-page":"1357","volume-title":"Srinivasan Parthasarathy. Fractal: A General-Purpose Graph Pattern Mining System. In Proceedings of the ACM International Conference on Management of Data (SIGMOD '19)","author":"Dias Vinicius","year":"2019"},{"key":"e_1_2_1_13_1","first-page":"517","volume-title":"Proceedings of the VLDB Endowment (PVLDB '14)","author":"Elseidy Mohammed","year":"2014"},{"key":"e_1_2_1_14_1","first-page":"17","volume-title":"Pro- ceedings of the USENIX Conference on Operating Sys- tems Design and Implementation (OSDI '12)","author":"Gonzalez Joseph E.","year":"2012"},{"key":"e_1_2_1_15_1","first-page":"599","volume-title":"Proceedings of the USENIX Confer- ence on Operating Systems Design and Implementation (OSDI '14)","author":"Gonzalez Joseph E.","year":"2014"},{"key":"e_1_2_1_16_1","first-page":"92","volume-title":"Network Motif Discovery Using Subgraph Enumeration and Symmetry- Breaking. In Research in Computational Molecular Biology","author":"Joshua","year":"2007"},{"key":"e_1_2_1_17_1","first-page":"8498","article-title":"The NBER Patent Citation Data File: Lessons","author":"Hall Bronwyn","year":"2001","journal-title":"Insights and Methodological Tools. NBER Working Paper"},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1145\/3299869.3319880"},{"key":"e_1_2_1_19_1","first-page":"337","volume-title":"Jeong-Hoon Lee. TurboISO: Towards Ultrafast and Robust Subgraph Isomorphism Search in Large Graph Databases. In Proceedings of the ACM International Conference on Management of Data (SIGMOD '13)","author":"Han Wook-Shin","year":"2013"},{"key":"e_1_2_1_20_1","first-page":"1","volume-title":"DistTC: High Performance Distributed Triangle Counting","author":"Hoang Loc","year":"2019"},{"key":"e_1_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1145\/2807591.2807620"},{"key":"e_1_2_1_22_1","first-page":"745","volume-title":"Proceedings of the USENIX Symposium on Operating Systems Design and Implementation (OSDI '18)","author":"Iyer Anand Padmanabha","year":"2018"},{"key":"e_1_2_1_23_1","first-page":"1","volume-title":"Keval Vora. Peregrine: A Pattern-Aware Graph Mining System. In Proceedings of the Fifteenth European Conference on Computer Systems","author":"Jamshidi Kasra","year":"2020"},{"key":"e_1_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1145\/2882903.2915209"},{"key":"e_1_2_1_25_1","first-page":"637","volume-title":"Hwanjo Yu. OPT: A New Framework for Overlapped and Parallel Triangulation in Large-scale Graphs. In Proceedings of the ACM International Con- ference on Management of Data (SIGMOD '14)","author":"Kim Jinha","year":"2014"},{"key":"e_1_2_1_26_1","unstructured":"Kyoungmin Kim In Seo Wook-Shin Han Jeong-Hoon Lee Sungpack Hong Hassan Chafi Hyungyu Shin and Geonhwa Jeong. TurboFlux: A Fast Continuous Subgraph Matching System for Streaming Graph Data. In Proceedings of the ACM International Conference on Management of Data (SIGMOD '18) pages 411--426 2018.  Kyoungmin Kim In Seo Wook-Shin Han Jeong-Hoon Lee Sungpack Hong Hassan Chafi Hyungyu Shin and Geonhwa Jeong. TurboFlux: A Fast Continuous Subgraph Matching System for Streaming Graph Data. In Proceedings of the ACM International Conference on Management of Data (SIGMOD '18) pages 411--426 2018."},{"key":"e_1_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.1002\/jcc.540050105"},{"key":"e_1_2_1_28_1","first-page":"249","volume-title":"Kumar and H Howie Huang. GraphOne: A Data Store for Real-Time Analytics on Evolving Graphs. In Proceedings of the USENIX Conference on File and Storage Technologies (FAST '19)","author":"Pradeep","year":"2019"},{"key":"e_1_2_1_29_1","first-page":"217","volume-title":"Shiyu Yang. Scalable Distributed Subgraph Enumeration. In Proceedings of the VLDB Endowment (PVLDB '16)","author":"Lai Longbin","year":"2016"},{"key":"e_1_2_1_30_1","first-page":"135","volume-title":"Google Inc. Pregel: A System for Large-Scale Graph Processing. In Proceedings of the ACM International Conference on Management of Data (SIGMOD '10)","author":"Malewicz Grzegorz","year":"2010"},{"key":"e_1_2_1_31_1","first-page":"83","volume-title":"Keval Vora. DZiG: Sparsity-Aware Incremental Processing of Streaming Graphs. In Proceedings of the European Conference on Computer Systems (EuroSys '21)","author":"Mariappan Mugilan","year":"2021"},{"key":"e_1_2_1_32_1","first-page":"1","volume-title":"Mariappan and Keval Vora. GraphBolt: Dependency-Driven Synchronous Processing of Streaming Graphs. In Proceedings of the European Conference on Computer Systems (EuroSys '19)","author":"Mugilan","year":"2019"},{"key":"e_1_2_1_33_1","first-page":"509","volume-title":"Mawhirter and Bo Wu. AutoMine: Harmonizing High-level Abstraction and High Performance for Graph Mining. In Proceedings of the ACM Symposium on Operating Systems Principles (SOSP '19)","author":"Daniel","year":"2019"},{"key":"e_1_2_1_34_1","first-page":"533","volume-title":"Chao Ai. ApproxG: Fast Approximate Parallel Graphlet Counting Through Accuracy Control. In IEEE\/ACM International Symposium on Cluster, Cloud and Grid Computing (CC- GRID '18)","author":"Mawhirter Daniel","year":"2018"},{"key":"e_1_2_1_35_1","doi-asserted-by":"publisher","DOI":"10.1080\/10511251003693694"},{"key":"e_1_2_1_36_1","doi-asserted-by":"publisher","DOI":"10.14778\/3342263.3342643"},{"key":"e_1_2_1_37_1","doi-asserted-by":"publisher","DOI":"10.1126\/science.298.5594.824"},{"key":"e_1_2_1_38_1","first-page":"456","volume-title":"Keshav Pingali. A Lightweight Infrastructure for Graph Analytics. In Proceedings of the ACM Symposium on Operating Sys- tems Principles (SOSP '13)","author":"Nguyen Donald","year":"2013"},{"key":"e_1_2_1_39_1","doi-asserted-by":"publisher","DOI":"10.1109\/SC.2018.00024"},{"key":"e_1_2_1_40_1","doi-asserted-by":"publisher","DOI":"10.1145\/3078447.3078454"},{"key":"e_1_2_1_41_1","doi-asserted-by":"publisher","DOI":"10.1145\/2815400.2815408"},{"key":"e_1_2_1_42_1","volume-title":"Salihoglu and Jennifer Widom. GPS: A Graph Processing System. In Proceedings of the International Conference on Scientific and Statistical Database Man- agement (SSDBM '13)","author":"Semih","year":"2013"},{"key":"e_1_2_1_43_1","first-page":"214","volume-title":"Proceedings of the Sympo- sium on Cloud Computing (SoCC '17)","author":"Serafini Marco","year":"2017"},{"key":"e_1_2_1_44_1","doi-asserted-by":"publisher","DOI":"10.1145\/2442516.2442530"},{"key":"e_1_2_1_45_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDE.2015.7113280"},{"key":"e_1_2_1_46_1","first-page":"40","volume-title":"Mohammad Hossein Namaki, and Yinghui Wu. Answering Why-Questions for Subgraph Queries in Multi-attributed Graphs","author":"Song Qi","year":"2019"},{"key":"e_1_2_1_47_1","doi-asserted-by":"publisher","DOI":"10.1007\/s10618-016-0466-x"},{"key":"e_1_2_1_48_1","first-page":"425","volume-title":"Ashraf Aboulnaga. Arabesque: A System for Distributed Graph Mining. In Proceedings of the ACM Symposium on Op- erating Systems Principles (SOSP '15)","author":"Teixeira Carlos H. C.","year":"2015"},{"key":"e_1_2_1_49_1","first-page":"237","volume-title":"Proceedings of the International Conference on Architectural Support for Programming Languages and Operating Systems (ASP- LOS '17)","author":"Vora Keval","year":"2017"},{"key":"e_1_2_1_50_1","first-page":"861","volume-title":"Proceedings of SIGPLAN International Conference on Object Oriented Programming Systems Languages and Applications (OOPSLA '14)","author":"Vora Keval","year":"2014"},{"key":"e_1_2_1_51_1","doi-asserted-by":"publisher","DOI":"10.1145\/3037697.3037747"},{"key":"e_1_2_1_52_1","first-page":"763","volume-title":"Proceedings of the USENIX Conference on Operating Systems Design and Implementation (OSDI '18)","author":"Wang Kai","year":"2018"},{"key":"e_1_2_1_53_1","first-page":"637","volume-title":"Proceedings of the ACM International Conference on Web Search and Data Mining (WSDM '18)","author":"Huan Liu LiangWu","year":"2018"},{"key":"e_1_2_1_54_1","first-page":"1317","volume-title":"Proceedings of the ACM International Conference on Management of Data (SIGMOD '18)","author":"Zhang Gensheng","year":"2018"},{"key":"e_1_2_1_55_1","first-page":"301","volume-title":"Xiaosong Ma. Gemini: A Computation-Centric Distributed Graph Processing System. In Proceedings of the USENIX Symposium on Operating Systems Design and Implementation (OSDI '16)","author":"Zhu Xiaowei","year":"2016"}],"container-title":["ACM SIGOPS Operating Systems Review"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3469379.3469381","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3469379.3469381","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T21:28:23Z","timestamp":1750195703000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3469379.3469381"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,6,2]]},"references-count":55,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2021,6,2]]}},"alternative-id":["10.1145\/3469379.3469381"],"URL":"https:\/\/doi.org\/10.1145\/3469379.3469381","relation":{},"ISSN":["0163-5980"],"issn-type":[{"value":"0163-5980","type":"print"}],"subject":[],"published":{"date-parts":[[2021,6,2]]},"assertion":[{"value":"2021-06-06","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}