{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,6]],"date-time":"2026-03-06T10:04:54Z","timestamp":1772791494586,"version":"3.50.1"},"reference-count":55,"publisher":"Association for Computing Machinery (ACM)","issue":"1","license":[{"start":{"date-parts":[[2021,3,9]],"date-time":"2021-03-09T00:00:00Z","timestamp":1615248000000},"content-version":"vor","delay-in-days":365,"URL":"http:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"DOI":"10.13039\/100000185","name":"Defense Advanced Research Projects Agency","doi-asserted-by":"crossref","award":["FA8750-17-C-0086"],"award-info":[{"award-number":["FA8750-17-C-0086"]}],"id":[{"id":"10.13039\/100000185","id-type":"DOI","asserted-by":"crossref"}]},{"DOI":"10.13039\/100000001","name":"National Science Foundation","doi-asserted-by":"publisher","award":["CNS-1643351"],"award-info":[{"award-number":["CNS-1643351"]}],"id":[{"id":"10.13039\/100000001","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Parallel Comput."],"published-print":{"date-parts":[[2020,3,31]]},"abstract":"<jats:p>The past decade has seen the development of many shared-memory graph processing frameworks intended to reduce the effort of developing high-performance parallel applications. However, many of these frameworks, based on Vertex-centric or Edge-centric paradigms suffer from several issues, such as poor cache utilization, irregular memory accesses, heavy use of synchronization primitives, or theoretical inefficiency, that deteriorate over-all performance and scalability.<\/jats:p>\n          <jats:p>Recently, we proposed a cache and memory-efficient partition-centric paradigm for computing PageRank [26]. In this article, we generalize this approach to develop a novel Graph Processing Over Parts (GPOP) framework that is cache efficient, scalable, and work efficient. GPOP induces locality in memory accesses by increasing granularity of execution to vertex subsets called \u201cparts,\u201d thereby dramatically improving the cache performance of a variety of graph algorithms. It achieves high scalability by enabling completely lock and atomic free computation. GPOP\u2019s built-in analytical performance model enables it to use a hybrid of source and part-centric communication modes in a way that ensures work efficiency each iteration, while simultaneously boosting high bandwidth sequential memory accesses. Finally, the GPOP framework is designed with programmability in mind. It completely abstracts away underlying parallelism and programming model details from the user and provides an easy to program set of APIs with the ability to selectively continue the active vertex set across iterations. Such functionality is useful for many graph algorithms but not intrinsically supported by the current frameworks.<\/jats:p>\n          <jats:p>We extensively evaluate the performance of GPOP for a variety of graph algorithms, using several large datasets. We observe that GPOP incurs up to 9\u00d7, 6.8\u00d7, and 5.5\u00d7 less L2 cache misses compared to Ligra, GraphMat, and Galois, respectively. In terms of execution time, GPOP is up to 19\u00d7, 9.3\u00d7, and 3.6\u00d7 faster than Ligra, GraphMat, and Galois, respectively.<\/jats:p>","DOI":"10.1145\/3380942","type":"journal-article","created":{"date-parts":[[2020,3,9]],"date-time":"2020-03-09T08:20:30Z","timestamp":1583742030000},"page":"1-24","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":33,"title":["GPOP"],"prefix":"10.1145","volume":"7","author":[{"given":"Kartik","family":"Lakhotia","sequence":"first","affiliation":[{"name":"University of Southern California, Los Angeles"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Rajgopal","family":"Kannan","sequence":"additional","affiliation":[{"name":"US Army Research Lab, Los Angeles, CA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Sourav","family":"Pati","sequence":"additional","affiliation":[{"name":"University of Southern California, Los Angeles"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Viktor","family":"Prasanna","sequence":"additional","affiliation":[{"name":"University of Southern California, Los Angeles"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2020,3,9]]},"reference":[{"key":"e_1_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.5555\/1898953.1899055"},{"key":"e_1_2_1_2_1","volume-title":"Joseph James Gebis, Parry Husbands, Kurt Keutzer, David A. Patterson, William Lester Plishker, John Shalf, Samuel Webb Williams, et al.","author":"Asanovic Krste","year":"2006","unstructured":"Krste Asanovic, Ras Bodik, Bryan Christopher Catanzaro, Joseph James Gebis, Parry Husbands, Kurt Keutzer, David A. Patterson, William Lester Plishker, John Shalf, Samuel Webb Williams, et al. 2006. The Landscape of Parallel Computing Research: A View from Berkeley. Technical Report. Technical Report UCB\/EECS-2006-183, EECS Department, University of California, Berkeley."},{"key":"e_1_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1109\/IPDPS.2017.76"},{"key":"e_1_2_1_4_1","first-page":"3","article-title":"Direction-optimizing breadth-first search. Sci","volume":"21","author":"Beamer Scott","year":"2013","unstructured":"Scott Beamer, Krste Asanovi\u0107, and David Patterson. 2013. Direction-optimizing breadth-first search. Sci. Program. 21, 3--4 (2013), 137--148.","journal-title":"Program."},{"key":"e_1_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1109\/IPDPS.2017.112"},{"key":"e_1_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1145\/3078597.3078616"},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1145\/1963405.1963488"},{"key":"e_1_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1145\/1480506.1480511"},{"key":"e_1_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1145\/988672.988752"},{"key":"e_1_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1145\/2623330.2623660"},{"key":"e_1_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1177\/1094342011403516"},{"key":"e_1_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.1145\/2925426.2926278"},{"key":"e_1_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.1137\/1.9781611972740.43"},{"key":"e_1_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.14778\/2824032.2824077"},{"key":"e_1_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1145\/3192366.3192404"},{"key":"e_1_2_1_16_1","volume-title":"Proceedings of the 2014 IEEE 28th International Parallel and Distributed Processing Symposium. IEEE, 349--359","author":"Davidson Andrew","unstructured":"Andrew Davidson, Sean Baxter, Michael Garland, and John D. Owens. 2014. Work-efficient parallel GPU methods for single-source shortest paths. In Proceedings of the 2014 IEEE 28th International Parallel and Distributed Processing Symposium. IEEE, 349--359."},{"key":"e_1_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1145\/3087556.3087580"},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1145\/2398776.2398792"},{"key":"e_1_2_1_19_1","volume-title":"Presented as Part of the 10th USENIX Symposium on Operating Systems Design and Implementation (OSDI\u201912). USENIX, 17--30.","author":"Gonzalez Joseph E.","unstructured":"Joseph E. Gonzalez, Yucheng Low, Haijie Gu, Danny Bickson, and Carlos Guestrin. 2012. PowerGraph: Distributed graph-parallel computation on natural graphs. In Presented as Part of the 10th USENIX Symposium on Operating Systems Design and Implementation (OSDI\u201912). USENIX, 17--30."},{"key":"e_1_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.1145\/3178487.3178506"},{"key":"e_1_2_1_21_1","volume-title":"Proceedings of the 30th on Symposium on Parallelism in Algorithms and Architectures. ACM, 415--426","author":"Gruevski Predrag","unstructured":"Predrag Gruevski, William Hasenplaugh, David Lugato, and James J. Thomas. 2018. Laika: Efficient in-place scheduling for 3d mesh graph computations. In Proceedings of the 30th on Symposium on Parallelism in Algorithms and Architectures. ACM, 415--426."},{"key":"e_1_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.14778\/2777598.2777604"},{"key":"e_1_2_1_23_1","doi-asserted-by":"crossref","unstructured":"Sungpack Hong Hassan Chafi Edic Sedlar and Kunle Olukotun. 2012. Green-Marl: A DSL for easy and efficient graph analysis. In ACM SIGARCH Computer Architecture News. ACM 349--362.","DOI":"10.1145\/2150976.2151013"},{"key":"e_1_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1145\/2487788.2488173"},{"key":"e_1_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.1145\/1772690.1772751"},{"key":"e_1_2_1_26_1","volume-title":"Proceedings of the 2018 USENIX Annual Technical Conference (USENIX ATC'18). USENIX Association.","author":"Lakhotia Kartik","year":"2018","unstructured":"Kartik Lakhotia, Rajgopal Kannan, and Viktor Prasanna. 2018. Accelerating PageRank using partition-centric processing. In Proceedings of the 2018 USENIX Annual Technical Conference (USENIX ATC'18). USENIX Association."},{"key":"e_1_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.1145\/3084451"},{"key":"e_1_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.14778\/2212351.2212354"},{"key":"e_1_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.1142\/S0129626407002843"},{"key":"e_1_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.1145\/1807167.1807184"},{"key":"e_1_2_1_31_1","volume-title":"Proceedings of the 2017 USENIX Annual Technical Conference (USENIX ATC'17). USENIX Association, 631--643","author":"Malicevic Jasmina","year":"2017","unstructured":"Jasmina Malicevic, Baptiste Lepers, and Willy Zwaenepoel. 2017. Everything you always wanted to know about multicore graph processing but were afraid to ask. In Proceedings of the 2017 USENIX Annual Technical Conference (USENIX ATC'17). USENIX Association, 631--643."},{"key":"e_1_2_1_32_1","volume-title":"PartitionedVC: Partitioned external memory graph analytics framework for SSDs. arXiv preprint arXiv:1905.04264","author":"Matam Kiran Kumar","year":"2019","unstructured":"Kiran Kumar Matam, Hanieh Hashemi, and Murali Annavaram. 2019. PartitionedVC: Partitioned external memory graph analytics framework for SSDs. arXiv preprint arXiv:1905.04264 (2019)."},{"key":"e_1_2_1_33_1","unstructured":"John D. McCalpin. 1995. STREAM benchmark. Retrieved from www.cs.virginia.edu\/stream\/ref.22."},{"key":"e_1_2_1_34_1","volume-title":"Proceedings of the 15th USENIX Conference on Hot Topics in Operating Systems (HOTOS\u201915)","author":"McSherry Frank","unstructured":"Frank McSherry, Michael Isard, and Derek G. Murray. 2015. Scalability! But at what cost? In Proceedings of the 15th USENIX Conference on Hot Topics in Operating Systems (HOTOS\u201915). USENIX Association, 14--14. http:\/\/dl.acm.org\/citation.cfm?id=2831090.2831104."},{"key":"e_1_2_1_35_1","doi-asserted-by":"publisher","DOI":"10.1561\/106.00000003"},{"key":"e_1_2_1_36_1","first-page":"45","article-title":"Introducing the graph 500","volume":"19","author":"Murphy Richard C.","year":"2010","unstructured":"Richard C. Murphy, Kyle B. Wheeler, Brian W. Barrett, and James A. Ang. 2010. Introducing the graph 500. Cray Users Group 19 (2010), 45--74.","journal-title":"Cray Users Group"},{"key":"e_1_2_1_37_1","doi-asserted-by":"publisher","DOI":"10.1145\/2517349.2522739"},{"key":"e_1_2_1_38_1","volume-title":"ACM Sigplan Notices","volume":"46","author":"Pingali Keshav","unstructured":"Keshav Pingali, Donald Nguyen, Milind Kulkarni, Martin Burtscher, M. Amber Hassaan, Rashid Kaleem, Tsung-Hsien Lee, Andrew Lenharth, Roman Manevich, Mario M\u00e9ndez-Lojo, et al. 2011. The tao of parallelism in algorithms. In ACM Sigplan Notices, Vol. 46. ACM, 12--25."},{"key":"e_1_2_1_39_1","doi-asserted-by":"publisher","DOI":"10.1145\/2517349.2522740"},{"key":"e_1_2_1_40_1","volume-title":"Blelloch","author":"Shun Julian","year":"2013","unstructured":"Julian Shun and Guy E. Blelloch. 2013. Ligra: A lightweight graph processing framework for shared memory. In ACM Sigplan Notices, Vol. 48. ACM, 135--146."},{"key":"e_1_2_1_41_1","doi-asserted-by":"publisher","DOI":"10.14778\/2994509.2994522"},{"key":"e_1_2_1_42_1","volume-title":"The Boost Graph Library: User Guide and Reference Manual","author":"Siek Jeremy","unstructured":"Jeremy Siek, Andrew Lumsdaine, and Lie-Quan Lee. 2002. The Boost Graph Library: User Guide and Reference Manual. Addison-Wesley."},{"key":"e_1_2_1_43_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-09873-9_38"},{"key":"e_1_2_1_44_1","doi-asserted-by":"publisher","DOI":"10.1109\/IPDPS.2016.93"},{"key":"e_1_2_1_45_1","doi-asserted-by":"publisher","DOI":"10.1137\/080744888"},{"key":"e_1_2_1_46_1","doi-asserted-by":"publisher","DOI":"10.14778\/2809974.2809983"},{"key":"e_1_2_1_47_1","doi-asserted-by":"publisher","DOI":"10.14778\/2732232.2732238"},{"key":"e_1_2_1_48_1","volume-title":"Yelick","author":"Vuduc Richard","year":"2005","unstructured":"Richard Vuduc, James W. Demmel, and Katherine A. Yelick. 2005. OSKI: A library of automatically tuned sparse matrix kernels. In Journal of Physics: Conference Series. IOP Publishing, 521."},{"key":"e_1_2_1_49_1","doi-asserted-by":"publisher","DOI":"10.1145\/2882903.2915220"},{"key":"e_1_2_1_50_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-662-48096-0_34"},{"key":"e_1_2_1_51_1","doi-asserted-by":"publisher","DOI":"10.14778\/2733085.2733103"},{"key":"e_1_2_1_52_1","doi-asserted-by":"publisher","DOI":"10.1145\/2858788.2688507"},{"key":"e_1_2_1_53_1","volume-title":"GraphIt\u2014A high-performance DSL for graph analytics. arXiv preprint arXiv:1805.00923","author":"Zhang Yunming","year":"2018","unstructured":"Yunming Zhang, Mengjiao Yang, Riyadh Baghdadi, Shoaib Kamil, Julian Shun, and Saman Amarasinghe. 2018. GraphIt\u2014A high-performance DSL for graph analytics. arXiv preprint arXiv:1805.00923 (2018)."},{"key":"e_1_2_1_54_1","volume-title":"Proceedings of the 2017 IEEE High Performance Extreme Computing Conference (HPEC\u201917)","author":"Zhou Shijie","unstructured":"Shijie Zhou, Kartik Lakhotia, Shreyas G. Singapura, Hanqing Zeng, Rajgopal Kannan, Viktor K. Prasanna, James Fox, Euna Kim, Oded Green, and David A. Bader. 2017. Design and implementation of parallel PageRank on multicore platforms. In Proceedings of the 2017 IEEE High Performance Extreme Computing Conference (HPEC\u201917). IEEE, 1--6."},{"key":"e_1_2_1_55_1","volume-title":"Proceedings of the 2015 USENIX Annual Technical Conference (USENIX ATC'15). USENIX Association, 375--386","author":"Zhu Xiaowei","year":"2015","unstructured":"Xiaowei Zhu, Wentao Han, and Wenguang Chen. 2015. GridGraph: Large-scale graph processing on a single machine using 2-level hierarchical partitioning. In Proceedings of the 2015 USENIX Annual Technical Conference (USENIX ATC'15). USENIX Association, 375--386."}],"container-title":["ACM Transactions on Parallel Computing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3380942","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3380942","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3380942","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,8,22]],"date-time":"2025-08-22T17:08:11Z","timestamp":1755882491000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3380942"}},"subtitle":["A Scalable Cache- and Memory-efficient Framework for Graph Processing over Parts"],"short-title":[],"issued":{"date-parts":[[2020,3,9]]},"references-count":55,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2020,3,31]]}},"alternative-id":["10.1145\/3380942"],"URL":"https:\/\/doi.org\/10.1145\/3380942","relation":{},"ISSN":["2329-4949","2329-4957"],"issn-type":[{"value":"2329-4949","type":"print"},{"value":"2329-4957","type":"electronic"}],"subject":[],"published":{"date-parts":[[2020,3,9]]},"assertion":[{"value":"2018-11-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2020-01-01","order":2,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2020-03-09","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}