{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,31]],"date-time":"2026-03-31T19:00:34Z","timestamp":1774983634801,"version":"3.50.1"},"reference-count":77,"publisher":"Association for Computing Machinery (ACM)","issue":"1","license":[{"start":{"date-parts":[[2025,2,10]],"date-time":"2025-02-10T00:00:00Z","timestamp":1739145600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"DOI":"10.13039\/501100006374","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["No. 62272353 and No. 62276193"],"award-info":[{"award-number":["No. 62272353 and No. 62276193"]}],"id":[{"id":"10.13039\/501100006374","id-type":"DOI","asserted-by":"publisher"}]},{"name":"Research Grants Council of Hong Kong","award":["No. 14205520"],"award-info":[{"award-number":["No. 14205520"]}]},{"name":"Huawei Cloud Database Innovation Lab"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Proc. ACM Manag. Data"],"published-print":{"date-parts":[[2025,2,10]]},"abstract":"<jats:p>Graph is ubiquitous in various real-world applications, and many graph processing systems have been developed. Recently, hardware accelerators have been exploited to speed up graph systems. However, such hardware-specific systems are hard to migrate across different hardware backends. In this paper, we propose the first tensor-based graph processing framework, Tgraph, which can be smoothly deployed and run on any powerful hardware accelerators (uniformly called XPU) that support Tensor Computation Runtimes (TCRs). TCRs, which are deep learning frameworks along with their runtimes and compilers, provide tensor-based interfaces to users to easily utilize specialized hardware accelerators without delving into the complex low-level programming details. 
However, building an efficient tensor-based graph processing framework is non-trivial. Thus, we make the following efforts: (1) propose a tensor-centric computation model for users to implement graph algorithms with easy-to-use programming interfaces; (2) provide a set of graph operators implemented by tensor to shield the computation model from the detailed tensor operators so that Tgraph can be easily migrated and deployed across different TCRs; (3) design a tensor-based graph compression and computation strategy and an out-of-XPU-memory computation strategy to handle large graphs. We conduct extensive experiments on multiple graph algorithms (BFS, WCC, SSSP, etc.), which validate that Tgraph not only outperforms seven state-of-the-art graph systems, but also can be smoothly deployed and run on multiple DL frameworks (PyTorch and TensorFlow) and hardware backends (Nvidia GPU, AMD GPU, and Apple MPS).<\/jats:p>","DOI":"10.1145\/3709731","type":"journal-article","created":{"date-parts":[[2025,2,11]],"date-time":"2025-02-11T15:45:06Z","timestamp":1739288706000},"page":"1-27","source":"Crossref","is-referenced-by-count":1,"title":["TGraph: A Tensor-centric Graph Processing Framework"],"prefix":"10.1145","volume":"3","author":[{"ORCID":"https:\/\/orcid.org\/0009-0008-7286-7582","authenticated-orcid":false,"given":"Yongliang","family":"Zhang","sequence":"first","affiliation":[{"name":"Wuhan University, Wuhan, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-3422-8017","authenticated-orcid":false,"given":"Yuanyuan","family":"Zhu","sequence":"additional","affiliation":[{"name":"Wuhan University, Wuhan, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-0026-9283","authenticated-orcid":false,"given":"Hao","family":"Zhang","sequence":"additional","affiliation":[{"name":"Huawei Technologies, Beijing, 
China"}]},{"ORCID":"https:\/\/orcid.org\/0009-0006-5483-3317","authenticated-orcid":false,"given":"Congli","family":"Gao","sequence":"additional","affiliation":[{"name":"Huawei Technologies, Beijing, China"}]},{"ORCID":"https:\/\/orcid.org\/0009-0006-0781-7461","authenticated-orcid":false,"given":"Yuyang","family":"Wang","sequence":"additional","affiliation":[{"name":"Wuhan University, Wuhan, China"}]},{"ORCID":"https:\/\/orcid.org\/0009-0000-2175-1813","authenticated-orcid":false,"given":"Guojing","family":"Li","sequence":"additional","affiliation":[{"name":"Wuhan University, Wuhan, China"}]},{"ORCID":"https:\/\/orcid.org\/0009-0002-5595-8008","authenticated-orcid":false,"given":"Tianyang","family":"Xu","sequence":"additional","affiliation":[{"name":"Wuhan University, Wuhan, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-9376-818X","authenticated-orcid":false,"given":"Ming","family":"Zhong","sequence":"additional","affiliation":[{"name":"Wuhan University, Wuhan, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-0051-0046","authenticated-orcid":false,"given":"Jiawei","family":"Jiang","sequence":"additional","affiliation":[{"name":"Wuhan University, Wuhan, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-4667-5794","authenticated-orcid":false,"given":"Tieyun","family":"Qian","sequence":"additional","affiliation":[{"name":"Wuhan University, Wuhan, China"}]},{"ORCID":"https:\/\/orcid.org\/0009-0003-9997-3901","authenticated-orcid":false,"given":"Chenyi","family":"Zhang","sequence":"additional","affiliation":[{"name":"Huawei Technologies, Beijing, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-9738-827X","authenticated-orcid":false,"given":"Jeffrey Xu","family":"Yu","sequence":"additional","affiliation":[{"name":"The Chinese University of Hong Kong, Hong Kong, China"}]}],"member":"320","published-online":{"date-parts":[[2025,2,11]]},"reference":[{"key":"e_1_2_1_1_1","volume-title":"Benoit Steiner, Paul A. 
Tucker, Vijay Vasudevan, Pete Warden, Martin Wicke, Yuan Yu, and Xiaoqiang Zheng.","author":"Abadi Mart\u00edn","year":"2016","unstructured":"Mart\u00edn Abadi, Paul Barham, Jianmin Chen, Zhifeng Chen, Andy Davis, Jeffrey Dean, Matthieu Devin, Sanjay Ghemawat, Geoffrey Irving, Michael Isard, Manjunath Kudlur, Josh Levenberg, Rajat Monga, Sherry Moore, Derek Gordon Murray, Benoit Steiner, Paul A. Tucker, Vijay Vasudevan, Pete Warden, Martin Wicke, Yuan Yu, and Xiaoqiang Zheng. 2016. TensorFlow: A System for Large-Scale Machine Learning. In OSDI. 265--283."},{"key":"e_1_2_1_2_1","unstructured":"AutomataLab. 2020. Subway. https:\/\/github.com\/AutomataLab\/Subway"},{"key":"e_1_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1145\/3018743.3018756"},{"key":"e_1_2_1_4_1","volume-title":"Survey and taxonomy of lossless graph compression and space-efficient graph representations. arXiv preprint arXiv:1806.01799","author":"Besta Maciej","year":"2018","unstructured":"Maciej Besta and Torsten Hoefler. 2018. Survey and taxonomy of lossless graph compression and space-efficient graph representations. arXiv preprint arXiv:1806.01799 (2018)."},{"key":"e_1_2_1_5_1","doi-asserted-by":"crossref","unstructured":"Paolo Boldi Marco Rosa Massimo Santini and Sebastiano Vigna. 2011. Layered label propagation: a multiresolution coordinate-free ordering for compressing social networks. In WWW. 587--596.","DOI":"10.1145\/1963405.1963488"},{"key":"e_1_2_1_6_1","doi-asserted-by":"crossref","unstructured":"Paolo Boldi and Sebastiano Vigna. 2004. The webgraph framework I: compression techniques. In WWW. 595--602.","DOI":"10.1145\/988672.988752"},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-03784-9_3"},{"key":"e_1_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1177\/1094342011403516"},{"key":"e_1_2_1_9_1","volume-title":"MXNet: A Flexible and Efficient Machine Learning Library for Heterogeneous Distributed Systems. 
CoRR","author":"Chen Tianqi","year":"2015","unstructured":"Tianqi Chen, Mu Li, Yutian Li, Min Lin, Naiyan Wang, Minjie Wang, Tianjun Xiao, Bing Xu, Chiyuan Zhang, and Zheng Zhang. 2015. MXNet: A Flexible and Efficient Machine Learning Library for Heterogeneous Distributed Systems. CoRR, Vol. abs\/1512.01274 (2015). arXiv:1512.01274"},{"key":"e_1_2_1_10_1","volume-title":"TVM: An Automated End-to-End Optimizing Compiler for Deep Learning. In OSDI. 578--594.","author":"Chen Tianqi","year":"2018","unstructured":"Tianqi Chen, Thierry Moreau, Ziheng Jiang, Lianmin Zheng, Eddie Q. Yan, Haichen Shen, Meghan Cowan, Leyuan Wang, Yuwei Hu, Luis Ceze, Carlos Guestrin, and Arvind Krishnamurthy. 2018. TVM: An Automated End-to-End Optimizing Compiler for Deep Learning. In OSDI. 578--594."},{"key":"e_1_2_1_11_1","doi-asserted-by":"crossref","unstructured":"Xinyu Chen Hongshi Tan Yao Chen Bingsheng He Weng-Fai Wong and Deming Chen. 2021. ThunderGP: HLS-based Graph Processing Framework on FPGAs. In FPGA. 69--80.","DOI":"10.1145\/3431920.3439290"},{"key":"e_1_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.1145\/3588684"},{"key":"e_1_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.1145\/2168836.2168846"},{"key":"e_1_2_1_14_1","doi-asserted-by":"crossref","unstructured":"Yuze Chi Guohao Dai Yu Wang Guangyu Sun Guoliang Li and Huazhong Yang. 2016. NXgraph: An efficient graph processing system on a single machine. In ICDE. 
409--420.","DOI":"10.1109\/ICDE.2016.7498258"},{"key":"e_1_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1145\/2847263.2847339"},{"key":"e_1_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.1145\/2939672.2939862"},{"key":"e_1_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.14778\/3137765.3137801"},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1201\/9781003033707-22"},{"key":"e_1_2_1_19_1","volume-title":"Jes\u00fas Camacho-Rodr\u00edguez, and Matteo Interlandi.","author":"Gandhi Apurva","year":"2023","unstructured":"Apurva Gandhi, Yuki Asada, Victor Fu, Advitya Gemawat, Lihao Zhang, Rathijit Sen, Carlo Curino, Jes\u00fas Camacho-Rodr\u00edguez, and Matteo Interlandi. 2023. The Tensor Data Platform: Towards an AI-centric Database System. In CIDR."},{"key":"e_1_2_1_20_1","volume-title":"Elizeu Santos-Neto, and Matei Ripeanu.","author":"Gharaibeh Abdullah","year":"2012","unstructured":"Abdullah Gharaibeh, Lauro Beltr\u00e3o Costa, Elizeu Santos-Neto, and Matei Ripeanu. 2012. A yoke of oxen and a thousand chickens for heavy lifting graph processing. In PACT. 345--354."},{"key":"e_1_2_1_21_1","unstructured":"Apache Giraph. 2012. http:\/\/giraph.apache.org."},{"key":"e_1_2_1_22_1","unstructured":"Joseph E. Gonzalez Yucheng Low Haijie Gu Danny Bickson and Carlos Guestrin. 2012. PowerGraph: Distributed Graph-Parallel Computation on Natural Graphs. In OSDI. 17--30."},{"key":"e_1_2_1_23_1","unstructured":"Joseph E. Gonzalez Reynold S. Xin Ankur Dave Daniel Crankshaw Michael J. Franklin and Ion Stoica. 2014. GraphX: Graph Processing in a Distributed Dataflow Framework. In OSDI. 599--613."},{"key":"e_1_2_1_24_1","volume-title":"Groute: An Asynchronous Multi-GPU Programming Framework. https:\/\/github.com\/groute\/groute","year":"2020","unstructured":"groute. 2020. Groute: An Asynchronous Multi-GPU Programming Framework. https:\/\/github.com\/groute\/groute"},{"key":"e_1_2_1_25_1","volume-title":"Gunrock: CUDA\/C GPU Graph Analytics. 
https:\/\/github.com\/gunrock\/gunrock","year":"2016","unstructured":"gunrock. 2016. Gunrock: CUDA\/C GPU Graph Analytics. https:\/\/github.com\/gunrock\/gunrock"},{"key":"e_1_2_1_26_1","unstructured":"gunrock. 2022. GraphBLAST. https:\/\/github.com\/gunrock\/graphblast"},{"key":"e_1_2_1_27_1","unstructured":"Aric Hagberg Pieter J Swart and Daniel A Schult. 2008. Exploring network structure dynamics and function using NetworkX. Technical Report. Los Alamos National Laboratory (LANL) Los Alamos NM (United States)."},{"key":"e_1_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.14778\/3551793.3551833"},{"key":"e_1_2_1_29_1","doi-asserted-by":"crossref","unstructured":"Tao He Shuxian Hu Longbin Lai Dongze Li Neng Li Xue Li Lexiao Liu Xiaojian Luo Bingqing Lyu Ke Meng Sijie Shen Li Su Lei Wang Jingbo Xu Wenyuan Yu Weibin Zeng Lei Zhang Siyuan Zhang Jingren Zhou Xiaoli Zhou and Diwen Zhu. 2024. GraphScope Flex: LEGO-like Graph Computing Stack. In SIGMOD. ACM 386--399.","DOI":"10.1145\/3626246.3653383"},{"key":"e_1_2_1_30_1","volume-title":"ACM Comput. Surv.","volume":"51","author":"Heidari Safiollah","year":"2018","unstructured":"Safiollah Heidari, Yogesh Simmhan, Rodrigo N. Calheiros, and Rajkumar Buyya. 2018. Scalable Graph Processing Frameworks: A Taxonomy and Open Challenges. ACM Comput. Surv., Vol. 51, 3 (2018), 60:1--60:53."},{"key":"e_1_2_1_31_1","volume-title":"Tcudb: Accelerating database with tensor processors. In SIGMOD. 1360--1374.","author":"Hu Yu-Ching","year":"2022","unstructured":"Yu-Ching Hu, Yuliang Li, and Hung-Wei Tseng. 2022. Tcudb: Accelerating database with tensor processors. In SIGMOD. 1360--1374."},{"key":"e_1_2_1_32_1","unstructured":"IntelligentSoftwareSystems. 2013. Galois. https:\/\/github.com\/IntelligentSoftwareSystems\/Galois"},{"key":"e_1_2_1_33_1","volume-title":"Ligra: A Lightweight Graph Processing Framework for Shared Memory. https:\/\/github.com\/jshun\/ligra","year":"2013","unstructured":"jshun. 2013. 
Ligra: A Lightweight Graph Processing Framework for Shared Memory. https:\/\/github.com\/jshun\/ligra"},{"key":"e_1_2_1_34_1","doi-asserted-by":"crossref","unstructured":"Jeremy Kepner. 2017. Graphblas mathematics-provisional release 1.0. GraphBLAS.org Tech. Rep. (2017).","DOI":"10.1109\/HPEC.2017.8091098"},{"key":"e_1_2_1_35_1","doi-asserted-by":"crossref","unstructured":"Jeremy Kepner Peter Aaltonen David Bader Aydin Bulu\u00e7 Franz Franchetti John Gilbert Dylan Hutchison Manoj Kumar Andrew Lumsdaine Henning Meyerhenke et al. 2016. Mathematical foundations of the GraphBLAS. In HPEC. 1--9.","DOI":"10.1109\/HPEC.2016.7761646"},{"key":"e_1_2_1_36_1","doi-asserted-by":"crossref","unstructured":"Jeremy Kepner and John Gilbert. 2011. Graph algorithms in the language of linear algebra. SIAM.","DOI":"10.1137\/1.9780898719918"},{"key":"e_1_2_1_37_1","volume-title":"Bhuyan","author":"Khorasani Farzad","year":"2014","unstructured":"Farzad Khorasani, Keval Vora, Rajiv Gupta, and Laxmi N. Bhuyan. 2014. CuSha: vertex-centric graph processing on GPUs. In HPDC. 239--252."},{"key":"e_1_2_1_38_1","doi-asserted-by":"publisher","DOI":"10.14778\/3467861.3467869"},{"key":"e_1_2_1_39_1","unstructured":"Aapo Kyrola Guy E. Blelloch and Carlos Guestrin. 2012. GraphChi: Large-Scale Graph Computation on Just a PC. In OSDI. 31--46."},{"key":"e_1_2_1_40_1","unstructured":"Jure Leskovec and Andrej Krevl. 2014. SNAP Datasets: Stanford Large Network Dataset Collection. http:\/\/snap.stanford.edu\/data."},{"key":"e_1_2_1_41_1","first-page":"708","article-title":"The Deep Learning Compiler: A Comprehensive Survey","volume":"32","author":"Li Mingzhen","year":"2021","unstructured":"Mingzhen Li, Yi Liu, Xiaoyan Liu, Qingxiao Sun, Xin You, Hailong Yang, Zhongzhi Luan, Lin Gan, Guangwen Yang, and Depei Qian. 2021. The Deep Learning Compiler: A Comprehensive Survey. TPDS, Vol. 
32, 3 (2021), 708--727.","journal-title":"TPDS"},{"key":"e_1_2_1_42_1","volume-title":"Mosaic: Processing a Trillion-Edge Graph on a Single Machine. In EuroSys. 527--543.","author":"Maass Steffen","year":"2017","unstructured":"Steffen Maass, Changwoo Min, Sanidhya Kashyap, Woon-Hak Kang, Mohan Kumar, and Taesoo Kim. 2017. Mosaic: Processing a Trillion-Edge Graph on a Single Machine. In EuroSys. 527--543."},{"key":"e_1_2_1_43_1","doi-asserted-by":"crossref","unstructured":"Grzegorz Malewicz Matthew H. Austern Aart J. C. Bik James C. Dehnert Ilan Horn Naty Leiser and Grzegorz Czajkowski. 2010. Pregel: a system for large-scale graph processing. In SIGMOD. 135--146.","DOI":"10.1145\/1807167.1807184"},{"key":"e_1_2_1_44_1","doi-asserted-by":"crossref","unstructured":"Tim Mattson David Bader Jon Berry Aydin Buluc Jack Dongarra Christos Faloutsos John Feo John Gilbert Joseph Gonzalez Bruce Hendrickson et al. 2013. Standards for graph algorithm primitives. In HPEC. 1--2.","DOI":"10.1109\/HPEC.2013.6670338"},{"key":"e_1_2_1_45_1","unstructured":"Microsoft. 2022. ONNX Runtime. https:\/\/github.com\/microsoft\/onnxruntime"},{"key":"e_1_2_1_46_1","volume-title":"Markus Weimer, and Matteo Interlandi.","author":"Nakandala Supun","year":"2020","unstructured":"Supun Nakandala, Karla Saur, Gyeong-In Yu, Konstantinos Karanasos, Carlo Curino, Markus Weimer, and Matteo Interlandi. 2020. A Tensor Compiler for Unified Machine Learning Prediction Serving. In OSDI. 899--917."},{"key":"e_1_2_1_47_1","doi-asserted-by":"crossref","unstructured":"Donald Nguyen Andrew Lenharth and Keshav Pingali. 2013. A lightweight infrastructure for graph analytics. In SOSP. 456--471.","DOI":"10.1145\/2517349.2522739"},{"key":"e_1_2_1_48_1","doi-asserted-by":"crossref","unstructured":"Eriko Nurvitadhi Gabriel Weisz Yu Wang Skand Hurkat Marie Nguyen James C. Hoe Jos\u00e9 F. Mart\u00ednez and Carlos Guestrin. 2014. GraphGen: An FPGA Framework for Vertex-Centric Graph Computation. In FCCM. 
25--28.","DOI":"10.1109\/FCCM.2014.15"},{"key":"e_1_2_1_49_1","volume-title":"PyTorch: An Imperative Style","author":"Paszke Adam","unstructured":"Adam Paszke, Sam Gross, Francisco Massa, Adam Lerer, James Bradbury, Gregory Chanan, Trevor Killeen, Zeming Lin, Natalia Gimelshein, Luca Antiga, Alban Desmaison, Andreas K\u00f6pf, Edward Z. Yang, Zachary DeVito, Martin Raison, Alykhan Tejani, Sasank Chilamkurthy, Benoit Steiner, Lu Fang, Junjie Bai, and Soumith Chintala. 2019. PyTorch: An Imperative Style, High-Performance Deep Learning Library. In NIPS. 8024--8035."},{"key":"e_1_2_1_50_1","unstructured":"Rapids. 2022. cuGraph. https:\/\/github.com\/rapidsai\/cugraph\/tree\/branch-22.04"},{"key":"e_1_2_1_51_1","volume-title":"Ahmed","author":"Rossi Ryan A.","year":"2015","unstructured":"Ryan A. Rossi and Nesreen K. Ahmed. 2015. The Network Data Repository with Interactive Graph Analytics and Visualization. In AAAI. https:\/\/networkrepository.com"},{"key":"e_1_2_1_52_1","doi-asserted-by":"crossref","unstructured":"Amitabha Roy Laurent Bindschaedler Jasmina Malicevic and Willy Zwaenepoel. 2015. Chaos: scale-out graph processing from secondary storage. In SOSP. 410--424.","DOI":"10.1145\/2815400.2815408"},{"key":"e_1_2_1_53_1","doi-asserted-by":"crossref","unstructured":"Amitabha Roy Ivo Mihailovic and Willy Zwaenepoel. 2013. X-Stream: edge-centric graph processing using streaming partitions. In SOSP. 472--488.","DOI":"10.1145\/2517349.2522740"},{"key":"e_1_2_1_54_1","first-page":"1","article-title":"Subway: minimizing data transfer during out-of-GPU-memory graph processing","volume":"12","author":"Nodehi Sabet Amir Hossein","year":"2020","unstructured":"Amir Hossein Nodehi Sabet, Zhijia Zhao, and Rajiv Gupta. 2020. Subway: minimizing data transfer during out-of-GPU-memory graph processing. In EuroSys. ACM, 12:1--12:16.","journal-title":"EuroSys. 
ACM"},{"key":"e_1_2_1_55_1","first-page":"1","article-title":"GPS: a graph processing system","volume":"22","author":"Salihoglu Semih","year":"2013","unstructured":"Semih Salihoglu and Jennifer Widom. 2013. GPS: a graph processing system. In SSDBM. 22:1--22:12.","journal-title":"SSDBM."},{"key":"e_1_2_1_56_1","volume-title":"Benchmarking for graph clustering and partitioning. Encyclopedia of social network analysis and mining Springer","author":"Sanders Peter","year":"2014","unstructured":"Peter Sanders, Christian Schulz, and Dorothea Wagner. 2014. Benchmarking for graph clustering and partitioning. Encyclopedia of social network analysis and mining Springer (2014)."},{"key":"e_1_2_1_57_1","doi-asserted-by":"crossref","unstructured":"Xuanhua Shi Junling Liang Sheng Di Bingsheng He Hai Jin Lu Lu Zhixiang Wang Xuan Luo and Jianlong Zhong. 2015. Optimization of asynchronous graph processing on GPU with hybridvcoloring model. In PPoPP. 271--272.","DOI":"10.1145\/2688500.2688542"},{"key":"e_1_2_1_58_1","volume-title":"Blelloch","author":"Shun Julian","year":"2013","unstructured":"Julian Shun and Guy E. Blelloch. 2013. Ligra: a lightweight graph processing framework for shared memory. In PPoPP. 135--146."},{"key":"e_1_2_1_59_1","volume-title":"Blelloch","author":"Shun Julian","year":"2015","unstructured":"Julian Shun, Laxman Dhulipala, and Guy E. Blelloch. 2015. Smaller and Faster: Parallel Processing of Compressed Graphs with Ligra. In DCC. 403--412."},{"key":"e_1_2_1_60_1","doi-asserted-by":"publisher","DOI":"10.14778\/3282495.3282501"},{"key":"e_1_2_1_61_1","doi-asserted-by":"publisher","DOI":"10.14778\/2809974.2809983"},{"key":"e_1_2_1_62_1","doi-asserted-by":"publisher","DOI":"10.14778\/2732232.2732238"},{"key":"e_1_2_1_63_1","unstructured":"torch_scatter. 2022. compilation issue of torch_scatter. https:\/\/github.com\/rusty1s\/pytorch_scatter\/issues\/440"},{"key":"e_1_2_1_64_1","unstructured":"Guozhang Wang Wenlei Xie Alan J. Demers and Johannes Gehrke. 2013. 
Asynchronous Large-Scale Graph Processing Made Easy. In CIDR."},{"key":"e_1_2_1_65_1","first-page":"1","article-title":"Gunrock: a high-performance graph processing library on the GPU","volume":"11","author":"Wang Yangzihao","year":"2016","unstructured":"Yangzihao Wang, Andrew A. Davidson, Yuechao Pan, Yuduo Wu, Andy Riffel, and John D. Owens. 2016. Gunrock: a high-performance graph processing library on the GPU. In PPoPP. 11:1--11:12.","journal-title":"PPoPP."},{"key":"e_1_2_1_66_1","doi-asserted-by":"publisher","DOI":"10.1038\/30918"},{"key":"e_1_2_1_67_1","doi-asserted-by":"publisher","DOI":"10.14778\/3476311.3476324"},{"key":"e_1_2_1_68_1","doi-asserted-by":"publisher","DOI":"10.14778\/2733085.2733103"},{"key":"e_1_2_1_69_1","doi-asserted-by":"publisher","DOI":"10.1109\/TPDS.2017.2743708"},{"key":"e_1_2_1_70_1","doi-asserted-by":"publisher","DOI":"10.1145\/3466795"},{"key":"e_1_2_1_71_1","doi-asserted-by":"crossref","unstructured":"Feng Zhang Bo Wu Jidong Zhai Bingsheng He and Wenguang Chen. 2017. FinePar: irregularity-aware fine-grained workload partitioning on integrated architectures. In CGO. 27--38.","DOI":"10.1109\/CGO.2017.7863726"},{"key":"e_1_2_1_72_1","doi-asserted-by":"publisher","DOI":"10.1145\/3477603"},{"key":"e_1_2_1_73_1","volume-title":"Szalay","author":"Zheng Da","year":"2015","unstructured":"Da Zheng, Disa Mhembere, Randal C. Burns, Joshua T. Vogelstein, Carey E. Priebe, and Alexander S. Szalay. 2015. FlashGraph: Processing Billion-Node Graphs on an Array of Commodity SSDs. In FAST. 45--58."},{"key":"e_1_2_1_74_1","volume-title":"Scaph: Scalable GPU-Accelerated Graph Processing with Value-Driven Differential Scheduling. In ATC. 573--588.","author":"Zheng Long","year":"2020","unstructured":"Long Zheng, Xianliang Li, Yaohui Zheng, Yu Huang, Xiaofei Liao, Hai Jin, Jingling Xue, Zhiyuan Shao, and Qiang-Sheng Hua. 2020. Scaph: Scalable GPU-Accelerated Graph Processing with Value-Driven Differential Scheduling. In ATC. 
573--588."},{"key":"e_1_2_1_75_1","first-page":"1543","article-title":"Medusa: Simplified Graph Processing on GPUs","volume":"25","author":"Zhong Jianlong","year":"2014","unstructured":"Jianlong Zhong and Bingsheng He. 2014. Medusa: Simplified Graph Processing on GPUs. TPDS, Vol. 25, 6 (2014), 1543--1552.","journal-title":"TPDS"},{"key":"e_1_2_1_76_1","unstructured":"Xiaowei Zhu Wentao Han and Wenguang Chen. 2015. GridGraph: Large-Scale Graph Processing on a Single Machine Using 2-Level Hierarchical Partitioning. In ATC. 375--386."},{"key":"e_1_2_1_77_1","doi-asserted-by":"publisher","DOI":"10.14778\/3598581.3598590"}],"container-title":["Proceedings of the ACM on Management of Data"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3709731","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3709731","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,3,31]],"date-time":"2026-03-31T18:16:15Z","timestamp":1774980975000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3709731"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,2,10]]},"references-count":77,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2025,2,10]]}},"alternative-id":["10.1145\/3709731"],"URL":"https:\/\/doi.org\/10.1145\/3709731","relation":{},"ISSN":["2836-6573"],"issn-type":[{"value":"2836-6573","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,2,10]]}}}