{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,7,22]],"date-time":"2025-07-22T10:33:56Z","timestamp":1753180436251,"version":"3.41.0"},"reference-count":58,"publisher":"Association for Computing Machinery (ACM)","issue":"4","license":[{"start":{"date-parts":[[2020,9,30]],"date-time":"2020-09-30T00:00:00Z","timestamp":1601424000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"name":"SERB CRG","award":["CRG\/2018\/002488"],"award-info":[{"award-number":["CRG\/2018\/002488"]}]},{"name":"NSM research","award":["MeitY\/R8D\/HPC\/2(1)\/2014"],"award-info":[{"award-number":["MeitY\/R8D\/HPC\/2(1)\/2014"]}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Archit. Code Optim."],"published-print":{"date-parts":[[2020,12,31]]},"abstract":"<jats:p>\n            Graph algorithms are widely used in various applications. Their programmability and performance have garnered a lot of interest among the researchers. Being able to run these graph analytics programs on distributed systems is an important requirement. Green-Marl is a popular Domain Specific Language (DSL) for coding graph algorithms and is known for its simplicity. However, the existing Green-Marl compiler for distributed systems (Green-Marl to Pregel) can only compile limited types of Green-Marl programs (in Pregel canonical form). This severely restricts the types of parallel Green-Marl programs that can be executed on distributed systems. 
We present\n            <jats:italic>DisGCo<\/jats:italic>\n            , the first compiler to translate any general Green-Marl program to equivalent MPI program that can run on distributed systems.\n          <\/jats:p>\n          <jats:p>\n            Translating Green-Marl programs to MPI (SPMD\/MPMD style of computation, distributed memory) presents many other exciting challenges, besides the issues related to differences in syntax, as Green-Marl gives the programmer a unified view of the whole memory and allows the parallel and serial code to be inter-mixed. We first present the set of challenges involved in translating Green-Marl programs to MPI and then present a systematic approach to do the translation. We also present a few optimization techniques to improve the performance of our generated programs.\n            <jats:italic>DisGCo<\/jats:italic>\n            is the first graph DSL compiler that can handle all syntactic capabilities of a practical graph DSL like Green-Marl and generate code that can run on distributed systems. Our preliminary evaluation of\n            <jats:italic>DisGCo<\/jats:italic>\n            shows that our generated programs are scalable. Further, compared to the state-of-the-art DH-Falcon compiler that translates a subset of Falcon programs to MPI, our generated codes exhibit a geomean speedup of 17.32\u00d7.\n          <\/jats:p>","DOI":"10.1145\/3414469","type":"journal-article","created":{"date-parts":[[2020,9,30]],"date-time":"2020-09-30T11:23:50Z","timestamp":1601465030000},"page":"1-26","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":5,"title":["DisGCo"],"prefix":"10.1145","volume":"17","author":[{"given":"Anchu","family":"Rajendran","sequence":"first","affiliation":[{"name":"Indian Institute of Technology Madras, Chennai, Tamil Nadu, India"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-5949-0046","authenticated-orcid":false,"given":"V. 
Krishna","family":"Nandivada","sequence":"additional","affiliation":[{"name":"Indian Institute of Technology Madras, Chennai, Tamil Nadu, India"}]}],"member":"320","published-online":{"date-parts":[[2020,9,30]]},"reference":[{"key":"e_1_2_1_1_1","unstructured":"2015. Green-Marl Language Spec. Retrieved from https:\/\/docs.oracle.com\/cd\/E56133_01\/1.2.0\/Green_Marl_Language_Specification.pdf.  2015. Green-Marl Language Spec. Retrieved from https:\/\/docs.oracle.com\/cd\/E56133_01\/1.2.0\/Green_Marl_Language_Specification.pdf."},{"key":"e_1_2_1_2_1","unstructured":"2015. MPI3.1 documentation. Retrieved from https:\/\/www.mpi-forum.org\/docs\/mpi-3.1\/mpi31-report.pdf.  2015. MPI3.1 documentation. Retrieved from https:\/\/www.mpi-forum.org\/docs\/mpi-3.1\/mpi31-report.pdf."},{"key":"e_1_2_1_3_1","unstructured":"2016. Mezzanine Adapters. Retrieved from http:\/\/www.mellanox.com\/related-docs\/user_manuals.  2016. Mezzanine Adapters. Retrieved from http:\/\/www.mellanox.com\/related-docs\/user_manuals."},{"key":"e_1_2_1_4_1","unstructured":"2019. MPICH Home Page. Retrieved from http:\/\/www.mcs.anl.gov\/mpi\/mpich2.  2019. MPICH Home Page. Retrieved from http:\/\/www.mcs.anl.gov\/mpi\/mpich2."},{"volume-title":"Proceedings of the IEEE International Congress on Big Data (BigData Congress). 18--25","author":"Abdolrashidi A.","key":"e_1_2_1_5_1","unstructured":"A. Abdolrashidi and L. Ramaswamy . 2016. Continual and cost-effective partitioning of dynamic graphs for optimizing big graph processing systems . In Proceedings of the IEEE International Congress on Big Data (BigData Congress). 18--25 . A. Abdolrashidi and L. Ramaswamy. 2016. Continual and cost-effective partitioning of dynamic graphs for optimizing big graph processing systems. In Proceedings of the IEEE International Congress on Big Data (BigData Congress). 18--25."},{"volume-title":"Proceedings of the World Wide Web Conference. 37--48","author":"Ahmed A.","key":"e_1_2_1_6_1","unstructured":"A. Ahmed , N. 
Shervashidze , S. Narayanamurthy , V. Josifovski , and A. J. Smola . 2013. Distributed large-scale natural graph factorization . In Proceedings of the World Wide Web Conference. 37--48 . A. Ahmed, N. Shervashidze, S. Narayanamurthy, V. Josifovski, and A. J. Smola. 2013. Distributed large-scale natural graph factorization. In Proceedings of the World Wide Web Conference. 37--48."},{"volume-title":"Proceedings of the Conference on Programming Language Design and Implementation. 126--138","author":"Amarasinghe S. P.","key":"e_1_2_1_7_1","unstructured":"S. P. Amarasinghe and M. S. Lam . 1993. Communication optimization and code generation for distributed memory machines . In Proceedings of the Conference on Programming Language Design and Implementation. 126--138 . S. P. Amarasinghe and M. S. Lam. 1993. Communication optimization and code generation for distributed memory machines. In Proceedings of the Conference on Programming Language Design and Implementation. 126--138."},{"volume-title":"Proceedings of the ACM Symposium on Parallelism in Algorithms and Architectures. 120--124","author":"Andreev K.","key":"e_1_2_1_8_1","unstructured":"K. Andreev and H. R\u00e4cke . 2004. Balanced graph partitioning . In Proceedings of the ACM Symposium on Parallelism in Algorithms and Architectures. 120--124 . K. Andreev and H. R\u00e4cke. 2004. Balanced graph partitioning. In Proceedings of the ACM Symposium on Parallelism in Algorithms and Architectures. 120--124."},{"volume-title":"Proceedings of the International Parallel and Distributed Processing Symposium. 1--12","author":"Bader A.","key":"e_1_2_1_9_1","unstructured":"A. Bader and K. Madduri . 2008. SNAP, small-world network analysis and partitioning: An open-source parallel graph framework for the exploration of large-scale networks . In Proceedings of the International Parallel and Distributed Processing Symposium. 1--12 . A. Bader and K. Madduri. 2008. 
SNAP, small-world network analysis and partitioning: An open-source parallel graph framework for the exploration of large-scale networks. In Proceedings of the International Parallel and Distributed Processing Symposium. 1--12."},{"volume-title":"Proceedings of the Symposium on Principles and Practice of Parallel Programming. 271--282","author":"Bikshandi G.","key":"e_1_2_1_10_1","unstructured":"G. Bikshandi , J. G. Castanos , S. B. Kodali , V. K. Nandivada , I. Peshansky , V. A. Saraswat , S. Sur , P. Varma , and T. Wen . 2009. Efficient, portable implementation of asynchronous multi-place programs . In Proceedings of the Symposium on Principles and Practice of Parallel Programming. 271--282 . G. Bikshandi, J. G. Castanos, S. B. Kodali, V. K. Nandivada, I. Peshansky, V. A. Saraswat, S. Sur, P. Varma, and T. Wen. 2009. Efficient, portable implementation of asynchronous multi-place programs. In Proceedings of the Symposium on Principles and Practice of Parallel Programming. 271--282."},{"key":"e_1_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-1-4471-0763-7_2"},{"key":"e_1_2_1_12_1","doi-asserted-by":"crossref","unstructured":"A. Chan and F. Dehne. 2003. CGMgraph\/CGMlib: Implementing and testing CGM graph algorithms on PC clusters. In Recent Advances in Parallel Virtual Machine and Message Passing Interface. 117--125.  A. Chan and F. Dehne. 2003. CGMgraph\/CGMlib: Implementing and testing CGM graph algorithms on PC clusters. In Recent Advances in Parallel Virtual Machine and Message Passing Interface. 117--125.","DOI":"10.1007\/978-3-540-39924-7_20"},{"volume-title":"Proceedings of the IEEE International Conference on Cluster Computing. 439--450","author":"Cheramangalath U.","key":"e_1_2_1_13_1","unstructured":"U. Cheramangalath , R. Nasre , and Y. N. Srikant . 2017. DH-Falcon: A language for large-scale graph processing on distributed heterogeneous systems . In Proceedings of the IEEE International Conference on Cluster Computing. 439--450 . U. 
Cheramangalath, R. Nasre, and Y. N. Srikant. 2017. DH-Falcon: A language for large-scale graph processing on distributed heterogeneous systems. In Proceedings of the IEEE International Conference on Cluster Computing. 439--450."},{"volume-title":"Proceedings of the Conference on Programming Language Design and Implementation. 304--315","author":"Cherem S.","key":"e_1_2_1_14_1","unstructured":"S. Cherem , T. Chilimbi , and S. Gulwani . 2008. Inferring locks for atomic sections . In Proceedings of the Conference on Programming Language Design and Implementation. 304--315 . S. Cherem, T. Chilimbi, and S. Gulwani. 2008. Inferring locks for atomic sections. In Proceedings of the Conference on Programming Language Design and Implementation. 304--315."},{"key":"e_1_2_1_15_1","unstructured":"T. H. Cormen C. E. Leiserson R. L. Rivest and C. Stein. 2009. Introduction to Algorithms (3rd ed.). The MIT Press Cambridge MA.  T. H. Cormen C. E. Leiserson R. L. Rivest and C. Stein. 2009. Introduction to Algorithms (3rd ed.). The MIT Press Cambridge MA."},{"volume-title":"Proceedings of the ACM\/IEEE Supercomputing Conference. 398--406","author":"Cytron R.","key":"e_1_2_1_16_1","unstructured":"R. Cytron , J. Lipkis , and E. Schonberg . 1990. A compiler-assisted approach to SPMD execution . In Proceedings of the ACM\/IEEE Supercomputing Conference. 398--406 . R. Cytron, J. Lipkis, and E. Schonberg. 1990. A compiler-assisted approach to SPMD execution. In Proceedings of the ACM\/IEEE Supercomputing Conference. 398--406."},{"key":"e_1_2_1_17_1","volume-title":"Proceedings of the 39th ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI\u201918)","author":"Dathathri R.","year":"2018","unstructured":"R. Dathathri , G. Gill , L. Hoang , H. Dang , A. Brooks , N. Dryden , M. Snir , and K. Pingali . 2018. Gluon: A communication-optimizing substrate for distributed heterogeneous graph analytics . 
In Proceedings of the 39th ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI\u201918) . ACM, New York, NY, 752--768. DOI:https:\/\/doi.org\/10.1145\/3192366.3192404 R. Dathathri, G. Gill, L. Hoang, H. Dang, A. Brooks, N. Dryden, M. Snir, and K. Pingali. 2018. Gluon: A communication-optimizing substrate for distributed heterogeneous graph analytics. In Proceedings of the 39th ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI\u201918). ACM, New York, NY, 752--768. DOI:https:\/\/doi.org\/10.1145\/3192366.3192404"},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1002\/cpe.3758"},{"volume-title":"Proceedings of the European Conference on Parallel Processing. 249--264","author":"Gill G.","key":"e_1_2_1_19_1","unstructured":"G. Gill , R. Dathathri , L. Hoang , A. Lenharth , and K. Pingali . 2018. Abelian: A compiler for graph analytics on distributed, heterogeneous platforms . In Proceedings of the European Conference on Parallel Processing. 249--264 . G. Gill, R. Dathathri, L. Hoang, A. Lenharth, and K. Pingali. 2018. Abelian: A compiler for graph analytics on distributed, heterogeneous platforms. In Proceedings of the European Conference on Parallel Processing. 249--264."},{"volume-title":"Proceedings of the 10th USENIX Conference on Operating Systems Design and Implementation (OSDI\u201912)","author":"Gonzalez J. E.","key":"e_1_2_1_20_1","unstructured":"J. E. Gonzalez , Y. Low , H. Gu , D. Bickson , and C. Guestrin . 2012. PowerGraph: Distributed graph-parallel computation on natural graphs . In Proceedings of the 10th USENIX Conference on Operating Systems Design and Implementation (OSDI\u201912) . USENIX Association, Berkeley, CA, 17--30. Retrieved from http:\/\/dl.acm.org\/citation.cfm?id=2387880.2387883. J. E. Gonzalez, Y. Low, H. Gu, D. Bickson, and C. Guestrin. 2012. PowerGraph: Distributed graph-parallel computation on natural graphs. 
In Proceedings of the 10th USENIX Conference on Operating Systems Design and Implementation (OSDI\u201912). USENIX Association, Berkeley, CA, 17--30. Retrieved from http:\/\/dl.acm.org\/citation.cfm?id=2387880.2387883."},{"volume-title":"Proceedings of the IFIP Working Conference on Modelling in Data Base Management Systems.","author":"Gray J.","key":"e_1_2_1_21_1","unstructured":"J. Gray , R. A. Lorie , G. R. Putzolu , and I. L. Traiger . 1976. Granularity of locks and degrees of consistency in a shared data base . In Proceedings of the IFIP Working Conference on Modelling in Data Base Management Systems. J. Gray, R. A. Lorie, G. R. Putzolu, and I. L. Traiger. 1976. Granularity of locks and degrees of consistency in a shared data base. In Proceedings of the IFIP Working Conference on Modelling in Data Base Management Systems."},{"volume-title":"Proceedings of the International Conference on Very Large Data Bases. 428--451","author":"Gray J. N.","key":"e_1_2_1_22_1","unstructured":"J. N. Gray , R. A. Lorie , and G. R. Putzolu . 1975. Granularity of locks in a shared data base . In Proceedings of the International Conference on Very Large Data Bases. 428--451 . J. N. Gray, R. A. Lorie, and G. R. Putzolu. 1975. Granularity of locks in a shared data base. In Proceedings of the International Conference on Very Large Data Bases. 428--451."},{"volume-title":"Proceedings of the ACM SIGPLAN International Conference on Object-oriented Programming, Systems, Languages, and Applications. 423--437","author":"Gregor D.","key":"e_1_2_1_23_1","unstructured":"D. Gregor and A. Lumsdaine . 2005. Lifting sequential graph algorithms for distributed-memory parallel computation . In Proceedings of the ACM SIGPLAN International Conference on Object-oriented Programming, Systems, Languages, and Applications. 423--437 . D. Gregor and A. Lumsdaine. 2005. Lifting sequential graph algorithms for distributed-memory parallel computation. 
In Proceedings of the ACM SIGPLAN International Conference on Object-oriented Programming, Systems, Languages, and Applications. 423--437."},{"volume-title":"Proceedings of the PVM\/MPI Users\u2019 Group Conference. 272--280","author":"Gropp W. D.","key":"e_1_2_1_24_1","unstructured":"W. D. Gropp and R. Thakur . 2007. Revealing the performance of MPI RMA implementations . In Proceedings of the PVM\/MPI Users\u2019 Group Conference. 272--280 . W. D. Gropp and R. Thakur. 2007. Revealing the performance of MPI RMA implementations. In Proceedings of the PVM\/MPI Users\u2019 Group Conference. 272--280."},{"key":"e_1_2_1_25_1","unstructured":"F. Hielscher and P. Gottschling. 2004. ParGraph. Retrieved from http:\/\/pargraph.sourceforge.net\/.  F. Hielscher and P. Gottschling. 2004. ParGraph. Retrieved from http:\/\/pargraph.sourceforge.net\/."},{"key":"e_1_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.1145\/2780584"},{"volume-title":"Proceedings of the International Conference on Architectural Support for Programming Languages and Operating Systems. 349--362","author":"Hong S.","key":"e_1_2_1_27_1","unstructured":"S. Hong , H. Chafi , E. Sedlar , and K. Olukotun . 2012. Green-Marl: A DSL for easy and efficient graph analysis . In Proceedings of the International Conference on Architectural Support for Programming Languages and Operating Systems. 349--362 . S. Hong, H. Chafi, E. Sedlar, and K. Olukotun. 2012. Green-Marl: A DSL for easy and efficient graph analysis. In Proceedings of the International Conference on Architectural Support for Programming Languages and Operating Systems. 349--362."},{"volume-title":"Proceedings of the Annual IEEE\/ACM International Symposium on Code Generation and Optimization (CGO\u201914)","author":"Hong S.","key":"e_1_2_1_28_1","unstructured":"S. Hong , S. Salihoglu , J. Widom , and K. Olukotun . 2014. Simplifying scalable graph processing with a domain-specific language . 
In Proceedings of the Annual IEEE\/ACM International Symposium on Code Generation and Optimization (CGO\u201914) . ACM, New York, NY. DOI:https:\/\/doi.org\/10.1145\/2581122.2544162 S. Hong, S. Salihoglu, J. Widom, and K. Olukotun. 2014. Simplifying scalable graph processing with a domain-specific language. In Proceedings of the Annual IEEE\/ACM International Symposium on Code Generation and Optimization (CGO\u201914). ACM, New York, NY. DOI:https:\/\/doi.org\/10.1145\/2581122.2544162"},{"key":"e_1_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.5555\/305219.305248"},{"volume-title":"Proceedings of the European Conference on Computer Systems. 169--182","author":"Khayyat Z.","key":"e_1_2_1_30_1","unstructured":"Z. Khayyat , K. Awara , A. Alonazi , H. Jamjoom , D. Williams , and P. Kalnis . 2013. Mizan: A system for dynamic load balancing in large-scale graph processing . In Proceedings of the European Conference on Computer Systems. 169--182 . Z. Khayyat, K. Awara, A. Alonazi, H. Jamjoom, D. Williams, and P. Kalnis. 2013. Mizan: A system for dynamic load balancing in large-scale graph processing. In Proceedings of the European Conference on Computer Systems. 169--182."},{"volume-title":"Proceedings of the International Conference on Supercomputing. 341--352","author":"Kim J.","key":"e_1_2_1_31_1","unstructured":"J. Kim , S. Seo , J. Lee , J. Nah , G. Jo , and J. Lee . 2012. SnuCL: An OpenCL framework for heterogeneous CPU\/GPU clusters . In Proceedings of the International Conference on Supercomputing. 341--352 . J. Kim, S. Seo, J. Lee, J. Nah, G. Jo, and J. Lee. 2012. SnuCL: An OpenCL framework for heterogeneous CPU\/GPU clusters. In Proceedings of the International Conference on Supercomputing. 341--352."},{"volume-title":"Proceedings of the 23rd IEEE International Conference on High Performance Computing, Data, and Analytics. IEEE, 42--51","author":"Li M.","key":"e_1_2_1_32_1","unstructured":"M. Li , X. Lu , K. Hamidouche , J. Zhang , and D. K. Panda . 2016. 
Mizan-RMA: Accelerating Mizan graph processing framework with MPI RMA . In Proceedings of the 23rd IEEE International Conference on High Performance Computing, Data, and Analytics. IEEE, 42--51 . M. Li, X. Lu, K. Hamidouche, J. Zhang, and D. K. Panda. 2016. Mizan-RMA: Accelerating Mizan graph processing framework with MPI RMA. In Proceedings of the 23rd IEEE International Conference on High Performance Computing, Data, and Analytics. IEEE, 42--51."},{"volume-title":"Proceedings of the IEEE International Conference on Cluster Computing (CLUSTER\u201914)","author":"Li M.","key":"e_1_2_1_33_1","unstructured":"M. Li , X. Lu , S. Potluri , K. Hamidouche , J. Jose , K. Tomko , and D. K. Panda . 2014. Scalable Graph500 design with MPI-3 RMA . In Proceedings of the IEEE International Conference on Cluster Computing (CLUSTER\u201914) . 230--238. M. Li, X. Lu, S. Potluri, K. Hamidouche, J. Jose, K. Tomko, and D. K. Panda. 2014. Scalable Graph500 design with MPI-3 RMA. In Proceedings of the IEEE International Conference on Cluster Computing (CLUSTER\u201914). 230--238."},{"key":"e_1_2_1_34_1","doi-asserted-by":"publisher","DOI":"10.14778\/2212351.2212354"},{"key":"e_1_2_1_35_1","unstructured":"Y. Low J. Gonzalez A. Kyrola D. Bickson C. Guestrin and J. M. Hellerstein. 2010. GraphLab: New framework for parallel machine learning. CoRR abs\/1006.4990 (2010).  Y. Low J. Gonzalez A. Kyrola D. Bickson C. Guestrin and J. M. Hellerstein. 2010. GraphLab: New framework for parallel machine learning. CoRR abs\/1006.4990 (2010)."},{"key":"e_1_2_1_36_1","first-page":"41","article-title":"2016. Concurrent hash tables: Fast and general?(!)","volume":"3","author":"Maier T.","unstructured":"T. Maier , P. Sanders , and R. Dementiev . 2016. Concurrent hash tables: Fast and general?(!) In Proceedings of the Symposium on Principles and Practice of Parallel Programming. 3 : 41 \u2013 43 :42. T. Maier, P. Sanders, and R. Dementiev. 2016. Concurrent hash tables: Fast and general?(!) 
In Proceedings of the Symposium on Principles and Practice of Parallel Programming. 3:41\u20133:42.","journal-title":"Proceedings of the Symposium on Principles and Practice of Parallel Programming."},{"volume-title":"Proceedings of the SIGMOD Conference. 135--146","author":"Malewicz G.","key":"e_1_2_1_37_1","unstructured":"G. Malewicz , M. H. Austern , A. J. Bik , J. C. Dehnert , I. Horn , N. Leiser , and G. Czajkowski . 2010. Pregel: A system for large-scale graph processing . In Proceedings of the SIGMOD Conference. 135--146 . G. Malewicz, M. H. Austern, A. J. Bik, J. C. Dehnert, I. Horn, N. Leiser, and G. Czajkowski. 2010. Pregel: A system for large-scale graph processing. In Proceedings of the SIGMOD Conference. 135--146."},{"volume-title":"Proceedings of the USENIX Annual Technical Conference. 291--305","author":"Nelson J.","key":"e_1_2_1_38_1","unstructured":"J. Nelson , B. Holt , B. Myers , P. Briggs , L. Ceze , S. Kahan , and M. Oskin . 2015. Latency-tolerant software distributed shared memory . In Proceedings of the USENIX Annual Technical Conference. 291--305 . J. Nelson, B. Holt, B. Myers, P. Briggs, L. Ceze, S. Kahan, and M. Oskin. 2015. Latency-tolerant software distributed shared memory. In Proceedings of the USENIX Annual Technical Conference. 291--305."},{"volume-title":"Proceedings of the ACM Symposium on Operating Systems Principles. 456--471","author":"Nguyen D.","key":"e_1_2_1_39_1","unstructured":"D. Nguyen , A. Lenharth , and K. Pingali . 2013. A lightweight infrastructure for graph analytics . In Proceedings of the ACM Symposium on Operating Systems Principles. 456--471 . D. Nguyen, A. Lenharth, and K. Pingali. 2013. A lightweight infrastructure for graph analytics. In Proceedings of the ACM Symposium on Operating Systems Principles. 456--471."},{"volume-title":"Proceedings of the ACM Symposium on Operating Systems Principles. 456--471","author":"Nguyen D.","key":"e_1_2_1_40_1","unstructured":"D. Nguyen , A. Lenharth , and K. Pingali . 2013. 
A lightweight infrastructure for graph analytics . In Proceedings of the ACM Symposium on Operating Systems Principles. 456--471 . D. Nguyen, A. Lenharth, and K. Pingali. 2013. A lightweight infrastructure for graph analytics. In Proceedings of the ACM Symposium on Operating Systems Principles. 456--471."},{"volume-title":"Proceedings of the Knowledge Discovery and Data Mining Conference. 1106--1114","author":"Nishimura J.","key":"e_1_2_1_41_1","unstructured":"J. Nishimura and J. Ugander . 2013. Restreaming graph partitioning: Simple versatile algorithms for advanced balancing . In Proceedings of the Knowledge Discovery and Data Mining Conference. 1106--1114 . J. Nishimura and J. Ugander. 2013. Restreaming graph partitioning: Simple versatile algorithms for advanced balancing. In Proceedings of the Knowledge Discovery and Data Mining Conference. 1106--1114."},{"volume-title":"Proceedings of the ACM SIGPLAN International Conference on Object-Oriented Programming, Systems, Languages, and Applications. 1--19","author":"Pai S.","key":"e_1_2_1_42_1","unstructured":"S. Pai and K. Pingali . 2016. A compiler for throughput optimization of graph algorithms on GPUs . In Proceedings of the ACM SIGPLAN International Conference on Object-Oriented Programming, Systems, Languages, and Applications. 1--19 . S. Pai and K. Pingali. 2016. A compiler for throughput optimization of graph algorithms on GPUs. In Proceedings of the ACM SIGPLAN International Conference on Object-Oriented Programming, Systems, Languages, and Applications. 1--19."},{"key":"e_1_2_1_43_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.parco.2011.02.004"},{"volume-title":"Proceedings of the International Workshop on Languages, Compilers, and Run-time Systems for Scalable Computers. 402--409","author":"Rauchwerger L.","key":"e_1_2_1_44_1","unstructured":"L. Rauchwerger , F. Arzu , and K. Ouchi . 1998. Standard templates adaptive parallel library (STAPL) . 
In Proceedings of the International Workshop on Languages, Compilers, and Run-time Systems for Scalable Computers. 402--409 . L. Rauchwerger, F. Arzu, and K. Ouchi. 1998. Standard templates adaptive parallel library (STAPL). In Proceedings of the International Workshop on Languages, Compilers, and Run-time Systems for Scalable Computers. 402--409."},{"volume-title":"Proceedings of the Scientific and Statistical Database Management Conference. 22: 1\u201322:12","author":"Salihoglu S.","key":"e_1_2_1_45_1","unstructured":"S. Salihoglu and J. Widom . 2013. GPS: A graph processing system . In Proceedings of the Scientific and Statistical Database Management Conference. 22: 1\u201322:12 . S. Salihoglu and J. Widom. 2013. GPS: A graph processing system. In Proceedings of the Scientific and Statistical Database Management Conference. 22:1\u201322:12."},{"key":"e_1_2_1_46_1","doi-asserted-by":"publisher","DOI":"10.14778\/2556549.2556572"},{"volume-title":"Proceedings of the SIGMOD Conference. 505--516","author":"Shao B.","key":"e_1_2_1_47_1","unstructured":"B. Shao , H. Wang , and Y. Li . 2013. Trinity: A distributed graph engine on a memory cloud . In Proceedings of the SIGMOD Conference. 505--516 . B. Shao, H. Wang, and Y. Li. 2013. Trinity: A distributed graph engine on a memory cloud. In Proceedings of the SIGMOD Conference. 505--516."},{"volume-title":"Proceedings of the Workshop on Languages and Compilers for Parallel Computing. 235--249","author":"Shashidhar G.","key":"e_1_2_1_48_1","unstructured":"G. Shashidhar and R. Nasre . 2017. LightHouse: An automatic code generator for graph algorithms on GPUs . In Proceedings of the Workshop on Languages and Compilers for Parallel Computing. 235--249 . G. Shashidhar and R. Nasre. 2017. LightHouse: An automatic code generator for graph algorithms on GPUs. In Proceedings of the Workshop on Languages and Compilers for Parallel Computing. 
235--249."},{"volume-title":"Proceedings of the Symposium on Principles and Practice of Parallel Programming. 135--146","author":"Shun J.","key":"e_1_2_1_49_1","unstructured":"J. Shun and G. E. Blelloch . 2013. Ligra: A lightweight graph processing framework for shared memory . In Proceedings of the Symposium on Principles and Practice of Parallel Programming. 135--146 . J. Shun and G. E. Blelloch. 2013. Ligra: A lightweight graph processing framework for shared memory. In Proceedings of the Symposium on Principles and Practice of Parallel Programming. 135--146."},{"volume-title":"Proceedings of the International Parallel and Distributed Processing Symposium. 646--655","author":"Slota G. M.","key":"e_1_2_1_50_1","unstructured":"G. M. Slota , S. Rajamanickam , K. Devine , and K. Madduri . 2017. Partitioning trillion-edge graphs in minutes . In Proceedings of the International Parallel and Distributed Processing Symposium. 646--655 . G. M. Slota, S. Rajamanickam, K. Devine, and K. Madduri. 2017. Partitioning trillion-edge graphs in minutes. In Proceedings of the International Parallel and Distributed Processing Symposium. 646--655."},{"volume-title":"Proceedings of the International Conference on Parallel Processing. 293--300","author":"Tipparaju V.","key":"e_1_2_1_51_1","unstructured":"V. Tipparaju , W. Gropp , H. Ritzdorf , R. Thakur , and J. L. Tr\u00e4ff . 2009. Investigating high performance RMA interfaces for the MPI-3 standard . In Proceedings of the International Conference on Parallel Processing. 293--300 . V. Tipparaju, W. Gropp, H. Ritzdorf, R. Thakur, and J. L. Tr\u00e4ff. 2009. Investigating high performance RMA interfaces for the MPI-3 standard. In Proceedings of the International Conference on Parallel Processing. 293--300."},{"key":"e_1_2_1_52_1","volume-title":"Compiler optimizations for eliminating barrier synchronization. SIGPLAN Not. 30 (Aug","author":"Tseng C.","year":"1995","unstructured":"C. Tseng . 1995. 
Compiler optimizations for eliminating barrier synchronization. SIGPLAN Not. 30 (Aug 1995 ), 144--155. C. Tseng. 1995. Compiler optimizations for eliminating barrier synchronization. SIGPLAN Not. 30 (Aug 1995), 144--155."},{"volume-title":"Proceedings of the Web Search and Data Mining Conference. 333--342","author":"Tsourakakis C.","key":"e_1_2_1_53_1","unstructured":"C. Tsourakakis , C. Gkantsidis , B. Radunovic , and M. Vojnovic . 2014. FENNEL: Streaming graph partitioning for massive scale graphs . In Proceedings of the Web Search and Data Mining Conference. 333--342 . C. Tsourakakis, C. Gkantsidis, B. Radunovic, and M. Vojnovic. 2014. FENNEL: Streaming graph partitioning for massive scale graphs. In Proceedings of the Web Search and Data Mining Conference. 333--342."},{"volume-title":"Proceedings of the IEEE International Conference on Big Data. 537--542","author":"Wang R.","key":"e_1_2_1_54_1","unstructured":"R. Wang and K. Chiu . 2013. A stream partitioning approach to processing large scale distributed graph datasets . In Proceedings of the IEEE International Conference on Big Data. 537--542 . R. Wang and K. Chiu. 2013. A stream partitioning approach to processing large scale distributed graph datasets. In Proceedings of the IEEE International Conference on Big Data. 537--542."},{"volume-title":"Proceedings of the International Symposium on Software Testing and Analysis. 389--400","author":"Yu T.","key":"e_1_2_1_55_1","unstructured":"T. Yu and M. Pradel . 2016. SyncProf: Detecting, localizing, and optimizing synchronization bottlenecks . In Proceedings of the International Symposium on Software Testing and Analysis. 389--400 . T. Yu and M. Pradel. 2016. SyncProf: Detecting, localizing, and optimizing synchronization bottlenecks. In Proceedings of the International Symposium on Software Testing and Analysis. 389--400."},{"volume-title":"Proceedings of the Symposium on Principles and Practice of Parallel Programming. 
146--147","author":"Zhang Y.","key":"e_1_2_1_56_1","unstructured":"Y. Zhang , V. C. Sreedhar , W. Zhu , V. Sarkar , and G. R. Gao . 2007. Optimized lock assignment and allocation: A method for exploiting concurrency among critical sections . In Proceedings of the Symposium on Principles and Practice of Parallel Programming. 146--147 . Y. Zhang, V. C. Sreedhar, W. Zhu, V. Sarkar, and G. R. Gao. 2007. Optimized lock assignment and allocation: A method for exploiting concurrency among critical sections. In Proceedings of the Symposium on Principles and Practice of Parallel Programming. 146--147."},{"key":"e_1_2_1_57_1","volume-title":"Proc. ACM Program. Lang. 2 (Oct.","author":"Zhang Y.","year":"2018","unstructured":"Y. Zhang , M. Yang , R. Baghdadi , S. Kamil , J. Shun , and S. Amarasinghe . 2018. GraphIt: A high-performance graph DSL . Proc. ACM Program. Lang. 2 (Oct. 2018 ). DOI:https:\/\/doi.org\/10.1145\/3276491 Y. Zhang, M. Yang, R. Baghdadi, S. Kamil, J. Shun, and S. Amarasinghe. 2018. GraphIt: A high-performance graph DSL. Proc. ACM Program. Lang. 2 (Oct. 2018). DOI:https:\/\/doi.org\/10.1145\/3276491"},{"volume-title":"Proceedings of the Symposium on Operating Systems Design and Implementation. 301--316","author":"Zhu X.","key":"e_1_2_1_58_1","unstructured":"X. Zhu , W. Chen , W. Zheng , and X. Ma . 2016. Gemini: A computation-centric distributed graph processing system . In Proceedings of the Symposium on Operating Systems Design and Implementation. 301--316 . X. Zhu, W. Chen, W. Zheng, and X. Ma. 2016. Gemini: A computation-centric distributed graph processing system. In Proceedings of the Symposium on Operating Systems Design and Implementation. 
301--316."}],"container-title":["ACM Transactions on Architecture and Code Optimization"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3414469","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3414469","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T22:38:38Z","timestamp":1750199918000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3414469"}},"subtitle":["A Compiler for Distributed Graph Analytics"],"short-title":[],"issued":{"date-parts":[[2020,9,30]]},"references-count":58,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2020,12,31]]}},"alternative-id":["10.1145\/3414469"],"URL":"https:\/\/doi.org\/10.1145\/3414469","relation":{},"ISSN":["1544-3566","1544-3973"],"issn-type":[{"type":"print","value":"1544-3566"},{"type":"electronic","value":"1544-3973"}],"subject":[],"published":{"date-parts":[[2020,9,30]]},"assertion":[{"value":"2019-12-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2020-07-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2020-09-30","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}