{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T04:16:14Z","timestamp":1750220174163,"version":"3.41.0"},"reference-count":44,"publisher":"Association for Computing Machinery (ACM)","issue":"3","license":[{"start":{"date-parts":[[2022,8,18]],"date-time":"2022-08-18T00:00:00Z","timestamp":1660780800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"name":"National Science Foundation","award":["1406304 and 1955367"],"award-info":[{"award-number":["1406304 and 1955367"]}]},{"name":"Department of Energy, National Nuclear Security Administration","award":["DE-NA0003969"],"award-info":[{"award-number":["DE-NA0003969"]}]},{"name":"NVIDIA Corporation"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Parallel Comput."],"published-print":{"date-parts":[[2022,9,30]]},"abstract":"<jats:p>Graph coloring assigns a color to each vertex of a graph such that no two adjacent vertices get the same color. It is a key building block in many applications. In practice, solutions that require fewer distinct colors and that can be computed faster are typically preferred. Various coloring heuristics exist that provide different quality versus speed tradeoffs. The highest-quality heuristics tend to be slow. To improve performance, several parallel implementations have been proposed. This paper describes two improvements of the widely used LDF heuristic. First, we present a \u201cshortcutting\u201d approach to increase the parallelism by non-speculatively breaking data dependencies. Second, we present \u201ccolor reduction\u201d techniques to boost the solution of LDF. On 18 graphs from various domains, the shortcutting approach yields 2.5 times more parallelism in the mean, and the color-reduction techniques improve the result quality by up to 20%. Our deterministic CUDA implementation running on a Titan V is 2.9 times faster in the mean and uses as few or fewer colors as the best GPU codes from the literature.<\/jats:p>","DOI":"10.1145\/3543545","type":"journal-article","created":{"date-parts":[[2022,7,11]],"date-time":"2022-07-11T11:30:09Z","timestamp":1657539009000},"page":"1-35","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":0,"title":["Improving the Speed and Quality of Parallel Graph Coloring"],"prefix":"10.1145","volume":"9","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-5522-9441","authenticated-orcid":false,"given":"Ghadeer","family":"Alabandi","sequence":"first","affiliation":[{"name":"Texas State University, San Marcos, TX, United States"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-7717-3354","authenticated-orcid":false,"given":"Martin","family":"Burtscher","sequence":"additional","affiliation":[{"name":"Texas State University, San Marcos, TX, United States"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2022,8,18]]},"reference":[{"key":"e_1_3_1_2_2","doi-asserted-by":"publisher","DOI":"10.1145\/3332466.3374519"},{"key":"e_1_3_1_3_2","volume-title":"The Traveling Salesman Problem","author":"Applegate David L.","year":"2011","unstructured":"David L. Applegate, Robert E. Bixby, Va\u0161ek Chv\u00e1tal, and William J. Cook. 2011. The Traveling Salesman Problem. Princeton University Press, 2011."},{"key":"e_1_3_1_4_2","doi-asserted-by":"publisher","DOI":"10.5555\/3433701.3433833"},{"key":"e_1_3_1_5_2","unstructured":"Boost. https:\/\/www.boost.org\/doc\/libs\/1_63_0\/libs\/graph_parallel\/doc\/html\/index.html last accessed on 5\/6\/2021."},{"key":"e_1_3_1_6_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.parco.2012.07.001"},{"key":"e_1_3_1_7_2","unstructured":"Chen and Li. https:\/\/github.com\/chenxuhao\/csrcolor last accessed on 5\/6\/2021."},{"key":"e_1_3_1_8_2","doi-asserted-by":"publisher","DOI":"10.1002\/cpe.4064"},{"key":"e_1_3_1_9_2","first-page":"1","volume-title":"GPU Technology Conference","author":"Cohen Jonathan","year":"2012","unstructured":"Jonathan Cohen and Patrice Castonguay. 2012. Efficient graph matching and coloring on the GPU. In GPU Technology Conference (2012), 1\u201310."},{"key":"e_1_3_1_10_2","doi-asserted-by":"publisher","DOI":"10.1137\/S1064827595295349"},{"key":"e_1_3_1_11_2","doi-asserted-by":"publisher","DOI":"10.1007\/BF02612334"},{"key":"e_1_3_1_12_2","unstructured":"ColPack. 2022. Combinatorial scientific computing and petascale simulations. https:\/\/github.com\/CSCsw\/ColPack last accessed on 02\/07\/2022."},{"key":"e_1_3_1_13_2","unstructured":"Joseph Culberson. 1992. Iterated greedy graph coloring and the difficulty landscape. (1992)."},{"volume-title":"NVIDIA Corporation","year":"2014","key":"e_1_3_1_14_2","unstructured":"Cusparse Library. 2014. NVIDIA Corporation, Santa Clara, California. 2014."},{"key":"e_1_3_1_15_2","unstructured":"S. Dalton and N. Bell. 2021. CUSP: A C++ templated sparse matrix library. http:\/\/cusplibrary.github.io last accessed on 5\/6\/2021."},{"key":"e_1_3_1_16_2","doi-asserted-by":"publisher","DOI":"10.1109\/IPDPS.2016.54"},{"key":"e_1_3_1_17_2","unstructured":"DIMACS. 2021. Center for discrete mathematics and theoretical computer science. http:\/\/www.dis.uniroma1.it\/challenge9\/download.shtml last accessed on 5\/6\/2021."},{"key":"e_1_3_1_18_2","unstructured":"ECL-GC. 2021. Texas State University. https:\/\/cs.txstate.edu\/\u223cburtscher\/research\/ECL-GC\/ last accessed on 5\/6\/2021."},{"key":"e_1_3_1_19_2","unstructured":"Galois. 2021. ISS - The University of Texas at Austin. https:\/\/iss.oden.utexas.edu\/?p=projects\/galois last accessed on 5\/6\/2021."},{"key":"e_1_3_1_20_2","first-page":"1","volume-title":"Computers and Intractability","author":"Garey Michael R.","year":"2002","unstructured":"Michael R. Garey and David S. Johnson. 2002. Computers and Intractability. W. H. Freeman and Company, New York 29, (2002), 1\u201399."},{"key":"e_1_3_1_21_2","doi-asserted-by":"publisher","DOI":"10.1145\/2513109.2513110"},{"key":"e_1_3_1_22_2","doi-asserted-by":"publisher","DOI":"10.1002\/1096-9128(200010)12:12<1131::AID-CPE528>3.0.CO;2-2"},{"key":"e_1_3_1_23_2","unstructured":"Grappolo. 2021. The Grappolo graph toolkit. https:\/\/github.com\/luhowardmark\/GrappoloTK last accessed on 5\/6\/2021."},{"key":"e_1_3_1_24_2","doi-asserted-by":"publisher","DOI":"10.1145\/2038037.1941597"},{"key":"e_1_3_1_25_2","doi-asserted-by":"crossref","unstructured":"Kshitij Gupta Jeff A. Stuart and John D. Owens. 2012. A study of persistent threads style GPU programming for GPGPU workload s . IEEE 2012.","DOI":"10.1109\/InPar.2012.6339596"},{"key":"e_1_3_1_26_2","doi-asserted-by":"publisher","DOI":"10.1145\/2612669.2612697"},{"key":"e_1_3_1_27_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-1-4419-1153-7_1068"},{"key":"e_1_3_1_28_2","doi-asserted-by":"publisher","DOI":"10.5555\/645604.662431"},{"key":"e_1_3_1_29_2","doi-asserted-by":"publisher","DOI":"10.1145\/3208040.3208041"},{"key":"e_1_3_1_30_2","doi-asserted-by":"publisher","DOI":"10.1137\/0914041"},{"key":"e_1_3_1_31_2","unstructured":"Kokkos-Kernels. https:\/\/github.com\/kokkos\/kokkos-kernels last accessed on 5\/6\/2021."},{"key":"e_1_3_1_32_2","doi-asserted-by":"publisher","DOI":"10.1137\/0215074"},{"key":"e_1_3_1_33_2","doi-asserted-by":"publisher","DOI":"10.1145\/2351476.2351489"},{"key":"e_1_3_1_34_2","doi-asserted-by":"publisher","DOI":"10.1016\/B978-1-4832-3187-7.50015-5"},{"key":"e_1_3_1_35_2","doi-asserted-by":"publisher","DOI":"10.1145\/2370036.2145832"},{"key":"e_1_3_1_36_2","doi-asserted-by":"publisher","DOI":"10.1109\/IPDPS.2013.28"},{"key":"e_1_3_1_37_2","volume-title":"Parallel Graph Coloring with Applications to the Incomplete-LU Factorization on the GPU","author":"Naumov Maxim","year":"2015","unstructured":"Maxim Naumov, Patrice Castonguay, and Jonathan Cohen. 2015. Parallel Graph Coloring with Applications to the Incomplete-LU Factorization on the GPU. Nvidia White Paper, 2015."},{"key":"e_1_3_1_38_2","article-title":"Heuristics for the traveling salesman problem","volume":"38","author":"Nilsson Christian","year":"2003","unstructured":"Christian Nilsson. 2003. Heuristics for the traveling salesman problem. Linkoping University 38 (2003), 00085\u20139.","journal-title":"Linkoping University"},{"key":"e_1_3_1_39_2","doi-asserted-by":"publisher","DOI":"10.1109\/IPDPSW.2019.00046"},{"key":"e_1_3_1_40_2","volume-title":"The Boost Graph Library: User Guide and Reference Manual","author":"Siek Jeremy","year":"2002","unstructured":"Jeremy Siek, Andrew Lumsdaine, and Lie-Quan Lee. 2002. The Boost Graph Library: User Guide and Reference Manual. Addison-Wesley, 2002."},{"key":"e_1_3_1_41_2","doi-asserted-by":"publisher","DOI":"10.1145\/3007748.3018281"},{"key":"e_1_3_1_42_2","unstructured":"SNAP Stanford Large Network Dataset Collection. https:\/\/snap.stanford.edu\/data\/ last accessed on 5\/6\/2021."},{"key":"e_1_3_1_43_2","unstructured":"SuiteSparse Matrix Collection. https:\/\/sparse.tamu.edu\/ last accessed on 5\/6\/2021."},{"key":"e_1_3_1_44_2","volume-title":"Hacker's Delight","author":"Warren Henry S.","year":"2013","unstructured":"Henry S. Warren. 2013. Hacker's Delight. Pearson Education, 2013."},{"key":"e_1_3_1_45_2","doi-asserted-by":"publisher","DOI":"10.1093\/comjnl\/10.1.85"}],"container-title":["ACM Transactions on Parallel Computing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3543545","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3543545","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T19:00:47Z","timestamp":1750186847000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3543545"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,8,18]]},"references-count":44,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2022,9,30]]}},"alternative-id":["10.1145\/3543545"],"URL":"https:\/\/doi.org\/10.1145\/3543545","relation":{},"ISSN":["2329-4949","2329-4957"],"issn-type":[{"type":"print","value":"2329-4949"},{"type":"electronic","value":"2329-4957"}],"subject":[],"published":{"date-parts":[[2022,8,18]]},"assertion":[{"value":"2021-05-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2022-06-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2022-08-18","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}