{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,1]],"date-time":"2026-05-01T22:44:54Z","timestamp":1777675494515,"version":"3.51.4"},"reference-count":34,"publisher":"SAGE Publications","issue":"2","license":[{"start":{"date-parts":[[2018,4,15]],"date-time":"2018-04-15T00:00:00Z","timestamp":1523750400000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/journals.sagepub.com\/page\/policies\/text-and-data-mining-license"}],"funder":[{"name":"NSF","award":["ACI-1053575"],"award-info":[{"award-number":["ACI-1053575"]}]}],"content-domain":{"domain":["journals.sagepub.com"],"crossmark-restriction":true},"short-container-title":["The International Journal of High Performance Computing Applications"],"published-print":{"date-parts":[[2019,3]]},"abstract":"<jats:p>In many scientific and computational domains, graphs are used to represent and analyze data. Such graphs often exhibit the characteristics of small-world networks: few high-degree vertexes connect many low-degree vertexes. Despite the randomness in a graph search, it is possible to capitalize on the characteristics of small-world networks and cache relevant information of high-degree vertexes. We applied this idea by caching remote vertex ids in a parallel breadth-first search benchmark. Our experiment with different implementations demonstrated significant performance improvements over the reference implementation in several configurations, using 64 to 1024 cores. We proposed a system design in which resources are dedicated exclusively to caching and shared among a set of nodes. Our evaluation demonstrates that this design reduces communication and has the potential to improve performance on large-scale systems in which the communication cost increases significantly with the distance between nodes. We also tested a memcached system as the cache server finding that its generic protocol, which does not match our usage semantics, hinders significantly the potential performance improvements and suggested that a generic system should also support a basic and lightweight communication protocol to meet the needs of high-performance computing applications. Finally, we explored different configurations to find efficient ways to utilize the resources allocated to solve a given problem size; to this extent, we found utilizing half of the compute cores per allocated node improves performance, and even in this case, caching variants always outperform the reference implementation.<\/jats:p>","DOI":"10.1177\/1094342018762510","type":"journal-article","created":{"date-parts":[[2018,4,15]],"date-time":"2018-04-15T23:17:09Z","timestamp":1523834229000},"page":"384-396","update-policy":"https:\/\/doi.org\/10.1177\/sage-journals-update-policy","source":"Crossref","is-referenced-by-count":0,"title":["Reducing communication in parallel graph search algorithms with software caches"],"prefix":"10.1177","volume":"33","author":[{"given":"Pietro","family":"Cicotti","sequence":"first","affiliation":[{"name":"San Diego Supercomputer Center, University of California, San Diego, La Jolla, CA, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Manu","family":"Shantharam","sequence":"additional","affiliation":[{"name":"San Diego Supercomputer Center, University of California, San Diego, La Jolla, CA, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Laura","family":"Carrington","sequence":"additional","affiliation":[{"name":"San Diego Supercomputer Center, University of California, San Diego, La Jolla, CA, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"179","published-online":{"date-parts":[[2018,4,15]]},"reference":[{"key":"bibr1-1094342018762510","unstructured":"(2010) Graph 500. Available at: www.graph500.org."},{"key":"bibr2-1094342018762510","doi-asserted-by":"publisher","DOI":"10.1109\/SC.2010.46"},{"key":"bibr3-1094342018762510","doi-asserted-by":"publisher","DOI":"10.1109\/ICPP.2005.55"},{"key":"bibr4-1094342018762510","doi-asserted-by":"publisher","DOI":"10.1109\/ICPP.2006.34"},{"key":"bibr5-1094342018762510","doi-asserted-by":"publisher","DOI":"10.1109\/SC.2012.50"},{"key":"bibr6-1094342018762510","volume-title":"Tech. Rep","author":"Burger D","year":"1996"},{"key":"bibr7-1094342018762510","doi-asserted-by":"publisher","DOI":"10.1109\/SC.2012.25"},{"key":"bibr8-1094342018762510","doi-asserted-by":"crossref","unstructured":"Chow E, Henderson K, Yoo A. (2005) Distributed breadth-First Search with 2-D Partitioning. LLNL.","DOI":"10.2172\/919217"},{"key":"bibr9-1094342018762510","unstructured":"Computer C. The HC series."},{"key":"bibr10-1094342018762510","unstructured":"Computer C. (2011) Convey Computer Announces Graph 500 Performance."},{"key":"bibr11-1094342018762510","unstructured":"Computer C. (2013) Convey launches Hybrid-Core Memcached Appliance."},{"key":"bibr12-1094342018762510","doi-asserted-by":"publisher","DOI":"10.1109\/SC.2010.26"},{"key":"bibr13-1094342018762510","doi-asserted-by":"publisher","DOI":"10.1016\/j.jpdc.2005.12.004"},{"key":"bibr14-1094342018762510","volume-title":"Introduction to Algorithms","author":"Cormen TH","year":"2001"},{"key":"bibr15-1094342018762510","doi-asserted-by":"publisher","DOI":"10.1109\/CLUSTER.2012.29"},{"key":"bibr16-1094342018762510","doi-asserted-by":"publisher","DOI":"10.1145\/155332.155333"},{"key":"bibr17-1094342018762510","doi-asserted-by":"publisher","DOI":"10.1109\/FCCM.2006.45"},{"key":"bibr18-1094342018762510","volume-title":"High Performance Computing in the U.S. 1995\u2014An Analysis on the Basis of the TOP500 List","author":"Dongarra JJ","year":"1995"},{"key":"bibr19-1094342018762510","volume":"5","author":"Fitzpatrick B.","year":"2004","journal-title":"Linux Journal"},{"key":"bibr20-1094342018762510","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-540-75755-9_32"},{"key":"bibr21-1094342018762510","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-540-77220-0_21"},{"key":"bibr22-1094342018762510","doi-asserted-by":"publisher","DOI":"10.1109\/PACT.2011.14"},{"key":"bibr23-1094342018762510","first-page":"985","volume":"11","author":"Leskovec J","year":"2010","journal-title":"Journal of Machine Learning Research"},{"key":"bibr24-1094342018762510","doi-asserted-by":"publisher","DOI":"10.1145\/1837274.1837289"},{"key":"bibr25-1094342018762510","doi-asserted-by":"publisher","DOI":"10.1007\/3-540-46117-5_94"},{"key":"bibr26-1094342018762510","doi-asserted-by":"publisher","DOI":"10.1109\/IPDPS.2009.5161108"},{"key":"bibr27-1094342018762510","author":"Murphy RC","year":"2010","journal-title":"Cray User Group (CUG)"},{"key":"bibr28-1094342018762510","doi-asserted-by":"publisher","DOI":"10.1109\/IPDPSW.2012.204"},{"key":"bibr29-1094342018762510","doi-asserted-by":"publisher","DOI":"10.1109\/TPDS.2007.70811"},{"key":"bibr30-1094342018762510","doi-asserted-by":"publisher","DOI":"10.1145\/2335755.2335789"},{"key":"bibr31-1094342018762510","doi-asserted-by":"publisher","DOI":"10.1109\/IISWC.2011.6114175"},{"key":"bibr32-1094342018762510","unstructured":"TACC. Stampede, Dell PowerEdge C8220 Cluster with Intel Xeon Phi coprocessors."},{"key":"bibr33-1094342018762510","volume-title":"21st international conference on parallel and distributed computing and systems (PDCS)","author":"Xia Y"},{"key":"bibr34-1094342018762510","doi-asserted-by":"publisher","DOI":"10.1109\/SC.2005.4"}],"container-title":["The International Journal of High Performance Computing Applications"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.1177\/1094342018762510","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/journals.sagepub.com\/doi\/full-xml\/10.1177\/1094342018762510","content-type":"application\/xml","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/journals.sagepub.com\/doi\/pdf\/10.1177\/1094342018762510","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.1177\/1094342018762510","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,4,29]],"date-time":"2026-04-29T08:15:40Z","timestamp":1777450540000},"score":1,"resource":{"primary":{"URL":"https:\/\/journals.sagepub.com\/doi\/10.1177\/1094342018762510"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2018,4,15]]},"references-count":34,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2019,3]]}},"alternative-id":["10.1177\/1094342018762510"],"URL":"https:\/\/doi.org\/10.1177\/1094342018762510","relation":{},"ISSN":["1094-3420","1741-2846"],"issn-type":[{"value":"1094-3420","type":"print"},{"value":"1741-2846","type":"electronic"}],"subject":[],"published":{"date-parts":[[2018,4,15]]}}}