{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,10]],"date-time":"2026-01-10T07:12:09Z","timestamp":1768029129674,"version":"3.49.0"},"reference-count":10,"publisher":"Association for Computing Machinery (ACM)","issue":"4","license":[{"start":{"date-parts":[[2016,4,22]],"date-time":"2016-04-22T00:00:00Z","timestamp":1461283200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["SIGARCH Comput. Archit. News"],"published-print":{"date-parts":[[2016,4,22]]},"abstract":"<jats:p>A parallel Breadth First Search (BFS) algorithm is proposed for cost-efficient multi-GPU systems with limited memory capacity or communication performance. By using an improved data structure for duplicate elimination of local nodes, both the required memory and the processing time are reduced. By using Unified Virtual Addressing, communication time can be hidden behind computation. The proposed algorithm is implemented on two cost-efficient multi-GPU systems: the Express multi-GPU system, which offers full flexibility but is limited by the communication latency between GPU and host, and a high-end gaming machine whose memory is limited. Both systems achieve good strong scaling with the proposed methods. 
On the Express multi-GPU system, the communication overhead was almost completely hidden, and the aggregate communication throughput reached 4.77 GB\/sec (38.16 Gbps), almost the theoretical maximum.<\/jats:p>","DOI":"10.1145\/2927964.2927975","type":"journal-article","created":{"date-parts":[[2016,4,25]],"date-time":"2016-04-25T19:51:13Z","timestamp":1461613873000},"page":"58-63","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":6,"title":["Breadth First Search on Cost-efficient Multi-GPU Systems"],"prefix":"10.1145","volume":"43","author":[{"given":"Takuji","family":"Mitsuishi","sequence":"first","affiliation":[{"name":"Keio University, Yokohama, Japan"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Jun","family":"Suzuki","sequence":"additional","affiliation":[{"name":"NEC Corporation, Kanagawa, Japan"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Yuki","family":"Hayashi","sequence":"additional","affiliation":[{"name":"NEC Corporation, Kanagawa, Japan"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Masaki","family":"Kan","sequence":"additional","affiliation":[{"name":"NEC Corporation, Kanagawa, Japan"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Hideharu","family":"Amano","sequence":"additional","affiliation":[{"name":"Keio University, Yokohama, Japan"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2016,4,22]]},"reference":[{"key":"e_1_2_1_1_1","unstructured":"Graph 500. http:\/\/www.graph500.org\/."},{"key":"e_1_2_1_2_1","unstructured":"L. Barnes. Multi-gpu programming 2013. http:\/\/ondemand.gputechconf.com\/gtc\/2013\/presentations\/S3465-Multi-GPU-Programming.pdf."},{"key":"e_1_2_1_3_1","first-page":"133","volume-title":"PKDD","author":"Leskovec J.","year":"2005"},{"key":"e_1_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.jpdc.2013.05.007"},{"key":"e_1_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1145\/2145816.2145832"},{"key":"e_1_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1145\/2693714.2693729"},{"key":"e_1_2_1_7_1","unstructured":"NEC Corporation. http:\/\/www.nec.co.jp."},{"key":"e_1_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1145\/2693714.2693717"},{"key":"e_1_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1109\/HOTI.2006.12"},{"key":"e_1_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1109\/HiPC.2013.6799136"}],"container-title":["ACM SIGARCH Computer Architecture News"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/2927964.2927975","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/2927964.2927975","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T04:56:21Z","timestamp":1750222581000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/2927964.2927975"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2016,4,22]]},"references-count":10,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2016,4,22]]}},"alternative-id":["10.1145\/2927964.2927975"],"URL":"https:\/\/doi.org\/10.1145\/2927964.2927975","relation":{},"ISSN":["0163-5964"],"issn-type":[{"value":"0163-5964","type":"print"}],"subject":[],"published":{"date-parts":[[2016,4,22]]},"assertion":[{"value":"2016-04-22","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication
 History"}}]}}