{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,28]],"date-time":"2026-01-28T22:33:08Z","timestamp":1769639588013,"version":"3.49.0"},"reference-count":65,"publisher":"Association for Computing Machinery (ACM)","issue":"1","license":[{"start":{"date-parts":[[2023,12,18]],"date-time":"2023-12-18T00:00:00Z","timestamp":1702857600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Des. Autom. Electron. Syst."],"published-print":{"date-parts":[[2024,1,31]]},"abstract":"<jats:p>\n            Coarse-Grained Reconfigurable Arrays (CGRA) are promising edge accelerators due to the outstanding balance in flexibility, performance, and energy efficiency. Classic CGRAs statically map compute operations onto the processing elements (PE) and route the data dependencies among the operations through the Network-on-Chip. However, CGRAs are designed for fine-grained static instruction-level parallelism and struggle to accelerate applications with dynamic and irregular data-level parallelism, such as graph processing. To address this limitation, we present\n            <jats:sc>Flip<\/jats:sc>\n            , a novel accelerator that enhances traditional CGRA architectures to boost the performance of graph applications.\n            <jats:sc>Flip<\/jats:sc>\n            retains the classic CGRA execution model while introducing a special data-centric mode for efficient graph processing. Specifically, it leverages the inherent data parallelism of graph algorithms by mapping graph vertices onto PEs rather than the operations and supporting dynamic routing of temporary data according to the runtime evolution of the graph frontier. Experimental results demonstrate that\n            <jats:sc>Flip<\/jats:sc>\n            achieves up to 36\u00d7 speedup with merely 19% more area compared to classic CGRAs. Compared to state-of-the-art large-scale graph processors,\n            <jats:sc>Flip<\/jats:sc>\n            has similar energy efficiency and 2.2\u00d7 better area efficiency at a much-reduced power\/area budget.\n          <\/jats:p>","DOI":"10.1145\/3631118","type":"journal-article","created":{"date-parts":[[2023,11,3]],"date-time":"2023-11-03T18:35:24Z","timestamp":1699036524000},"page":"1-25","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":3,"title":["<scp>Flip<\/scp>\n            : Data-centric Edge CGRA Accelerator"],"prefix":"10.1145","volume":"29","author":[{"ORCID":"https:\/\/orcid.org\/0009-0003-5260-0980","authenticated-orcid":false,"given":"Dan","family":"Wu","sequence":"first","affiliation":[{"name":"National University of Singapore, Singapore"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-0808-5451","authenticated-orcid":false,"given":"Peng","family":"Chen","sequence":"additional","affiliation":[{"name":"Chongqing University of Posts and Telecommunications, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-3413-136X","authenticated-orcid":false,"given":"Thilini Kaushalya","family":"Bandara","sequence":"additional","affiliation":[{"name":"National University of Singapore, Singapore"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-7513-9494","authenticated-orcid":false,"given":"Zhaoying","family":"Li","sequence":"additional","affiliation":[{"name":"National University of Singapore, Singapore"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-4136-4188","authenticated-orcid":false,"given":"Tulika","family":"Mitra","sequence":"additional","affiliation":[{"name":"National University of Singapore, Singapore"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2023,12,18]]},"reference":[{"key":"e_1_3_1_2_2","doi-asserted-by":"publisher","DOI":"10.1145\/2749469.2750386"},{"key":"e_1_3_1_3_2","doi-asserted-by":"publisher","DOI":"10.1109\/TCAD.2017.2706562"},{"key":"e_1_3_1_4_2","doi-asserted-by":"publisher","DOI":"10.1145\/3503222.3507772"},{"key":"e_1_3_1_5_2","doi-asserted-by":"publisher","DOI":"10.1145\/3466752.3480133"},{"key":"e_1_3_1_6_2","article-title":"Graph processing on FPGAs: Taxonomy, survey, challenges","author":"Besta Maciej","year":"2019","unstructured":"Maciej Besta, Dimitri Stanojevic, Johannes De Fine Licht, Tal Ben-Nun, and Torsten Hoefler. 2019. Graph processing on FPGAs: Taxonomy, survey, challenges. arXiv preprint arXiv:1903.06697 (2019).","journal-title":"arXiv preprint arXiv:1903.06697"},{"key":"e_1_3_1_7_2","doi-asserted-by":"publisher","DOI":"10.1109\/TPDS.2015.2475270"},{"key":"e_1_3_1_8_2","unstructured":"Aydin Buluc Scott Beamer Kamesh Madduri Krste Asanovic and David Patterson. 2017. Distributed-memory Breadth-first Search on Massive Graphs. arxiv:1705.04590 [cs.DC]"},{"key":"e_1_3_1_9_2","volume-title":"Proceedings of the 14th Workshop on Hot Topics in Operating Systems (HotOS\u201913)","author":"Cipar James","year":"2013","unstructured":"James Cipar, Qirong Ho, Jin Kyu Kim, Seunghak Lee, Gregory R. Ganger, Garth Gibson, Kimberly Keeton, and Eric Xing. 2013. Solving the straggler problem with bounded staleness. In Proceedings of the 14th Workshop on Hot Topics in Operating Systems (HotOS\u201913)."},{"key":"e_1_3_1_10_2","doi-asserted-by":"publisher","DOI":"10.1109\/ISCA52012.2021.00053"},{"key":"e_1_3_1_11_2","doi-asserted-by":"publisher","DOI":"10.1109\/MM.2022.3160862"},{"key":"e_1_3_1_12_2","doi-asserted-by":"publisher","DOI":"10.1109\/TCAD.2018.2821565"},{"key":"e_1_3_1_13_2","doi-asserted-by":"publisher","DOI":"10.5555\/2821589"},{"key":"e_1_3_1_14_2","doi-asserted-by":"publisher","DOI":"10.1109\/JSSC.2020.3048726"},{"key":"e_1_3_1_15_2","doi-asserted-by":"publisher","DOI":"10.1109\/MICRO56248.2022.00046"},{"key":"e_1_3_1_16_2","doi-asserted-by":"publisher","DOI":"10.1007\/s11390-019-1914-z"},{"key":"e_1_3_1_17_2","doi-asserted-by":"publisher","DOI":"10.1109\/WWC.2001.990739"},{"key":"e_1_3_1_18_2","doi-asserted-by":"publisher","DOI":"10.1109\/MICRO.2016.7783759"},{"key":"e_1_3_1_19_2","doi-asserted-by":"publisher","DOI":"10.1145\/2593069.2593100"},{"key":"e_1_3_1_20_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-031-01755-1"},{"key":"e_1_3_1_21_2","doi-asserted-by":"publisher","DOI":"10.1109\/ASAP.2015.7245698"},{"key":"e_1_3_1_22_2","doi-asserted-by":"publisher","DOI":"10.1145\/3061639.3062262"},{"key":"e_1_3_1_23_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICCAD45719.2019.8942148"},{"issue":"5","key":"e_1_3_1_24_2","first-page":"473","article-title":"An FPGA implementation for solving the large single-source-shortest-path problem","volume":"63","author":"Lei Guoqing","year":"2015","unstructured":"Guoqing Lei, Yong Dou, Rongchun Li, and Fei Xia. 2015. An FPGA implementation for solving the large single-source-shortest-path problem. IEEE Trans. Circ. Syst. II: Expr. Briefs 63, 5 (2015), 473\u2013477.","journal-title":"IEEE Trans. Circ. Syst. II: Expr. Briefs"},{"key":"e_1_3_1_25_2","unstructured":"Jure Leskovec and Andrej Krevl. 2014. SNAP Datasets: Stanford large network dataset collection. http:\/\/snap.stanford.edu\/data"},{"key":"e_1_3_1_26_2","doi-asserted-by":"publisher","DOI":"10.1007\/11535331_16"},{"key":"e_1_3_1_27_2","doi-asserted-by":"publisher","DOI":"10.1109\/TCAD.2021.3058313"},{"key":"e_1_3_1_28_2","article-title":"Coarse grained reconfigurable array CGRA","author":"Li Zhaoying","year":"2022","unstructured":"Zhaoying Li, D. Wijerathne, and Tulika Mitra. 2022. Coarse grained reconfigurable array CGRA. In Springer Handbook of Computer Architecture. Springer.","journal-title":"Springer Handbook of Computer Architecture"},{"key":"e_1_3_1_29_2","doi-asserted-by":"publisher","DOI":"10.1109\/HPCA53966.2022.00040"},{"key":"e_1_3_1_30_2","doi-asserted-by":"publisher","DOI":"10.1145\/3357375"},{"key":"e_1_3_1_31_2","first-page":"35","volume-title":"Proceedings of the 55th IEEE\/ACM International Symposium on Microarchitecture (MICRO\u201922)","year":"2022","unstructured":"Sihao Liu, Jian Weng, Dylan Kupsh, Atefeh Sohrabizadeh, Zhengrong Wang, Licheng Guo, Jiuyang Liu, Maxim Zhulin, Rishabh Mani, Lucheng Zhang, Jason Cong, and Tony Nowatzki. 2022. OverGen: Improving FPGA usability through domain-specific overlay generation. In Proceedings of the 55th IEEE\/ACM International Symposium on Microarchitecture (MICRO\u201922). IEEE, 35\u201356."},{"key":"e_1_3_1_32_2","article-title":"GraphLab: A new framework for parallel machine learning","author":"Low Yucheng","year":"2014","unstructured":"Yucheng Low, Joseph E. Gonzalez, Aapo Kyrola, Danny Bickson, Carlos E. Guestrin, and Joseph Hellerstein. 2014. GraphLab: A new framework for parallel machine learning. arXiv preprint arXiv:1408.2041 (2014).","journal-title":"arXiv preprint arXiv:1408.2041"},{"key":"e_1_3_1_33_2","doi-asserted-by":"publisher","DOI":"10.1109\/IPDPSW55747.2022.00118"},{"key":"e_1_3_1_34_2","doi-asserted-by":"publisher","DOI":"10.1145\/2818185"},{"key":"e_1_3_1_35_2","doi-asserted-by":"publisher","DOI":"10.1145\/1168919.1168878"},{"key":"e_1_3_1_36_2","doi-asserted-by":"publisher","DOI":"10.1145\/3466752.3480048"},{"key":"e_1_3_1_37_2","doi-asserted-by":"publisher","DOI":"10.1145\/3079856.3080255"},{"key":"e_1_3_1_38_2","doi-asserted-by":"publisher","DOI":"10.1145\/3007787.3001155"},{"key":"e_1_3_1_39_2","doi-asserted-by":"publisher","DOI":"10.1109\/HOTCHIPS.2011.7477494"},{"key":"e_1_3_1_40_2","doi-asserted-by":"publisher","DOI":"10.1109\/ACCESS.2020.3012084"},{"key":"e_1_3_1_41_2","doi-asserted-by":"publisher","DOI":"10.1109\/MICRO50266.2020.00078"},{"key":"e_1_3_1_42_2","unstructured":"Steven M. Rubin and Raj Reddy. 1977. The locus model of search and its use in image interpretation. Cambridge Massachusetts (1977) 590\u2013595."},{"key":"e_1_3_1_43_2","doi-asserted-by":"publisher","DOI":"10.1109\/HPCA.2018.00052"},{"key":"e_1_3_1_44_2","doi-asserted-by":"publisher","DOI":"10.1109\/DAC18074.2021.9586122"},{"key":"e_1_3_1_45_2","doi-asserted-by":"publisher","DOI":"10.1109\/MICRO.2003.1253203"},{"key":"e_1_3_1_46_2","doi-asserted-by":"publisher","DOI":"10.1109\/ASAP52443.2021.00029"},{"key":"e_1_3_1_47_2","doi-asserted-by":"publisher","DOI":"10.1145\/3524059.3532359"},{"key":"e_1_3_1_48_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICCD50377.2020.00070"},{"key":"e_1_3_1_49_2","doi-asserted-by":"publisher","DOI":"10.23919\/DATE51398.2021.9473955"},{"key":"e_1_3_1_50_2","doi-asserted-by":"publisher","DOI":"10.1145\/2678373.2665703"},{"key":"e_1_3_1_51_2","doi-asserted-by":"publisher","DOI":"10.1109\/2.612254"},{"key":"e_1_3_1_52_2","doi-asserted-by":"publisher","DOI":"10.1109\/A-SSCC47793.2019.9056954"},{"key":"e_1_3_1_53_2","doi-asserted-by":"publisher","DOI":"10.1109\/ISCA45697.2020.00032"},{"key":"e_1_3_1_54_2","doi-asserted-by":"publisher","DOI":"10.1109\/HPCA47549.2020.00063"},{"key":"e_1_3_1_55_2","doi-asserted-by":"publisher","DOI":"10.1145\/3489517.3530429"},{"key":"e_1_3_1_56_2","doi-asserted-by":"publisher","DOI":"10.1145\/3358177"},{"key":"e_1_3_1_57_2","volume-title":"Proceedings of the 5th Workshop on Open-Source EDA Technology (WOSET\u201922)","author":"Wijerathne Dhananjaya","year":"2022","unstructured":"Dhananjaya Wijerathne, Zhaoying Li, Manupa Karunaratne, Li-Shiuan Peh, and Tulika Mitra. 2022. Morpher: An open-source integrated compilation and simulation framework for CGRA. In Proceedings of the 5th Workshop on Open-Source EDA Technology (WOSET\u201922)."},{"key":"e_1_3_1_58_2","doi-asserted-by":"publisher","DOI":"10.1109\/TCAD.2021.3132551"},{"key":"e_1_3_1_59_2","doi-asserted-by":"publisher","DOI":"10.1109\/JPROC.2012.2190369"},{"key":"e_1_3_1_60_2","doi-asserted-by":"publisher","DOI":"10.1109\/HPCA53966.2022.00023"},{"key":"e_1_3_1_61_2","article-title":"Dynamic-II pipeline: Compiling loops with irregular branches on static-scheduling CGRA","author":"Yuan Baofen","year":"2021","unstructured":"Baofen Yuan, Jianfeng Zhu, Xingchen Man, Zijiao Ma, Shouyi Yin, Shaojun Wei, and Leibo Liu. 2021. Dynamic-II pipeline: Compiling loops with irregular branches on static-scheduling CGRA. IEEE Trans. Comput.-aid. Des. Integ. Circ. Syst. 41, 9 (2021), 2929\u20132942.","journal-title":"IEEE Trans. Comput.-aid. Des. Integ. Circ. Syst."},{"key":"e_1_3_1_62_2","doi-asserted-by":"publisher","DOI":"10.1109\/HPCA.2018.00053"},{"key":"e_1_3_1_63_2","doi-asserted-by":"publisher","DOI":"10.1109\/HPCA51647.2021.00039"},{"key":"e_1_3_1_64_2","doi-asserted-by":"publisher","DOI":"10.1109\/CCGRID.2017.114"},{"key":"e_1_3_1_65_2","doi-asserted-by":"publisher","DOI":"10.1145\/3203217.3203233"},{"key":"e_1_3_1_66_2","doi-asserted-by":"publisher","DOI":"10.1145\/3352460.3358256"}],"container-title":["ACM Transactions on Design Automation of Electronic Systems"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3631118","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3631118","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T16:35:51Z","timestamp":1750178151000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3631118"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,12,18]]},"references-count":65,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2024,1,31]]}},"alternative-id":["10.1145\/3631118"],"URL":"https:\/\/doi.org\/10.1145\/3631118","relation":{},"ISSN":["1084-4309","1557-7309"],"issn-type":[{"value":"1084-4309","type":"print"},{"value":"1557-7309","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,12,18]]},"assertion":[{"value":"2023-05-25","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2023-10-25","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2023-12-18","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}