{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,11]],"date-time":"2026-03-11T01:55:32Z","timestamp":1773194132850,"version":"3.50.1"},"publisher-location":"New York, NY, USA","reference-count":69,"publisher":"ACM","license":[{"start":{"date-parts":[[2021,10,17]],"date-time":"2021-10-17T00:00:00Z","timestamp":1634428800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by-sa\/4.0\/"}],"funder":[{"name":"DARPA","award":["HR0011-18-3-0007"],"award-info":[{"award-number":["HR0011-18-3-0007"]}]},{"DOI":"10.13039\/100000028","name":"Semiconductor Research Corporation","doi-asserted-by":"publisher","award":["2020-AH-2985"],"award-info":[{"award-number":["2020-AH-2985"]}],"id":[{"id":"10.13039\/100000028","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2021,10,18]]},"DOI":"10.1145\/3466752.3480048","type":"proceedings-article","created":{"date-parts":[[2021,10,17]],"date-time":"2021-10-17T19:16:55Z","timestamp":1634498215000},"page":"1064-1077","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":46,"title":["Fifer: Practical Acceleration of Irregular Applications on Reconfigurable Architectures"],"prefix":"10.1145","author":[{"given":"Quan M.","family":"Nguyen","sequence":"first","affiliation":[{"name":"Massachusetts Institute of Technology, United States of America"}]},{"given":"Daniel","family":"Sanchez","sequence":"additional","affiliation":[{"name":"Massachusetts Institute of Technology, United States of America"}]}],"member":"320","published-online":{"date-parts":[[2021,10,17]]},"reference":[{"key":"e_1_3_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.1145\/223982.223985"},{"key":"e_1_3_2_1_2_1","volume-title":"Sparcle: an evolutionary processor design for large-scale 
multiprocessors","author":"Agarwal Anant","year":"1993","unstructured":"Anant Agarwal, John Kubiatowicz, David\u00a0A. Kranz, Beng-Hong Lim, Donald Yeung, Godfrey D\u2019Souza, and Mike Parkin. 1993. Sparcle: an evolutionary processor design for large-scale multiprocessors. IEEE Micro 13, 3 (1993)."},{"key":"e_1_3_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1145\/325164.325119"},{"key":"e_1_3_2_1_4_1","volume":"199","author":"Alverson Robert","unstructured":"Robert Alverson, David Callahan, Daniel Cummings, Brian\u00a0D. Koblenz, Allan Porterfield, and Burton\u00a0J. Smith. 1990. The Tera computer system. In Proc. ICS\u201990.","journal-title":"J. Smith."},{"key":"e_1_3_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1145\/3085572"},{"key":"e_1_3_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1145\/1024393.1024396"},{"key":"e_1_3_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1007\/3-540-44614-1_65"},{"key":"e_1_3_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1109\/ASAP.2017.7995277"},{"key":"e_1_3_2_1_9_1","volume-title":"NVIDIA A100 Tensor Core GPU: Performance and Innovation","author":"Choquette Jack","year":"2021","unstructured":"Jack Choquette, Wishwesh Gandhi, Olivier Giroux, Nick Stam, and Ronny Krashinsky. 2021. NVIDIA A100 Tensor Core GPU: Performance and Innovation. 
IEEE Micro 41, 2 (2021)."},{"key":"e_1_3_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1145\/1807128.1807152"},{"key":"e_1_3_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1145\/3352460.3358276"},{"key":"e_1_3_2_1_12_1","unstructured":"William\u00a0James Dally and Brian\u00a0Patrick Towles. 2004. Principles and practices of interconnection networks."},{"key":"e_1_3_2_1_13_1","volume-title":"Inside 6th Gen Intel Core: New Microarchitecture Code Named Skylake. (2016). Hot Chips","author":"Doweck Jack","unstructured":"Jack Doweck and Wen-fu Kao. 2016. Inside 6th Gen Intel Core: New Microarchitecture Code Named Skylake. (2016). Hot Chips."},{"key":"e_1_3_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.1145\/1168857.1168880"},{"key":"e_1_3_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1145\/1508128.1508158"},{"key":"e_1_3_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.1109\/HPCA.2016.7446059"},{"key":"e_1_3_2_1_17_1","volume-title":"PipeRench: A Reconfigurable Architecture and Compiler. Computer 33, 4","author":"Goldstein Seth\u00a0Copen","year":"2000","unstructured":"Seth\u00a0Copen Goldstein, Herman Schmit, Mihai Budiu, Srihari Cadambi, Matthew Moe, and R.\u00a0Reed Taylor. 2000. PipeRench: A Reconfigurable Architecture and Compiler. 
Computer 33, 4 (2000)."},{"key":"e_1_3_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1145\/327070.327117"},{"key":"e_1_3_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1145\/1168857.1168877"},{"key":"e_1_3_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.1109\/HPCA.2011.5749755"},{"key":"e_1_3_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1145\/2830772.2830800"},{"key":"e_1_3_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.1109\/MICRO.2016.7783759"},{"key":"e_1_3_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.1109\/ISSCC.2014.6757323"},{"key":"e_1_3_2_1_24_1","volume-title":"Elastic CGRAs. In Proc. FPGA.","author":"Huang Yuanjie","year":"2013","unstructured":"Yuanjie Huang, Paolo Ienne, Olivier Temam, Yunji Chen, and Chengyong Wu. 2013. Elastic CGRAs. In Proc. FPGA."},{"key":"e_1_3_2_1_25_1","volume-title":"Parallel MIMD Computation: HEP Supercomputer and Its Applications","author":"Jordan F.","unstructured":"Harry\u00a0F. Jordan. 1985. Parallel MIMD Computation: HEP Supercomputer and Its Applications. MIT Press."},{"key":"e_1_3_2_1_26_1","volume-title":"IBM Power5 Chip: A Dual-Core Multithreaded Processor","author":"Kalla N.","year":"2004","unstructured":"Ronald\u00a0N. Kalla, Balaram Sinharoy, and Joel\u00a0M. Tendler. 2004. IBM Power5 Chip: A Dual-Core Multithreaded Processor. 
IEEE Micro (2004)."},{"key":"e_1_3_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.1145\/3133901"},{"key":"e_1_3_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.1145\/1950413.1950427"},{"key":"e_1_3_2_1_29_1","volume-title":"Niagara: A 32-Way Multithreaded Sparc Processor","author":"Kongetira Poonacha","year":"2005","unstructured":"Poonacha Kongetira, Kathirgamar Aingaran, and Kunle Olukotun. 2005. Niagara: A 32-Way Multithreaded Sparc Processor. IEEE Micro 25, 2 (2005)."},{"key":"e_1_3_2_1_30_1","volume-title":"Proc. SPAA.","author":"E.","unstructured":"Charles\u00a0E. Leiserson and Tao\u00a0B. Schardl. 2010. A work-efficient parallel breadth-first search algorithm (or how to cope with the nondeterminism of reducers). In Proc. SPAA."},{"key":"e_1_3_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.1145\/1669112.1669172"},{"key":"e_1_3_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.1145\/3079856.3080228"},{"key":"e_1_3_2_1_33_1","doi-asserted-by":"publisher","DOI":"10.1109\/TCAD.2018.2878183"},{"key":"e_1_3_2_1_34_1","doi-asserted-by":"publisher","DOI":"10.1145\/2593069.2593105"},{"key":"e_1_3_2_1_35_1","doi-asserted-by":"publisher","DOI":"10.1145\/1065010.1065034"},{"key":"e_1_3_2_1_36_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-540-45234-8_7"},{"key":"e_1_3_2_1_37_1","doi-asserted-by":"publisher","DOI":"10.1145\/1168857.1168878"},{"key":"e_1_3_2_1_38_1","unstructured":"Nangate Inc. 2008. The NanGate 45nm Open Cell Library. 
http:\/\/www.nangate.com\/?page_id=2325."},{"key":"e_1_3_2_1_39_1","doi-asserted-by":"publisher","DOI":"10.1145\/2517349.2522739"},{"key":"e_1_3_2_1_40_1","doi-asserted-by":"publisher","DOI":"10.1109\/FPL.2018.00046"},{"key":"e_1_3_2_1_41_1","doi-asserted-by":"publisher","DOI":"10.1109\/MICRO50266.2020.00056"},{"key":"e_1_3_2_1_42_1","volume-title":"Stream-Dataflow Acceleration. In Proc. ISCA-44","author":"Nowatzki Tony","year":"2017","unstructured":"Tony Nowatzki, Vinay Gangadhar, Newsha Ardalani, and Karthikeyan Sankaralingam. 2017. Stream-Dataflow Acceleration. In Proc. ISCA-44."},{"key":"e_1_3_2_1_43_1","doi-asserted-by":"publisher","DOI":"10.1145\/2491956.2462163"},{"key":"e_1_3_2_1_44_1","volume":"201","author":"O\u2019Connor Mike","unstructured":"Mike O\u2019Connor, Niladrish Chatterjee, Donghyuk Lee, John\u00a0M. Wilson, Aditya Agrawal, Stephen\u00a0W. Keckler, and William\u00a0J. Dally. 2017. Fine-grained DRAM: energy-efficient DRAM for extreme bandwidth systems. In Proc. MICRO-50.","journal-title":"J. Dally."},{"key":"e_1_3_2_1_45_1","doi-asserted-by":"publisher","DOI":"10.1145\/2485922.2485935"},{"key":"e_1_3_2_1_46_1","volume":"201","author":"Parashar Angshuman","unstructured":"
Angshuman Parashar, Minsoo Rhu, Anurag Mukkara, Antonio Puglielli, Rangharajan Venkatesan, Brucek Khailany, Joel\u00a0S. Emer, Stephen\u00a0W. Keckler, and William\u00a0J. Dally. 2017. SCNN: An Accelerator for Compressed-sparse Convolutional Neural Networks. In Proc. ISCA-44.","journal-title":"J. Dally."},{"key":"e_1_3_2_1_47_1","doi-asserted-by":"crossref","unstructured":"Guiqiang Peng, Leibo Liu, Sheng Zhou, Shouyi Yin, and Shaojun Wei. 2020. A 2.92-Gb\/s\/W and 0.43-Gb\/s\/MG Flexible and Scalable CGRA-Based Baseband Processor for Massive MIMO Detection. IEEE J. Solid State Circuits (2020).","DOI":"10.1109\/JSSC.2019.2952839"},{"key":"e_1_3_2_1_48_1","doi-asserted-by":"publisher","DOI":"10.1145\/3079856.3080256"},{"key":"e_1_3_2_1_49_1","doi-asserted-by":"publisher","DOI":"10.1109\/PACT.2004.1342552"},{"key":"e_1_3_2_1_50_1","volume-title":"\u00a0K. Vrudhula","author":"Shrivastava Aviral","year":"2011","unstructured":"Aviral Shrivastava, Jared Pager, Reiley Jeyapaul, Mahdi Hamzeh, and Sarma B.\u00a0K. Vrudhula. 2011. Enabling Multithreading on CGRAs. In Proc. ICPP."},{"key":"e_1_3_2_1_51_1","doi-asserted-by":"publisher","DOI":"10.1145\/2442516.2442530"},{"key":"e_1_3_2_1_52_1","doi-asserted-by":"publisher","DOI":"10.1145\/337292.337583"},{"key":"e_1_3_2_1_53_1","doi-asserted-by":"publisher","DOI":"10.1145\/1067649.801719"},{"key":"e_1_3_2_1_54_1","doi-asserted-by":"publisher","DOI":"10.14778\/2809974.2809983"},{"key":"e_1_3_2_1_55_1","doi-asserted-by":"crossref","unstructured":"Michael Sung, Ronny Krashinsky, and Krste Asanovi\u0107. 2001. 
Multithreading Decoupled Architectures for Complexity-effective General Purpose Computing. SIGARCH Comput. Archit. News (2001).","DOI":"10.1145\/563647.563658"},{"key":"e_1_3_2_1_56_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCAD.2014.7001431"},{"key":"e_1_3_2_1_57_1","doi-asserted-by":"publisher","DOI":"10.1109\/MCSoC.2015.41"},{"key":"e_1_3_2_1_58_1","volume-title":"Proc. MICRO-35","author":"Taylor B.","unstructured":"M.\u00a0B. Taylor, J. Kim, J. Miller, D. Wentzlaff, F. Ghodrat, B. Greenwald, H. Hoffman, P. Johnson, Jae-Wook Lee, W. Lee, A. Ma, A. Saraf, M. Seneski, N. Shnidman, V. Strumpen, M. Frank, S. Amarasinghe, and A. Agarwal. 2002. The Raw microprocessor: a computational fabric for software circuits and general-purpose programs. In Proc. MICRO-35."},{"key":"e_1_3_2_1_59_1","unstructured":"Michael\u00a0E. Thomadakis. 2008. The Architecture of the Nehalem Processor and Nehalem-EP SMP Platforms. (2008). Hot Chips."},{"key":"e_1_3_2_1_60_1","volume-title":"Topham and Kenneth McDougall","author":"P.","year":"1995","unstructured":"Nigel\u00a0P. Topham and Kenneth McDougall. 1995. Performance of the decoupled ACRI-1 architecture: the perfect club. 
In Proc. HPCN."},{"key":"e_1_3_2_1_61_1","doi-asserted-by":"publisher","DOI":"10.1145\/2517349.2522713"},{"key":"e_1_3_2_1_62_1","doi-asserted-by":"publisher","DOI":"10.1109\/ISCA45697.2020.00035"},{"key":"e_1_3_2_1_63_1","doi-asserted-by":"publisher","DOI":"10.1145\/2678373.2665703"},{"key":"e_1_3_2_1_64_1","doi-asserted-by":"publisher","DOI":"10.1109\/MICRO.2018.00013"},{"key":"e_1_3_2_1_65_1","doi-asserted-by":"publisher","DOI":"10.1109\/ISCA45697.2020.00032"},{"key":"e_1_3_2_1_66_1","doi-asserted-by":"publisher","DOI":"10.1109\/HPCA47549.2020.00063"},{"key":"e_1_3_2_1_67_1","unstructured":"Clifford Wolf. 2014. Yosys Open SYnthesis Suite. http:\/\/www.clifford.at\/yosys\/."},{"key":"e_1_3_2_1_68_1","doi-asserted-by":"publisher","DOI":"10.1145\/2541940.2541961"},{"key":"e_1_3_2_1_69_1","volume-title":"Proc. OOPSLA.","author":"Zhang Yunming","year":"2018","unstructured":"Yunming Zhang, Mengjiao Yang, Riyadh Baghdadi, Shoaib Kamil, Julian Shun, and Saman\u00a0P. Amarasinghe. 2018. GraphIt - A High-Performance DSL for Graph Analytics. In Proc. 
OOPSLA."}],"event":{"name":"MICRO '21: 54th Annual IEEE\/ACM International Symposium on Microarchitecture","location":"Virtual Event Greece","acronym":"MICRO '21","sponsor":["SIGMICRO ACM Special Interest Group on Microarchitectural Research and Processing"]},"container-title":["MICRO-54: 54th Annual IEEE\/ACM International Symposium on Microarchitecture"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3466752.3480048","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/abs\/10.1145\/3466752.3480048","content-type":"text\/html","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3466752.3480048","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3466752.3480048","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T21:24:52Z","timestamp":1750195492000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3466752.3480048"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,10,17]]},"references-count":69,"alternative-id":["10.1145\/3466752.3480048","10.1145\/3466752"],"URL":"https:\/\/doi.org\/10.1145\/3466752.3480048","relation":{},"subject":[],"published":{"date-parts":[[2021,10,17]]},"assertion":[{"value":"2021-10-17","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}