{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,11]],"date-time":"2026-01-11T00:16:53Z","timestamp":1768090613668,"version":"3.49.0"},"publisher-location":"New York, NY, USA","reference-count":34,"publisher":"ACM","license":[{"start":{"date-parts":[[2021,10,17]],"date-time":"2021-10-17T00:00:00Z","timestamp":1634428800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"name":"JUMP"},{"name":"DARPA","award":["FA8650-18-2-7863"],"award-info":[{"award-number":["FA8650-18-2-7863"]}]},{"name":"NSF","award":["1723715"],"award-info":[{"award-number":["1723715"]}]},{"DOI":"10.13039\/100000001","name":"NSF (National Science Foundation)","doi-asserted-by":"publisher","award":["1845952"],"award-info":[{"award-number":["1845952"]}],"id":[{"id":"10.13039\/100000001","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2021,10,18]]},"DOI":"10.1145\/3466752.3480099","type":"proceedings-article","created":{"date-parts":[[2021,10,17]],"date-time":"2021-10-17T19:12:05Z","timestamp":1634497925000},"page":"392-406","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":4,"title":["Software-Defined Vector Processing on Manycore Fabrics"],"prefix":"10.1145","author":[{"given":"Philip","family":"Bedoukian","sequence":"first","affiliation":[{"name":"Cornell University, United States of America"}]},{"given":"Neil","family":"Adit","sequence":"additional","affiliation":[{"name":"Cornell University"}]},{"given":"Edwin","family":"Peguero","sequence":"additional","affiliation":[{"name":"Cornell University"}]},{"given":"Adrian","family":"Sampson","sequence":"additional","affiliation":[{"name":"Cornell University, United States of 
America"}]}],"member":"320","published-online":{"date-parts":[[2021,10,17]]},"reference":[{"key":"e_1_3_2_1_1_1","unstructured":"Alon Amid Krste Asanovic Allen Baum Alex Bradbury Tony Brewer Chris Celio Aliaksei Chapyzhenka Silviu Chiricescu Ken Dockser Bob Dreyer Roger Espasa Sean Halle John Hauser David Horner Bruce Hoult Bill Huffman Constantine Korikov Ben Korpan Hanna Kruppe Yunsup Lee Guy Lemieux Filip Moc Rich Newell Albert Ou David Patterson Colin Schmidt Alex Solomatnikov Steve Wallach Andrew Waterman and Jim Wilson. 2020. RISC-V \u201cV\u201d Vector Extension version 0.9. https:\/\/github.com\/riscv\/riscv-v-spec."},{"key":"e_1_3_2_1_2_1","volume-title":"Framework for Heterogeneous-ISA Research. In International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS).","author":"Balkind Jonathan","year":"2020","unstructured":"Jonathan Balkind, Katie Lim, Michael Schaffner, Fei Gao, Grigory Chirkov, Ang Li, Alexey Lavrov, Tri\u00a0M. Nguyen, Yaosheng Fu, Florian Zaruba, Kunal Gulati, Luca Benini, and David Wentzlaff. 2020. 
BYOC: A \u201cBring Your Own Core\u201d Framework for Heterogeneous-ISA Research. In International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS)."},{"key":"e_1_3_2_1_3_1","doi-asserted-by":"crossref","unstructured":"D. Bates A. Bradbury A. Koltes and R. Mullins. 2015. Exploiting tightly-coupled cores. In Journal of Signal Processing Systems Vol.\u00a080. 103\u2013120.","DOI":"10.1007\/s11265-014-0944-6"},{"key":"e_1_3_2_1_4_1","volume-title":"Cache Refill\/Access Decoupling for Vector Machines. In IEEE\/ACM International Symposium on Microarchitecture (MICRO).","author":"Batten Christopher","year":"2004","unstructured":"Christopher Batten, Ronny Krashinsky, Steve Gerding, and Krste Asanovi\u0107. 2004. Cache Refill\/Access Decoupling for Vector Machines. In IEEE\/ACM International Symposium on Microarchitecture (MICRO)."},{"key":"e_1_3_2_1_5_1","unstructured":"Bespoke Silicon Group. [n.d.]. HammerBlade. https:\/\/github.com\/bespoke-silicon-group\/bsg_bladerunner."},{"key":"e_1_3_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1145\/2024716.2024718"},{"key":"e_1_3_2_1_7_1","volume-title":"In Proceedings of the 6th Workshop on Multithreaded Execution, Architecture, and Compilation.","author":"Brown A.","year":"2001","unstructured":"Jeffery\u00a0A. Brown, Hong Wang, George Chrysos, Perry\u00a0H. Wang, and John\u00a0P. Shen. 2001. Speculative precomputation on chip multiprocessors. 
In Proceedings of the 6th Workshop on Multithreaded Execution, Architecture, and Compilation."},{"key":"e_1_3_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1109\/MM.2018.022071133"},{"key":"e_1_3_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1109\/MM.2005.20"},{"key":"e_1_3_2_1_10_1","unstructured":"Scott Grauer-Gray and Louis-No\u00ebl Pouchet. 2012. PolyBench\/GPU: Implementation of PolyBench codes for GPU processing. URL: http:\/\/www.cs.ucla.edu\/pouchet\/software\/polybench."},{"key":"e_1_3_2_1_11_1","volume-title":"Erasing Core Boundaries for Robust and Configurable Performance. In IEEE\/ACM International Symposium on Microarchitecture (MICRO).","author":"Gupta S.","unstructured":"S. Gupta, S. Feng, A. Ansari, and S. Mahlke. 2010. Erasing Core Boundaries for Robust and Configurable Performance. In IEEE\/ACM International Symposium on Microarchitecture (MICRO)."},{"key":"e_1_3_2_1_12_1","volume-title":"2018 IEEE International Symposium on High Performance Computer Architecture (HPCA). 608\u2013619","author":"Gutierrez A.","unstructured":"A. Gutierrez, B.\u00a0M. Beckmann, A. Dutu, J. Gross, M. LeBeane, J. Kalamatianos, O. Kayiran, M. Poremba, B. Potter, S. Puthoor, M.\u00a0D. Sinclair, M. Wyse, J. Yin, X. Zhang, A. Jain, and T. Rogers. 2018. Lost in Abstraction: Pitfalls of Analyzing GPUs at the Intermediate Language Level. 
In 2018 IEEE International Symposium on High Performance Computer Architecture (HPCA). 608\u2013619."},{"key":"e_1_3_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.1145\/1250662.1250686"},{"key":"e_1_3_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.micpro.2014.06.007"},{"key":"e_1_3_2_1_15_1","volume-title":"Composable Lightweight Processors. In IEEE\/ACM International Symposium on Microarchitecture (MICRO).","author":"Kim Changkyu","year":"2007","unstructured":"Changkyu Kim, Simha Sethumadhavan, Madhu Saravana\u00a0Sibi Govindan, Nitya Ranganathan, Divya Gulati, Doug Burger, and Stephen\u00a0W. Keckler. 2007. Composable Lightweight Processors. In IEEE\/ACM International Symposium on Microarchitecture (MICRO)."},{"key":"e_1_3_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.1109\/ISCA.2004.1310763"},{"key":"e_1_3_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCD.2012.6378694"},{"key":"e_1_3_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1145\/2000064.2000080"},{"key":"e_1_3_2_1_20_1","unstructured":"Naveen Muralimanohar Ali Shafiee and Vaishnav Srinivas. [n.d.]. CACTI 6.5. 
https:\/\/github.com\/HewlettPackard\/cacti."},{"key":"e_1_3_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1109\/MICRO.2001.991104"},{"key":"e_1_3_2_1_22_1","volume-title":"Libra: Tailoring SIMD Execution Using Heterogeneous Hardware and Dynamic Configurability. In 2012 45th Annual IEEE\/ACM International Symposium on Microarchitecture. 84\u201395","author":"Park Y.","unstructured":"Y. Park, J.\u00a0J.\u00a0K. Park, H. Park, and S. Mahlke. 2012. Libra: Tailoring SIMD Execution Using Heterogeneous Hardware and Dynamic Configurability. In 2012 45th Annual IEEE\/ACM International Symposium on Microarchitecture. 84\u201395."},{"key":"e_1_3_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.1145\/2749469.2750410"},{"key":"e_1_3_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1145\/359327.359336"},{"key":"e_1_3_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.1145\/1360612.1360617"},{"key":"e_1_3_2_1_26_1","volume-title":"Decoupled Access\/Execute Computer Architectures. In International Symposium on Computer Architecture (ISCA).","author":"Smith E.","year":"1982","unstructured":"James\u00a0E. Smith. 1982. Decoupled Access\/Execute Computer Architectures. In International Symposium on Computer Architecture (ISCA)."},{"key":"e_1_3_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.1109\/MM.2017.35"},{"key":"e_1_3_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.1145\/378993.379247"},{"key":"e_1_3_2_1_29_1","volume-title":"Stitch: Fusible Heterogeneous Accelerators Enmeshed with Many-Core Architecture for Wearables. 
In International Symposium on Computer Architecture (ISCA).","author":"Tan Cheng","year":"2018","unstructured":"Cheng Tan, Manupa Karunaratne, Tulika Mitra, and Li-Shiuan Peh. 2018. Stitch: Fusible Heterogeneous Accelerators Enmeshed with Many-Core Architecture for Wearables. In International Symposium on Computer Architecture (ISCA)."},{"key":"e_1_3_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.1145\/1391469.1391666"},{"key":"e_1_3_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.1109\/MM.2002.997877"},{"key":"e_1_3_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.1145\/305138.305188"},{"key":"e_1_3_2_1_33_1","volume-title":"High-Performance Computing on the Intel Xeon Phi: How to Fully Exploit MIC Architectures","author":"Wang Endong","unstructured":"Endong Wang, Qing Zhang, Bo Shen, Guangyong Zhang, Xiaowei Lu, Qing Wu, and Yajuan Wang. 2014. High-Performance Computing on the Intel Xeon Phi: How to Fully Exploit MIC Architectures. 
Springer."},{"key":"e_1_3_2_1_35_1","doi-asserted-by":"publisher","DOI":"10.1109\/TVLSI.2019.2926114"},{"key":"e_1_3_2_1_36_1","doi-asserted-by":"publisher","DOI":"10.1109\/HPCA.2007.346182"}],"event":{"name":"MICRO '21: 54th Annual IEEE\/ACM International Symposium on Microarchitecture","location":"Virtual Event Greece","acronym":"MICRO '21","sponsor":["SIGMICRO ACM Special Interest Group on Microarchitectural Research and Processing"]},"container-title":["MICRO-54: 54th Annual IEEE\/ACM International Symposium on Microarchitecture"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3466752.3480099","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/abs\/10.1145\/3466752.3480099","content-type":"text\/html","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3466752.3480099","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3466752.3480099","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T20:18:56Z","timestamp":1750191536000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3466752.3480099"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,10,17]]},"references-count":34,"alternative-id":["10.1145\/3466752.3480099","10.1145\/3466752"],"URL":"https:\/\/doi.org\/10.1145\/3466752.3480099","relation":{},"subject":[],"published":{"date-parts":[[2021,10,17]]},"assertion":[{"value":"2021-10-17","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}