{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,12]],"date-time":"2026-03-12T01:11:22Z","timestamp":1773277882283,"version":"3.50.1"},"publisher-location":"New York, NY, USA","reference-count":60,"publisher":"ACM","license":[{"start":{"date-parts":[[2022,6,11]],"date-time":"2022-06-11T00:00:00Z","timestamp":1654905600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2022,6,18]]},"DOI":"10.1145\/3470496.3533040","type":"proceedings-article","created":{"date-parts":[[2022,5,31]],"date-time":"2022-05-31T19:06:01Z","timestamp":1654023961000},"page":"978-992","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":18,"title":["The Mozart reuse exposed dataflow processor for AI and beyond"],"prefix":"10.1145","author":[{"given":"Karthikeyan","family":"Sankaralingam","sequence":"first","affiliation":[{"name":"SimpleMachines Inc."}]},{"given":"Tony","family":"Nowatzki","sequence":"additional","affiliation":[{"name":"UCLA"}]},{"given":"Vinay","family":"Gangadhar","sequence":"additional","affiliation":[{"name":"SimpleMachines Inc."}]},{"given":"Preyas","family":"Shah","sequence":"additional","affiliation":[{"name":"SimpleMachines Inc."}]},{"given":"Michael","family":"Davies","sequence":"additional","affiliation":[{"name":"SimpleMachines Inc."}]},{"given":"William","family":"Galliher","sequence":"additional","affiliation":[{"name":"SimpleMachines Inc."}]},{"given":"Ziliang","family":"Guo","sequence":"additional","affiliation":[{"name":"SimpleMachines Inc."}]},{"given":"Jitu","family":"Khare","sequence":"additional","affiliation":[{"name":"SimpleMachines 
Inc."}]},{"given":"Deepak","family":"Vijay","sequence":"additional","affiliation":[{"name":"SimpleMachines Inc."}]},{"given":"Poly","family":"Palamuttam","sequence":"additional","affiliation":[{"name":"SimpleMachines Inc."}]},{"given":"Maghawan","family":"Punde","sequence":"additional","affiliation":[{"name":"SimpleMachines Inc."}]},{"given":"Alex","family":"Tan","sequence":"additional","affiliation":[{"name":"SimpleMachines Inc."}]},{"given":"Vijay","family":"Thiruvengadam","sequence":"additional","affiliation":[{"name":"SimpleMachines Inc."}]},{"given":"Rongyi","family":"Wang","sequence":"additional","affiliation":[{"name":"SimpleMachines Inc."}]},{"given":"Shunmiao","family":"Xu","sequence":"additional","affiliation":[{"name":"SimpleMachines Inc."}]}],"member":"320","published-online":{"date-parts":[[2022,6,11]]},"reference":[{"key":"e_1_3_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.1109\/ISCA45697.2020.00023"},{"key":"e_1_3_2_1_2_1","doi-asserted-by":"crossref","unstructured":"Ehsan K. Ardestani Changkyu Kim Seung Jae Lee Luoshang Pan Valmiki Rampersad Jens Axboe Banit Agrawal Fuxun Yu Ansha Yu Trung Le Hector Yuen Shishir Juluri Akshat Nanda Manoj Wodekar Dheevatsa Mudigere Krishnakumar Nair Maxim Naumov Chris Peterson Mikhail Smelyanskiy and Vijay Rao. 2021. Supporting Massive DLRM Inference Through Software Defined Memory. arXiv:2110.11489 [cs.AR]","DOI":"10.1109\/ICDCS54860.2022.00037"},{"key":"e_1_3_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1145\/3085572"},{"key":"e_1_3_2_1_5_1","unstructured":"Somashekaracharya G. Bhaskaracharya Julien Demouth and Vinod Grover. 2020. 
Automatic Kernel Generation for Volta Tensor Cores. arXiv:2006.12645 [cs.PL]"},{"key":"e_1_3_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1145\/2541940.2541967"},{"key":"e_1_3_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1145\/2939672.2939785"},{"key":"e_1_3_2_1_8_1","volume-title":"13th USENIX Symposium on Operating Systems Design and Implementation (OSDI 18)","author":"Chen Tianqi","year":"2018","unstructured":"Tianqi Chen, Thierry Moreau, Ziheng Jiang, Lianmin Zheng, Eddie Yan, Haichen Shen, Meghan Cowan, Leyuan Wang, Yuwei Hu, Luis Ceze, et al. 2018. {TVM}: An Automated {End-to-End} Optimizing Compiler for Deep Learning. In 13th USENIX Symposium on Operating Systems Design and Implementation (OSDI 18). 578--594."},{"key":"e_1_3_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1109\/JETCAS.2019.2910232"},{"key":"e_1_3_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.23919\/VLSICircuits52068.2021.9492517"},{"key":"e_1_3_2_1_11_1","volume-title":"Why a 24-Year-Old Chipmaker Is One of Tech's Hot Prospects. New York Times (September","author":"Clark Don","year":"2017","unstructured":"Don Clark. 2017. Why a 24-Year-Old Chipmaker Is One of Tech's Hot Prospects. 
New York Times (September 2017)."},{"key":"e_1_3_2_1_12_1","volume-title":"1st Workshop on Computer Architecture Research with RISC-V.","author":"Cook Henry","year":"2017","unstructured":"Henry Cook, Wesley Terpstra, and Yunsup Lee. 2017. Diplomatic design patterns: A TileLink case study. In 1st Workshop on Computer Architecture Research with RISC-V."},{"key":"e_1_3_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.1145\/115372.115320"},{"key":"e_1_3_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.1109\/ISCA52012.2021.00053"},{"key":"e_1_3_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1145\/3503222.3507706"},{"key":"e_1_3_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.1145\/3352460.3358276"},{"key":"e_1_3_2_1_17_1","volume-title":"XIA SONG, Subhojit Som, Kaustav Das, Saurabh T, Steve Reinhardt, Sitaram Lanka, Eric Chung, and Doug Burger.","author":"Rouhani Bita Darvish","year":"2020","unstructured":"Bita Darvish Rouhani, Daniel Lo, Ritchie Zhao, Ming Liu, Jeremy Fowers, Kalin Ovtcharov, Anna Vinogradsky, Sarah Massengill, Lita Yang, Ray Bittner, Alessandro Forin, Haishan Zhu, Taesik Na, Prerak Patel, Shuai Che, Lok Chand Koppaka, XIA SONG, Subhojit Som, Kaustav Das, Saurabh T, Steve Reinhardt, Sitaram Lanka, Eric Chung, and Doug Burger. 2020. Pushing the Limits of Narrow Precision Inferencing at Cloud Scale with Microsoft Floating Point. In Advances in Neural Information Processing Systems, H. Larochelle, M. Ranzato, R. Hadsell, M. F. Balcan, and H. Lin (Eds.), Vol. 33. Curran Associates, Inc., 10271--10281. 
https:\/\/proceedings.neurips.cc\/paper\/2020\/file\/747e32ab0fea7fbd2ad9ec03daa3f840-Paper.pdf"},{"key":"e_1_3_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1109\/JPROC.2021.3098483"},{"key":"e_1_3_2_1_19_1","volume-title":"IEEE International Parallel and Distributed Processing Symposium.","author":"Domke Jens","year":"2021","unstructured":"Jens Domke, Emil Vatai, Aleksandr Drozd, Peng Chen, Yosuke Oyama, Lingqi Zhang, Shweta Salaria, Daichi Mukunoki, Artur Podobas, Mohamed Wahib, and Satoshi Matsuoka. 2021. Matrix Engines for High Performance Computing:A Paragon of Performance or Grasping at Straws?. In IEEE International Parallel and Distributed Processing Symposium."},{"key":"e_1_3_2_1_20_1","unstructured":"Junfeng Dong John Morgan and Li Tian. 2019. Accelerating Compute-Intensive Workloads with Intel AVX-512. 
https:\/\/devblogs.microsoft.com\/cppblog\/accelerating-compute-intensive-workloads-with-intel-avx-512\/"},{"key":"e_1_3_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1109\/SC.2018.00069"},{"key":"e_1_3_2_1_22_1","unstructured":"DFI Group. 2021. DFI Specification. http:\/\/www.ddr-phy.org\/page\/page\/show?id=2351641%3APage%3A301"},{"key":"e_1_3_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.1109\/ISPASS.2019.00011"},{"key":"e_1_3_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1109\/ISCA.2016.30"},{"key":"e_1_3_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICASSP.2019.8682336"},{"key":"e_1_3_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.1145\/3467017"},{"key":"e_1_3_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.1145\/2541940.2541981"},{"key":"e_1_3_2_1_28_1","doi-asserted-by":"crossref","unstructured":"Qijing Huang Minwoo Kang Grace Dinh Thomas Norell Aravind Kalaiah James Demmel John Wawrzynek and Yakun Sophia Shao. 2021. CoSA: Scheduling by Constrained Optimization for Spatial Accelerators. In ISCA.","DOI":"10.1109\/ISCA52012.2021.00050"},{"key":"e_1_3_2_1_29_1","unstructured":"Intel. 2022. Intel 64 and IA-32 Architectures Optimization Reference Manual Chapter 15. 
https:\/\/www.intel.com\/content\/www\/us\/en\/develop\/download\/intel-64-and-ia-32-architectures-optimization-reference-manual.html"},{"key":"e_1_3_2_1_30_1","unstructured":"Jeff Johnson. 2018. Rethinking floating point for deep learning. arXiv:1811.01721 [cs.NA]"},{"key":"e_1_3_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.1109\/ISCA52012.2021.00010"},{"key":"e_1_3_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.1145\/3360307"},{"key":"e_1_3_2_1_33_1","doi-asserted-by":"publisher","DOI":"10.1109\/ISCA52012.2021.00059"},{"key":"e_1_3_2_1_34_1","doi-asserted-by":"crossref","unstructured":"Yunsup Lee Andrew Waterman Henry Cook Brian Zimmer Ben Keller Alberto Puggelli Jaehwa Kwak Ruzica Jevtic Stevo Bailey Milovan Blagojevic et al. 2016. An agile approach to building RISC-V microprocessors. ieee Micro 36 2 (2016) 8--20.","DOI":"10.1109\/MM.2016.11"},{"key":"e_1_3_2_1_35_1","doi-asserted-by":"publisher","DOI":"10.1109\/tpds.2020.3030548"},{"key":"e_1_3_2_1_36_1","doi-asserted-by":"publisher","DOI":"10.1145\/1669112.1669172"},{"key":"e_1_3_2_1_37_1","volume-title":"LISA: Graph Neural Network Based Portable Mapping on Spatial Accelerators. In 2022 IEEE International Symposium on High Performance Computer Architecture (HPCA).","author":"Li Zhaoying","year":"2022","unstructured":"Zhaoying Li, Dan Wu, DM Dhananjaya Wijerathne, and Mitra Tulika. 2022. 
LISA: Graph Neural Network Based Portable Mapping on Spatial Accelerators. In 2022 IEEE International Symposium on High Performance Computer Architecture (HPCA)."},{"key":"e_1_3_2_1_38_1","doi-asserted-by":"publisher","DOI":"10.1109\/ISCA.2016.42"},{"key":"e_1_3_2_1_39_1","doi-asserted-by":"publisher","DOI":"10.1109\/TNNLS.2021.3071762"},{"key":"e_1_3_2_1_40_1","doi-asserted-by":"publisher","DOI":"10.1145\/3243176.3243212"},{"key":"e_1_3_2_1_41_1","volume-title":"Stream-Dataflow Acceleration. In Proceedings of the 44th International Symposium on Computer Architecture.","author":"Nowatzki Tony","year":"2017","unstructured":"Tony Nowatzki, Vinay Gangadhar, Newsha Ardalani, and Karthikeyan Sankaralingam. 2017. Stream-Dataflow Acceleration. In Proceedings of the 44th International Symposium on Computer Architecture."},{"key":"e_1_3_2_1_42_1","doi-asserted-by":"publisher","DOI":"10.1145\/2749469.2750380"},{"key":"e_1_3_2_1_43_1","volume-title":"Proceedings of 34th International Conference on Programming Language Design and Implementation. Distinguished Paper Award. SIGPLAN Research Highlights Nominee.","author":"Nowatzki Tony","year":"2013","unstructured":"Tony Nowatzki, Michael Sartin-Tarm, Lorenzo De Carli, Karthikeyan Sankaralingam, Cristian Estan, and Behnam Robatmili. 2013. A General Constraintcentric Scheduling Framework for Spatial Architectures. 
In Proceedings of 34th International Conference on Programming Language Design and Implementation. Distinguished Paper Award. SIGPLAN Research Highlights Nominee."},{"key":"e_1_3_2_1_44_1","unstructured":"NVIDIA. 2021. CUTLASS 2.8. https:\/\/github.com\/NVIDIA\/cutlass"},{"key":"e_1_3_2_1_45_1","unstructured":"Vijay Pradeep. 2017. Ethereum Memory Hardness Explained. https:\/\/www.vijaypradeep.com\/blog\/2017-04-28-ethereums-memory-hardness-explained"},{"key":"e_1_3_2_1_46_1","doi-asserted-by":"publisher","DOI":"10.1145\/2735841"},{"key":"e_1_3_2_1_47_1","doi-asserted-by":"publisher","DOI":"10.1145\/2491956.2462176"},{"key":"e_1_3_2_1_48_1","doi-asserted-by":"publisher","DOI":"10.1109\/hpec43674.2020.9286149"},{"key":"e_1_3_2_1_49_1","unstructured":"Eric Ries. 2011. The Lean Startup. Currency."},{"key":"e_1_3_2_1_50_1","unstructured":"Karthikeyan Sankaralingam Vinay Gangadhar Anthony Nowatzki and Yunfeng Li. 2021. Method computer program product and apparatus for acceleration of simultaneous access to shared data. https:\/\/patents.google.com\/patent\/US10963384B2\/en?oq=US10963384B2"},{"key":"e_1_3_2_1_51_1","unstructured":"Karthikeyan Sankaralingam Yunfeng Li Vinay Gangadhar and Anthony Nowatzki. 2020. 
Accelerating parallel processing of data in a recurrent neural network. https:\/\/patents.google.com\/patent\/US20200218965A1\/en?oq=US2020218965A1"},{"key":"e_1_3_2_1_52_1","unstructured":"Karthikeyan Sankaralingam Anthony Nowatzki Vinay Gangadhar Preyas Shah and Newsha Ardalani. 2021. Systems and methods for stream-dataflow acceleration wherein a delay is implemented so as to equalize arrival times of data packets at a destination functional unit. https:\/\/patents.google.com\/patent\/US11048661B2\/en?oq=US11048661B2"},{"key":"e_1_3_2_1_53_1","volume-title":"Measuring the Effects of Data Parallelism on Neural Network Training. Journal of Machine Learning Research (11","author":"Shallue Christopher","year":"2018","unstructured":"Christopher Shallue, Jaehoon Lee, Joe Antognini, Jascha Sohl-Dickstein, Roy Frostig, and George Dahl. 2018. Measuring the Effects of Data Parallelism on Neural Network Training. Journal of Machine Learning Research (11 2018)."},{"key":"e_1_3_2_1_54_1","unstructured":"TESLA. 2021. Tesla Dojo Technology --- A Guide to Tesla's Configurable Floating Point Formats Arithmetic. 
https:\/\/tesla-cdn.thron.com\/delivery\/public\/document\/tesla\/bc895d60-8220-4323-a5ba-e21452d786c0\/bvlatuR\/WEB\/tesla-dojo-technology"},{"key":"e_1_3_2_1_55_1","volume-title":"\u0141 ukasz Kaiser, and Illia Polosukhin","author":"Vaswani Ashish","year":"2017","unstructured":"Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N Gomez, \u0141ukasz Kaiser, and Illia Polosukhin. 2017. Attention is All you Need. In Advances in Neural Information Processing Systems, I. Guyon, U. Von Luxburg, S. Bengio, H. Wallach, R. Fergus, S. Vishwanathan, and R. Garnett (Eds.), Vol. 30. Curran Associates, Inc. https:\/\/proceedings.neurips.cc\/paper\/2017\/file\/3f5ee243547dee91fbd053c1c4a845aa-Paper.pdf"},{"key":"e_1_3_2_1_56_1","doi-asserted-by":"publisher","DOI":"10.1109\/LCA.2018.2849064"},{"key":"e_1_3_2_1_57_1","doi-asserted-by":"publisher","DOI":"10.1145\/3307650.3322229"},{"key":"e_1_3_2_1_58_1","volume-title":"Near-Stream Computing: General and Transparent Near-Cache Acceleration. HPCA. https:\/\/seanzw.github.io\/pub\/hpca2022-near-stream-computing.pdf","author":"Wang Zhengrong","year":"2022","unstructured":"Zhengrong Wang, Jian Weng, Sihao Liu, and Tony Nowatzki. 2022. Near-Stream Computing: General and Transparent Near-Cache Acceleration. HPCA. 
https:\/\/seanzw.github.io\/pub\/hpca2022-near-stream-computing.pdf (2022)."},{"key":"e_1_3_2_1_59_1","doi-asserted-by":"publisher","DOI":"10.1109\/HPCA51647.2021.00060"},{"key":"e_1_3_2_1_60_1","doi-asserted-by":"publisher","DOI":"10.1109\/HPCA47549.2020.00063"},{"key":"e_1_3_2_1_61_1","volume-title":"SambaNova Takes On Nvidia's DGX. (Feb","author":"Wheeler Bob","year":"2021","unstructured":"Bob Wheeler. 2021. SambaNova Takes On Nvidia's DGX. (Feb 2021)."}],"event":{"name":"ISCA '22: The 49th Annual International Symposium on Computer Architecture","location":"New York New York","acronym":"ISCA '22","sponsor":["SIGARCH ACM Special Interest Group on Computer Architecture","IEEE CS TCAA IEEE CS technical committee on architectural acoustics"]},"container-title":["Proceedings of the 49th Annual International Symposium on Computer Architecture"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3470496.3533040","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3470496.3533040","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T20:18:54Z","timestamp":1750191534000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3470496.3533040"}},"subtitle":["industrial product"],"short-title":[],"issued":{"date-parts":[[2022,6,11]]},"references-count":60,"alternative-id":["10.1145\/3470496.3533040","10.1145\/3470496"],"URL":"https:\/\/doi.org\/10.1145\/3470496.3533040","relation":{},"subject":[],"published":{"date-parts":[[2022,6,11]]},"assertion":[{"value":"2022-06-11","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}