{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,2]],"date-time":"2026-05-02T15:14:43Z","timestamp":1777734883760,"version":"3.51.4"},"publisher-location":"New York, NY, USA","reference-count":29,"publisher":"ACM","license":[{"start":{"date-parts":[[2018,2,15]],"date-time":"2018-02-15T00:00:00Z","timestamp":1518652800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"name":"Australian Research Councils Linkage Projects","award":["LP130101034"],"award-info":[{"award-number":["LP130101034"]}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2018,2,15]]},"DOI":"10.1145\/3174243.3174258","type":"proceedings-article","created":{"date-parts":[[2018,2,23]],"date-time":"2018-02-23T16:12:59Z","timestamp":1519402379000},"page":"107-116","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":60,"title":["A Customizable Matrix Multiplication Framework for the Intel HARPv2 Xeon+FPGA Platform"],"prefix":"10.1145","author":[{"given":"Duncan J.M","family":"Moss","sequence":"first","affiliation":[{"name":"University of Sydney, Sydney, Australia"}]},{"given":"Srivatsan","family":"Krishnan","sequence":"additional","affiliation":[{"name":"Intel Corporation, Hillsboro, OR, USA"}]},{"given":"Eriko","family":"Nurvitadhi","sequence":"additional","affiliation":[{"name":"Intel Corporation, Hillsboro, OR, USA"}]},{"given":"Piotr","family":"Ratuszniak","sequence":"additional","affiliation":[{"name":"Intel Corporation & Koszalin University of Technology, Gdansk, Poland"}]},{"given":"Chris","family":"Johnson","sequence":"additional","affiliation":[{"name":"Intel Corporation, Hillsboro, OR, USA"}]},{"given":"Jaewoong","family":"Sim","sequence":"additional","affiliation":[{"name":"Intel Corporation, 
Hillsboro, OR, USA"}]},{"given":"Asit","family":"Mishra","sequence":"additional","affiliation":[{"name":"Intel Corporation, Hillsboro, OR, USA"}]},{"given":"Debbie","family":"Marr","sequence":"additional","affiliation":[{"name":"Intel Corporation, Hillsboro, OR, USA"}]},{"given":"Suchit","family":"Subhaschandra","sequence":"additional","affiliation":[{"name":"Intel Corporation, Hillsboro, OR, USA"}]},{"given":"Philip H.W.","family":"Leong","sequence":"additional","affiliation":[{"name":"University of Sydney, Sydney, Australia"}]}],"member":"320","published-online":{"date-parts":[[2018,2,15]]},"reference":[{"key":"e_1_3_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.1145\/567806.567807"},{"key":"e_1_3_2_1_2_1","unstructured":"Firas Abuzaid Stefan Hadjis Ce Zhang and Christopher R\u00e9. 2015. Caffe con Troll: Shallow Ideas to Speed Up Deep Learning. CoRR abs\/1504.04343 (2015). http:\/\/arxiv.org\/abs\/1504.04343"},{"key":"e_1_3_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1145\/3020078.3021738"},{"key":"e_1_3_2_1_4_1","unstructured":"Sharan Chetlur Cliff Woolley Philippe Vandermersch Jonathan Cohen John Tran Bryan Catanzaro and Evan Shelhamer. 2014. cuDNN: Efficient Primitives for Deep Learning. CoRR abs\/1410.0759 (2014). http:\/\/arxiv.org\/abs\/1410.0759"},{"key":"e_1_3_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1145\/2897937.2897972"},{"key":"e_1_3_2_1_6_1","unstructured":"Taiwan Semiconductor Manufacturing Company. 2013. TSMC 16\/12nm Technology. (2013). http:\/\/www.tsmc.com\/english\/dedicatedFoundry\/technology\/16nm. 
htm"},{"key":"e_1_3_2_1_8_1","unstructured":"Matthieu Courbariaux Yoshua Bengio and Jean-Pierre David. 2015. BinaryConnect: Training Deep Neural Networks with Binary Weights During Propagations. In NIPS."},{"key":"e_1_3_2_1_9_1","unstructured":"Matthieu Courbariaux Itay Hubara Daniel Soudry Ran El-Yaniv and Yoshua Bengio. 2016. Binarized Neural Networks: Training Deep Neural Networks with Weights and Activations Constrained to +1 or -1. arXiv:1602.02830 (2016)."},{"key":"e_1_3_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1145\/3029580.3029586"},{"key":"e_1_3_2_1_11_1","doi-asserted-by":"crossref","unstructured":"Yijin Guan Hao Liang Ningyi Xu Wenqiang Wang Shaoshuai Shi Xi Chen Guangyu Sun Wei Zhang and Jason Cong. 2017. FP-DNN: An Automated Framework for Mapping Deep Neural Networks onto FPGAs with RTL-HLS Hybrid Templates. In FCCM.","DOI":"10.1109\/FCCM.2017.25"},{"key":"e_1_3_2_1_12_1","unstructured":"PK Gupta. 2016. Accelerating Datacenter Workloads. In FPL."},{"key":"e_1_3_2_1_13_1","unstructured":"
Kaiming He Xiangyu Zhang Shaoqing Ren and Jian Sun. 2016. Deep Residual Learning for Image Recognition. In CVPR."},{"key":"e_1_3_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.1145\/3079856.3080246"},{"key":"e_1_3_2_1_15_1","unstructured":"Alex Krizhevsky Ilya Sutskever and Geoffrey E. Hinton. 2012. ImageNet Classification with Deep Convolutional Neural Networks. In NIPS."},{"key":"e_1_3_2_1_16_1","unstructured":"Andrew Lavin. 2015. Fast Algorithms for Convolutional Neural Networks. CoRR abs\/1509.09308 (2015). http:\/\/arxiv.org\/abs\/1509.09308"},{"key":"e_1_3_2_1_17_1","volume-title":"WRPN: Training and Inference using Wide Reduced-Precision Networks. CoRR abs\/1709.01134","author":"Mishra Asit K.","year":"2017"},{"key":"e_1_3_2_1_18_1","doi-asserted-by":"crossref","unstructured":"Duncan Moss Eriko Nurvitadhi Jaewoong Sim Asit Mishra Suchit Subhaschandra and Debbie Marr. 2017. High Performance Binary Neural Networks on the Xeon+FPGA Platform. In FPL.","DOI":"10.23919\/FPL.2017.8056823"},{"key":"e_1_3_2_1_19_1","unstructured":"Sharan Narang. 2016. DeepBench. (2016). https:\/\/svail.github.io\/DeepBench\/"},{"key":"e_1_3_2_1_20_1","doi-asserted-by":"crossref","unstructured":"
Eriko Nurvitadhi David Sheffield Jaewoong Sim Asit Mishra Ganesh Venkatesh and Debbie Marr. 2016. Accelerating Binarized Neural Networks: Comparison of FPGA CPU GPU and ASIC. In FPT.","DOI":"10.1109\/FPT.2016.7929192"},{"key":"e_1_3_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1145\/3020078.3021740"},{"key":"e_1_3_2_1_22_1","doi-asserted-by":"crossref","unstructured":"Mohammad Rastegari Vicente Ordonez Joseph Redmon and Ali Farhadi. 2016. XNOR-Net: Imagenet Classification Using Binary Convolutional Neural Networks. In ECCV.","DOI":"10.1007\/978-3-319-46493-0_32"},{"key":"e_1_3_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.1007\/s11263-015-0816-y"},{"key":"e_1_3_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1145\/3035918.3035954"},{"key":"e_1_3_2_1_25_1","unstructured":"Karen Simonyan and Andrew Zisserman. 2014. Very Deep Convolutional Networks for Large-scale Image Recognition. arXiv:1409.1556 (2014)."},{"key":"e_1_3_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.1145\/3020078.3021744"},{"key":"e_1_3_2_1_27_1","unstructured":"Xuechao Wei Yun Liang Tao Wang Songwu Lu and Jason Cong. 2017. Throughput optimization for streaming applications on CPU-FPGA heterogeneous systems. In ASP-DAC. 
IEEE."},{"key":"e_1_3_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.1145\/2847263.2847269"},{"key":"e_1_3_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.1145\/3020078.3021727"},{"key":"e_1_3_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.1145\/3020078.3021741"}],"event":{"name":"FPGA '18: The 2018 ACM\/SIGDA International Symposium on Field-Programmable Gate Arrays","location":"Monterey CALIFORNIA USA","acronym":"FPGA '18","sponsor":["SIGDA ACM Special Interest Group on Design Automation"]},"container-title":["Proceedings of the 2018 ACM\/SIGDA International Symposium on Field-Programmable Gate Arrays"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3174243.3174258","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3174243.3174258","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T01:08:55Z","timestamp":1750208935000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3174243.3174258"}},"subtitle":["A Deep Learning Case Study"],"short-title":[],"issued":{"date-parts":[[2018,2,15]]},"references-count":29,"alternative-id":["10.1145\/3174243.3174258","10.1145\/3174243"],"URL":"https:\/\/doi.org\/10.1145\/3174243.3174258","relation":{},"subject":[],"published":{"date-parts":[[2018,2,15]]},"assertion":[{"value":"2018-02-15","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}