{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,6,12]],"date-time":"2026-06-12T17:51:34Z","timestamp":1781286694279,"version":"3.54.1"},"reference-count":75,"publisher":"Association for Computing Machinery (ACM)","issue":"3","license":[{"start":{"date-parts":[[2018,9,30]],"date-time":"2018-09-30T00:00:00Z","timestamp":1538265600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"DOI":"10.13039\/100000001","name":"National Science Foundation","doi-asserted-by":"publisher","award":["1717213"],"award-info":[{"award-number":["1717213"]}],"id":[{"id":"10.13039\/100000001","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Reconfigurable Technol. Syst."],"published-print":{"date-parts":[[2018,9,30]]},"abstract":"<jats:p>Convolutional Neural Networks have rapidly become the most successful machine-learning algorithm, enabling ubiquitous machine vision and intelligent decisions on even embedded computing systems. While the underlying arithmetic is structurally simple, compute and memory requirements are challenging. One of the promising opportunities is leveraging reduced-precision representations for inputs, activations, and model parameters. The resulting scalability in performance, power efficiency, and storage footprint provides interesting design compromises in exchange for a small reduction in accuracy. FPGAs are ideal for exploiting low-precision inference engines leveraging custom precisions to achieve the required numerical accuracy for a given application. In this article, we describe the second generation of the FINN framework, an end-to-end tool that enables design-space exploration and automates the creation of fully customized inference engines on FPGAs. 
Given a neural network description, the tool optimizes for given platforms, design targets, and a specific precision. We introduce formalizations of resource cost functions and performance predictions and elaborate on the optimization algorithms. Finally, we evaluate a selection of reduced precision neural networks ranging from CIFAR-10 classifiers to YOLO-based object detection on a range of platforms including PYNQ and AWS\u00a0F1, demonstrating new unprecedented measured throughput at 50\u00a0TOp\/s on AWS\u00a0F1 and 5\u00a0TOp\/s on embedded devices.<\/jats:p>","DOI":"10.1145\/3242897","type":"journal-article","created":{"date-parts":[[2018,12,17]],"date-time":"2018-12-17T13:17:16Z","timestamp":1545052636000},"page":"1-23","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":385,"title":["FINN-\n            <i>R<\/i>"],"prefix":"10.1145","volume":"11","author":[{"given":"Michaela","family":"Blott","sequence":"first","affiliation":[{"name":"Xilinx Research Labs, Dublin, Ireland"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Thomas B.","family":"Preu\u00dfer","sequence":"additional","affiliation":[{"name":"Xilinx Research Labs, Dublin, Ireland"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Nicholas J.","family":"Fraser","sequence":"additional","affiliation":[{"name":"Xilinx Research Labs, Dublin, Ireland"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Giulio","family":"Gambardella","sequence":"additional","affiliation":[{"name":"Xilinx Research Labs, Dublin, Ireland"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Kenneth","family":"O\u2019brien","sequence":"additional","affiliation":[{"name":"Xilinx Research Labs, Ireland"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Yaman","family":"Umuroglu","sequence":"additional","affiliation":[{"name":"Xilinx Research Labs, 
Ireland"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-5624-056X","authenticated-orcid":false,"given":"Miriam","family":"Leeser","sequence":"additional","affiliation":[{"name":"Northeastern University, U.S."}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Kees","family":"Vissers","sequence":"additional","affiliation":[{"name":"Xilinx Research, U.S."}],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"320","published-online":{"date-parts":[[2018,12,15]]},"reference":[{"key":"e_1_2_1_1_1","unstructured":"ImageNet Large Scale Visual Recognition Challenge (ILSVRC). 2017. Retrieved from http:\/\/image-net.org\/challenges\/talks_2017\/ILSVRC2017_overview.pdf."},{"key":"e_1_2_1_2_1","unstructured":"M. Abadi, A. Agarwal, P. Barham, E. Brevdo, Z. Chen, C. Citro, G. S. Corrado, A. Davis, J. Dean, M. Devin, et al. 2016. TensorFlow: Large-scale machine learning on heterogeneous distributed systems. CoRR abs\/1603.04467."},{"key":"e_1_2_1_3_1","doi-asserted-by":"crossref","unstructured":"K. Abdelouahab, M. Pelcat, J. S\u00e9rot, C. Bourrasset, and F. Berry. 2017. Tactics to directly map CNN graphs on embedded FPGAs. IEEE Embed. Syst. Lett. (2017).","DOI":"10.1109\/LES.2017.2743247"},{"key":"e_1_2_1_4_1","doi-asserted-by":"crossref","unstructured":"H. Alemdar, N. Caldwell, V. Leroy, A. Prost-Boucle, and F. P\u00e9trot. 2016. Ternary neural networks for resource-efficient AI applications. 
CoRR abs\/1609.00222.","DOI":"10.1109\/IJCNN.2017.7966166"},{"key":"e_1_2_1_5_1","volume-title":"Proceedings of the ISVLSI. IEEE, 236--241","author":"Andri R.","unstructured":"R. Andri, L. Cavigelli, D. Rossi, and L. Benini. 2016. YodaNN: An ultra-low power convolutional neural network accelerator based on binary weights. In Proceedings of the ISVLSI. IEEE, 236--241."},{"key":"e_1_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1145\/3020078.3021738"},{"key":"e_1_2_1_7_1","doi-asserted-by":"crossref","unstructured":"C. Baskin, N. Liss, A. Mendelson, and E. Zheltonozhskii. 2017. Streaming architecture for large-scale quantized neural networks on an FPGA-based dataflow platform. arXiv preprint arXiv:1708.00052.","DOI":"10.1109\/IPDPSW.2018.00032"},{"key":"e_1_2_1_8_1","unstructured":"Doug Burger. 2017. Microsoft Unveils Project Brainwave for Real-Time AI. Retrieved from https:\/\/www.microsoft.com\/en-us\/research\/blog\/microsoft-unveils-project-brainwave\/."},{"key":"e_1_2_1_9_1","volume-title":"Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR\u201917)","author":"Cai Z.","unstructured":"Z. Cai, X. He, J. Sun, and N. Vasconcelos. 2017. Deep learning with low precision by half-wave gaussian quantization. 
In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR\u201917)."},{"key":"e_1_2_1_10_1","volume-title":"Proceedings of the 10th International Workshop on Frontiers in Handwriting Recognition. Suvisoft.","author":"Chellapilla K.","unstructured":"K. Chellapilla, S. Puri, and P. Simard. 2006. High performance convolutional neural networks for document processing. In Proceedings of the 10th International Workshop on Frontiers in Handwriting Recognition. Suvisoft."},{"key":"e_1_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1145\/2996864"},{"key":"e_1_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.1109\/ISCA.2016.40"},{"key":"e_1_2_1_13_1","volume-title":"Binarized neural networks: Training neural networks with weights and activations constrained to +1 or -1. CoRR abs\/1602.0","author":"Courbariaux Matthieu","year":"2016","unstructured":"Matthieu Courbariaux, Itay Hubara, Daniel Soudry, Ran El-Yaniv, and Yoshua Bengio. 2016. Binarized neural networks: Training neural networks with weights and activations constrained to +1 or -1. CoRR abs\/1602.0 (2016)."},{"key":"e_1_2_1_14_1","volume-title":"Proceedings of the 27th International Conference on Neural Information Processing Systems (NIPS'14)","volume":"1","author":"Denton E. L.","unstructured":"E. L. Denton, W. Zaremba, J. 
Bruna, Y. LeCun, and R. Fergus. 2014. Exploiting linear structure within convolutional networks for efficient evaluation. In Proceedings of the 27th International Conference on Neural Information Processing Systems (NIPS'14), Vol. 1. MIT Press, 1269--1277. http:\/\/dl.acm.org\/citation.cfm?id=2968826.2968968."},{"key":"e_1_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1073\/pnas.1604850113"},{"key":"e_1_2_1_16_1","unstructured":"Benoit Jacob et al. 2017. gemmlowp: A Small Self-Contained Low-Precision GEMM Library. Retrieved from https:\/\/github.com\/google\/gemmlowp."},{"key":"e_1_2_1_17_1","volume-title":"Proceedings of the IEEE FPL. IEEE, 32--37","author":"Farabet C.","unstructured":"C. Farabet, C. Poulet, J. Y. Han, and Y. LeCun. 2009. CNP: An FPGA-based processor for convolutional networks. In Proceedings of the IEEE FPL. IEEE, 32--37."},{"key":"e_1_2_1_18_1","volume-title":"Proceedings of the ICONIP. Springer, 393--404","author":"Faraone J.","unstructured":"J. Faraone, N. Fraser, G. Gambardella, P. H. W. Blott, and M. Leong. 2017. 
Compressing low precision deep neural networks using sparsity-induced regularization in ternary networks. In Proceedings of the ICONIP. Springer, 393--404."},{"key":"e_1_2_1_19_1","volume-title":"Leong","author":"Faraone Julian","year":"2018","unstructured":"Julian Faraone, Giulio Gambardella, David Boland, Nicholas J. Fraser, Michaela Blott, and Philip H. W. Leong. 2018. Customizing low-precision deep neural networks For FPGAs."},{"key":"e_1_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.1145\/3029580.3029586"},{"key":"e_1_2_1_21_1","unstructured":"S. Han, H. Mao, and W. J. Dally. 2015. Deep compression: Compressing deep neural network with pruning, trained quantization and huffman coding. CoRR abs\/1510.00149 (2015)."},{"key":"e_1_2_1_22_1","unstructured":"S. Han, J. Pool, J. Tran, and W. J. Dally. 2015. Learning both weights and connections for efficient neural networks. CoRR abs\/1506.02626 (2015)."},{"key":"e_1_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.1145\/2968455.2968511"},{"key":"e_1_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1109\/ISSCC.2014.6757323"},{"key":"#cr-split#-e_1_2_1_25_1.1","unstructured":"F. N. Iandola M. W. Moskewicz K. Ashraf S. Han W. J. Dally and K. Keutzer. 2016. SqueezeNet: AlexNet-level accuracy with 50\u00d7 fewer parameters and &lt"},{"key":"#cr-split#-e_1_2_1_25_1.2","unstructured":"1 MB model size. CoRR abs\/1602.07630 (2016). F. N. Iandola M. W. Moskewicz K. Ashraf S. Han W. J. Dally and K. Keutzer. 2016. 
SqueezeNet: AlexNet-level accuracy with 50\u00d7 fewer parameters and &lt"},{"key":"#cr-split#-e_1_2_1_25_1.3","unstructured":"1 MB model size. CoRR abs\/1602.07630 (2016)."},{"key":"e_1_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.23919\/FPL.2017.8056820"},{"key":"e_1_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.1145\/3079856.3080246"},{"key":"e_1_2_1_28_1","volume-title":"Proceedings of the MICRO. IEEE, 1--12","author":"Judd P.","unstructured":"P. Judd, J. Albericio, T. Hetherington, T. M. Aamodt, and A. Moshovos. 2016. Stripes: Bit-serial deep neural network computing. In Proceedings of the MICRO. IEEE, 1--12."},{"key":"e_1_2_1_29_1","volume-title":"Bitwise neural networks. CoRR abs\/1601.0","author":"Kim Minje","year":"2016","unstructured":"Minje Kim and Paris Smaragdis. 2016. Bitwise neural networks. CoRR abs\/1601.0 (2016)."},{"key":"e_1_2_1_30_1","volume-title":"Proceedings of the NIPS. 1097--1105","author":"Krizhevsky A.","unstructured":"A. Krizhevsky, I. Sutskever, and G. E. Hinton. 2012. ImageNet classification with deep convolutional neural networks. In Proceedings of the NIPS. 1097--1105."},{"key":"e_1_2_1_31_1","unstructured":"Xilinx Research Labs. 2017. BNN-PYNQ. Retrieved from https:\/\/github.com\/Xilinx\/BNN-PYNQ."},{"key":"e_1_2_1_32_1","unstructured":"Xilinx Research Labs. 2017. FINN-R. 
Retrieved from https:\/\/github.com\/XilinxDublinLabs\/FINN-R."},{"key":"e_1_2_1_33_1","unstructured":"Xilinx Research Labs. 2018. QNN-MO-PYNQ. Retrieved from https:\/\/github.com\/Xilinx\/QNN-MO-PYNQ."},{"key":"e_1_2_1_34_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.neucom.2017.09.046"},{"key":"e_1_2_1_35_1","unstructured":"ARM Limited. 2017. Compute Library. Retrieved from https:\/\/developer.arm.com\/technologies\/compute-library."},{"key":"e_1_2_1_36_1","volume-title":"Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR\u201915)","author":"Liu B.","unstructured":"B. Liu, M. Wang, H. Foroosh, M. F. Tappen, and M. Pensky. 2015. Sparse convolutional neural networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR\u201915). 806--814. Retrieved from"},{"key":"e_1_2_1_37_1","volume-title":"Proceedings of the FPL. IEEE, 1--8.","author":"Ma Y.","unstructured":"Y. Ma, Y. Cao, S. Vrudhula, and J. Seo. 2017. An automatic RTL compiler for high-throughput FPGA implementation of diverse deep convolutional neural networks. In Proceedings of the FPL. IEEE, 1--8."},{"key":"e_1_2_1_38_1","doi-asserted-by":"publisher","DOI":"10.1145\/3020078.3021736"},{"key":"e_1_2_1_39_1","volume-title":"WRPN: Wide reduced-precision networks. CoRR abs\/1709.01134.","author":"Mishra A. 
K.","year":"2017","unstructured":"A. K. Mishra, E. Nurvitadhi, J. J. Cook, and D. Marr. 2017. WRPN: Wide reduced-precision networks. CoRR abs\/1709.01134."},{"key":"e_1_2_1_40_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.neucom.2010.03.021"},{"key":"e_1_2_1_41_1","volume-title":"Proceedings of the FPL. IEEE.","author":"Moss D.","unstructured":"D. Moss, E. Nurvitadhi, J. Sim, A. Mishra, D. Marr, S. Subhaschandra, and P. Leong. 2017. High-performance binary neural networks on the Xeon+ FPGA platform. In Proceedings of the FPL. IEEE."},{"key":"e_1_2_1_42_1","volume-title":"Proceedings of the FPL. IEEE, 1--4.","author":"Nakahara H.","unstructured":"H. Nakahara, T. Fujii, and S. Sato. 2017. A fully connected layer elimination for a binarized convolutional neural network on an FPGA. In Proceedings of the FPL. IEEE, 1--4."},{"key":"e_1_2_1_43_1","volume-title":"Proceedings of the FPL. IEEE, 1--1.","author":"Nakahara H.","unstructured":"H. Nakahara, H. Yonekawa, T. Fujii, M. Shimoda, and S. Sato. 2017. A demonstration of the GUINNESS: A GUI-based neural network synthesizer for an FPGA. In Proceedings of the FPL. IEEE, 1--1."},{"key":"e_1_2_1_44_1","volume-title":"Proceedings of the FPT. 77--84","author":"Nurvitadhi E.","unstructured":"E. Nurvitadhi, D. 
Sheffield, Jaewoong Sim, A. Mishra, G. Venkatesh, and D. Marr. 2016. Accelerating binarized neural networks: Comparison of FPGA, CPU, GPU, and ASIC. In Proceedings of the FPT. 77--84."},{"key":"e_1_2_1_45_1","doi-asserted-by":"publisher","DOI":"10.1145\/3020078.3021740"},{"key":"e_1_2_1_46_1","unstructured":"K. Ovtcharov, O. Ruwase, J. Kim, J. Fowers, K. Strauss, and E. Chung. 2015. Accelerating deep convolutional neural networks using specialized hardware. https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2016\/02\/CNN20Whitepaper.pdf."},{"key":"e_1_2_1_47_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICASSP.2016.7471828"},{"key":"e_1_2_1_48_1","doi-asserted-by":"publisher","DOI":"10.23919\/FPL.2017.8056834"},{"key":"e_1_2_1_49_1","volume-title":"Proceedings of the FPL. IEEE.","author":"Prost-Boucle A.","unstructured":"A. Prost-Boucle, A. Bourge, F. P\u00e9trot, H. Alemdar, N. Caldwell, and V. Leroy. 2017. Scalable high-performance architecture for convolutional ternary neural networks on FPGA. In Proceedings of the FPL. IEEE."},{"key":"e_1_2_1_50_1","doi-asserted-by":"crossref","unstructured":"M. Rastegari, V. Ordonez, J. 
Redmon and A. Farhadi. 2016. XNOR-Net: ImageNet classification using binary convolutional neural networks. CoRR abs\/1603.05279 (2016).","DOI":"10.1007\/978-3-319-46493-0_32"},{"key":"e_1_2_1_51_1","doi-asserted-by":"publisher","DOI":"10.1109\/ISCA.2016.32"},{"key":"e_1_2_1_52_1","unstructured":"J. Redmon. 2013--2016. Darknet: Open Source Neural Networks in C. Retrieved from http:\/\/pjreddie.com\/darknet\/."},{"key":"e_1_2_1_53_1","volume-title":"Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'17)","author":"Redmon J.","unstructured":"J. Redmon and A. Farhadi. 2017. YOLO9000: Better, faster, stronger. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'17). 6517--6525."},{"key":"e_1_2_1_54_1","volume-title":"Proceedings of the Workshop on Cognitive Architectures.","author":"Sharma H.","unstructured":"H. Sharma, J. Park, E. Amaro, B. Thwaites, P. Kotha, A. Gupta, J. K. Kim, A. Mishra, and H. Esmaeilzadeh. 2016. DnnWeaver: From high-level deep network models to FPGA acceleration. In Proceedings of the Workshop on Cognitive Architectures."},{"key":"e_1_2_1_55_1","unstructured":"K. Simonyan and A. Zisserman. 2014. Very deep convolutional networks for large-scale image recognition. 
CoRR abs\/1409.1556."},{"key":"e_1_2_1_56_1","volume-title":"Cheung","author":"Su Jiang","year":"2018","unstructured":"Jiang Su, Nicholas J. Fraser, Giulio Gambardella, Michaela Blott, Gianluca Durelli, David B. Thomas, Philip H. W. Leong, and Peter Y. K. Cheung. 2018. Accuracy to throughput trade-offs for reduced precision neural networks on reconfigurable logic. In Proceedings of the ARC. ACM, to Appear."},{"key":"e_1_2_1_57_1","unstructured":"Wonyong Sung, Sungho Shin, and Kyuyeon Hwang. 2015. Resiliency of deep neural networks under quantization. abs\/1511.0."},{"key":"e_1_2_1_58_1","doi-asserted-by":"publisher","DOI":"10.1145\/3020078.3021744"},{"key":"e_1_2_1_59_1","unstructured":"Y. Umuroglu and M. Jahre. 2017. Streamlined deployment for quantized neural networks. arXiv preprint arXiv:1709.04060."},{"key":"e_1_2_1_60_1","volume-title":"Proceedings of the CCM. IEEE, 40--47","author":"Venieris S. I.","unstructured":"S. I. Venieris and C. Bouganis. 2016. fpgaConvNet: A framework for mapping convolutional neural networks on FPGAs. In Proceedings of the CCM. 
IEEE, 40--47."},{"key":"e_1_2_1_61_1","doi-asserted-by":"publisher","DOI":"10.1145\/3061639.3062207"},{"key":"e_1_2_1_62_1","doi-asserted-by":"publisher","DOI":"10.1145\/1498765.1498785"},{"key":"e_1_2_1_63_1","unstructured":"Xilinx Inc. 2017. Zynq-7000 All Programmable SoC Data Sheet: Overview. Xilinx Inc."},{"key":"e_1_2_1_64_1","volume-title":"Proceedings of the IPDPSW. IEEE, 98--105","author":"Yonekawa H.","unstructured":"H. Yonekawa and H. Nakahara. 2017. On-chip memory-based binarized convolutional deep neural network applying batch normalization free technique on an FPGA. In Proceedings of the IPDPSW. IEEE, 98--105."},{"key":"e_1_2_1_65_1","doi-asserted-by":"publisher","DOI":"10.1145\/3079856.3080215"},{"key":"e_1_2_1_66_1","doi-asserted-by":"crossref","unstructured":"S. Zagoruyko and N. Komodakis. 2016. Wide residual networks. arXiv preprint arXiv:1605.07146.","DOI":"10.5244\/C.30.87"},{"key":"e_1_2_1_67_1","doi-asserted-by":"publisher","DOI":"10.1145\/2966986.2967011"},{"key":"e_1_2_1_68_1","doi-asserted-by":"publisher","DOI":"10.1145\/2684746.2689060"},{"key":"e_1_2_1_69_1","doi-asserted-by":"publisher","DOI":"10.1145\/3020078.3021698"},{"key":"e_1_2_1_70_1","doi-asserted-by":"publisher","DOI":"10.1145\/3020078.3021741"},{"key":"e_1_2_1_71_1","unstructured":"A. Zhou, A. Yao, Y. Guo, L. Xu, and Y. Chen. 2017. 
Incremental network quantization: Towards lossless CNNs with low-precision weights. CoRR abs\/1702.03044. Retrieved from http:\/\/arxiv.org\/abs\/1702.03044."},{"key":"e_1_2_1_72_1","unstructured":"S. Zhou, Z. Ni, X. Zhou, H. Wen, Y. Wu, and Y. Zou. 2016. DoReFa-Net: Training low bitwidth convolutional neural networks with low bitwidth gradients. CoRR abs\/1606.06160."},{"key":"e_1_2_1_73_1","unstructured":"C. Zhu, S. Han, H. Mao, and W. J. Dally. 2016. Trained ternary quantization. CoRR abs\/1612.01064."}],"container-title":["ACM Transactions on Reconfigurable Technology and Systems"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3242897","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3242897","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3242897","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T01:39:24Z","timestamp":1750210764000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3242897"}},"subtitle":["An End-to-End Deep-Learning Framework for Fast Exploration of Quantized Neural 
Networks"],"short-title":[],"issued":{"date-parts":[[2018,9,30]]},"references-count":75,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2018,9,30]]}},"alternative-id":["10.1145\/3242897"],"URL":"https:\/\/doi.org\/10.1145\/3242897","relation":{},"ISSN":["1936-7406","1936-7414"],"issn-type":[{"value":"1936-7406","type":"print"},{"value":"1936-7414","type":"electronic"}],"subject":[],"published":{"date-parts":[[2018,9,30]]},"assertion":[{"value":"2017-12-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2018-07-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2018-12-15","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}