{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,10]],"date-time":"2026-01-10T00:39:49Z","timestamp":1768005589738,"version":"3.49.0"},"reference-count":38,"publisher":"Association for Computing Machinery (ACM)","issue":"6","license":[{"start":{"date-parts":[[2022,11,30]],"date-time":"2022-11-30T00:00:00Z","timestamp":1669766400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"DOI":"10.13039\/501100001602","name":"Science Foundation Ireland","doi-asserted-by":"crossref","award":["13\/RC\/2094_P2"],"award-info":[{"award-number":["13\/RC\/2094_P2"]}],"id":[{"id":"10.13039\/501100001602","id-type":"DOI","asserted-by":"crossref"}]},{"name":"European Union\u2019s Horizon 2020","award":["754489"],"award-info":[{"award-number":["754489"]}]},{"name":"Science Foundation Ireland and European Union\u2019s Horizon 2020 programme"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Embed. Comput. Syst."],"published-print":{"date-parts":[[2022,11,30]]},"abstract":"<jats:p>Convolutional neural networks (CNNs) have dramatically improved the accuracy of image, video, and audio processing for tasks such as object recognition, image segmentation, and interactive speech systems. CNNs require large amounts of computing resources for both training and inference, primarily because the convolution layers are computationally intensive. Fast convolution algorithms such as Winograd convolution can greatly reduce the computational cost of these layers. However, Winograd convolution has poor numeric properties, such that greater savings in computation cause exponentially increasing floating point errors.<\/jats:p>\n          <jats:p>\n            A defining feature of each Winograd convolution algorithm is a set of real-value points where polynomials are sampled. 
The choice of points impacts the numeric accuracy of the algorithm, but the optimal set of points for small convolutions remains unknown. Existing work considers only small integers and simple fractions as candidate points. In this work, we propose a novel approach to point selection using points of the form\n            <jats:inline-formula content-type=\"math\/tex\">\n              <jats:tex-math notation=\"LaTeX\" version=\"MathJax\">\\(\\lbrace -\\frac{1}{c},-c,c,\\frac{1}{c}\\rbrace\\)<\/jats:tex-math>\n            <\/jats:inline-formula>\n            using the full range of real-valued numbers for\n            <jats:italic>c<\/jats:italic>\n            . We show that groups of this form cause cancellations in the Winograd transform matrices that reduce numeric error. We find empirically that the error for different values of\n            <jats:italic>c<\/jats:italic>\n            forms a rough curve across the range of real-valued numbers. It is therefore possible to localize the values of\n            <jats:italic>c<\/jats:italic>\n            that lead to lower error. We show that it is not necessary to choose integers or simple fractions as evaluation points, and that lower errors can be achieved with non-obvious real-valued points. We study a range of sizes for small convolutions and achieve reduction in error ranging from 2% to around 59% for both 1D and 2D convolution, when compared to the state of the art. Furthermore, we identify patterns in cases when we select a subset of our proposed points that will always lead to a lower error. 
Finally, we implement a complete Winograd convolution layer and use it to run state-of-the-art deep convolution neural networks on real datasets and show that our proposed points achieve reduction in error, ranging from 22% to 63%, while also showing how an increased Winograd output size can result in execution speed-up for some cases.\n          <\/jats:p>","DOI":"10.1145\/3524069","type":"journal-article","created":{"date-parts":[[2022,3,21]],"date-time":"2022-03-21T12:37:55Z","timestamp":1647866275000},"page":"1-28","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":21,"title":["Winograd Convolution for Deep Neural Networks: Efficient Point Selection"],"prefix":"10.1145","volume":"21","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-1509-9678","authenticated-orcid":false,"given":"Syed Asad","family":"Alam","sequence":"first","affiliation":[{"name":"Lero, Trinity College Dublin, the University of Dublin, Dublin, Ireland"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-4357-4739","authenticated-orcid":false,"given":"Andrew","family":"Anderson","sequence":"additional","affiliation":[{"name":"Lero, Trinity College Dublin, the University of Dublin, Dublin, Ireland"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-7303-2511","authenticated-orcid":false,"given":"Barbara","family":"Barabasz","sequence":"additional","affiliation":[{"name":"Lero, Trinity College Dublin, the University of Dublin, Dublin, Ireland"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-3782-4612","authenticated-orcid":false,"given":"David","family":"Gregg","sequence":"additional","affiliation":[{"name":"Lero, Trinity College Dublin, the University of Dublin, Dublin, Ireland"}]}],"member":"320","published-online":{"date-parts":[[2022,12,12]]},"reference":[{"key":"e_1_3_1_2_2","doi-asserted-by":"publisher","DOI":"10.1145\/3412380"},{"key":"e_1_3_1_3_2","first-page":"307","volume-title":"Proc. Int. Conf. 
Italian Association for Artificial Intelligence","author":"Barabasz Barbara","year":"2019","unstructured":"Barbara Barabasz and David Gregg. 2019. Winograd convolution for DNNs: Beyond linear polynomials. In Proc. Int. Conf. Italian Association for Artificial Intelligence. Springer International Publishing, 307\u2013320."},{"key":"e_1_3_1_4_2","doi-asserted-by":"publisher","DOI":"10.1017\/CBO9780511760921"},{"key":"e_1_3_1_5_2","doi-asserted-by":"crossref","first-page":"116","DOI":"10.1007\/978-3-540-73074-3_10","volume-title":"Arithmetic of Finite Fields","author":"Bodrato Marco","year":"2007","unstructured":"Marco Bodrato. 2007. Towards optimal Toom-Cook multiplication for univariate and multivariate polynomials in characteristic 2 and 0. In Arithmetic of Finite Fields, Claude Carlet and Berk Sunar (Eds.). Springer, Berlin, 116\u2013133."},{"key":"e_1_3_1_6_2","doi-asserted-by":"crossref","unstructured":"J. Chen and X. Ran. 2019. Deep learning with edge computing: A review. 107 8 (2019) 1655\u20131674.","DOI":"10.1109\/JPROC.2019.2921977"},{"key":"e_1_3_1_7_2","volume-title":"On the Minimum Computation Time of Functions","author":"Cook S. A.","year":"1966","unstructured":"S. A. Cook. 1966. On the Minimum Computation Time of Functions. Ph.D. Dissertation. Cambridge, MA."},{"key":"e_1_3_1_8_2","unstructured":"Javier Fernandez-Marques Paul N. Whatmough Andrew Mundy and Matthew Mattina. 2020. Searching for Winograd-aware Quantized Networks. arXiv:2002.10711."},{"key":"e_1_3_1_9_2","doi-asserted-by":"publisher","DOI":"10.1007\/BF01438260"},{"key":"e_1_3_1_10_2","first-page":"193","article-title":"How (un)stable are Vandermonde systems","volume":"124","author":"Gautschi Walter","year":"1990","unstructured":"Walter Gautschi. 1990. How (un)stable are Vandermonde systems. 
Asymptotic and Computational Analysis 124 (1990), 193\u2013210.","journal-title":"Asymptotic and Computational Analysis"},{"key":"e_1_3_1_11_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2016.90"},{"key":"e_1_3_1_12_2","doi-asserted-by":"publisher","DOI":"10.5555\/579525"},{"key":"e_1_3_1_13_2","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v34i04.5838"},{"key":"e_1_3_1_14_2","article-title":"SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <1MB model size","volume":"1602","author":"Iandola Forrest N.","year":"2016","unstructured":"Forrest N. Iandola, Matthew W. Moskewicz, Khalid Ashraf, Song Han, William J. Dally, and Kurt Keutzer. 2016. SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <1MB model size. CoRR abs\/1602.07360 (2016). http:\/\/arxiv.org\/abs\/1602.07360.","journal-title":"CoRR"},{"key":"e_1_3_1_15_2","doi-asserted-by":"publisher","DOI":"10.1145\/2647868.2654889"},{"key":"e_1_3_1_16_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.jsc.2002.06.001"},{"key":"e_1_3_1_17_2","volume-title":"Learning Multiple Layers of Features from Tiny Images","author":"Krizhevsky Alex","year":"2009","unstructured":"Alex Krizhevsky. 2009. Learning Multiple Layers of Features from Tiny Images. Technical Report. https:\/\/www.cs.toronto.edu\/kriz\/learning-features-2009-TR.pdf."},{"key":"e_1_3_1_18_2","first-page":"1097","volume-title":"Proc. Int. Conf. Neural Inf. Proc. Sys. - Volume 1 (NIPS\u201912)","author":"Krizhevsky Alex","year":"2012","unstructured":"Alex Krizhevsky, Ilya Sutskever, and Geoffrey E. Hinton. 2012. ImageNet classification with deep convolutional neural networks. In Proc. Int. Conf. Neural Inf. Proc. Sys. - Volume 1 (NIPS\u201912). 1097\u20131105."},{"key":"e_1_3_1_19_2","doi-asserted-by":"publisher","DOI":"10.1145\/3065386"},{"key":"e_1_3_1_20_2","first-page":"4013","volume-title":"Proc. IEEE Conf. Comput. Vision Pattern Recog.","author":"Lavin A.","year":"2016","unstructured":"A. Lavin and S. Gray. 2016. 
Fast algorithms for convolutional neural networks. In Proc. IEEE Conf. Comput. Vision Pattern Recog.IEEE, 4013\u20134021."},{"key":"e_1_3_1_21_2","doi-asserted-by":"publisher","DOI":"10.23919\/DATE.2018.8342166"},{"key":"e_1_3_1_22_2","doi-asserted-by":"publisher","DOI":"10.1016\/S0377-0427(00)00393-9"},{"key":"e_1_3_1_23_2","doi-asserted-by":"publisher","DOI":"10.1109\/EMC249363.2019.00008"},{"key":"e_1_3_1_24_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-662-00551-4"},{"key":"e_1_3_1_25_2","volume-title":"VLSI Digital Signal Processing Systems, Design and Implementation","author":"Parhi Keshab K.","year":"1999","unstructured":"Keshab K. Parhi. 1999. VLSI Digital Signal Processing Systems, Design and Implementation. Wiley-Interscience."},{"key":"e_1_3_1_26_2","doi-asserted-by":"publisher","DOI":"10.1109\/ASAP.2017.7995253"},{"key":"e_1_3_1_27_2","doi-asserted-by":"publisher","DOI":"10.1109\/MC.2017.9"},{"key":"e_1_3_1_28_2","doi-asserted-by":"publisher","DOI":"10.1109\/JIOT.2016.2579198"},{"key":"e_1_3_1_29_2","unstructured":"Karen Simonyan and Andrew Zisserman. 2014. Very Deep Convolutional Networks for Large-scale Image Recognition. arXiv:1409.1556."},{"key":"e_1_3_1_30_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2015.7298594"},{"key":"e_1_3_1_31_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2015.7298594"},{"key":"e_1_3_1_32_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-1-4757-2767-8_4"},{"key":"e_1_3_1_33_2","first-page":"714","article-title":"The complexity of a scheme of functional elements realizing multiplication of integers","author":"Toom A. L.","year":"1963","unstructured":"A. L. Toom. 1963. The complexity of a scheme of functional elements realizing multiplication of integers. Soviet Mathematics - Doklady 3 (1963), 714\u2013716.","journal-title":"Soviet Mathematics - Doklady 3"},{"key":"e_1_3_1_34_2","volume-title":"Proc. Int. Conf. 
Learning Representation","author":"Vincent Kevin","year":"2017","unstructured":"Kevin Vincent, Kevin Stephano, Michael Frumkin, Boris Ginsburg, and Julien Demouth. 2017. On improving the numerical stability of Winograd convolution. In Proc. Int. Conf. Learning Representation."},{"key":"e_1_3_1_35_2","doi-asserted-by":"publisher","DOI":"10.1137\/1.9781611970364"},{"key":"e_1_3_1_36_2","first-page":"94","volume-title":"Proc. IEEE Int. Conf. Acoust. Speech Signal Process.","author":"Winograd S.","year":"1980","unstructured":"S. Winograd. 1980. Signal processing and complexity of computation. In Proc. IEEE Int. Conf. Acoust. Speech Signal Process.IEEE, 94\u2013101."},{"key":"e_1_3_1_37_2","doi-asserted-by":"publisher","DOI":"10.1145\/3061639.3062244"},{"key":"e_1_3_1_38_2","doi-asserted-by":"publisher","DOI":"10.1109\/DAC.2018.8465825"},{"key":"e_1_3_1_39_2","doi-asserted-by":"publisher","DOI":"10.3390\/a11100159"}],"container-title":["ACM Transactions on Embedded Computing Systems"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3524069","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3524069","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T19:31:05Z","timestamp":1750188665000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3524069"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,11,30]]},"references-count":38,"journal-issue":{"issue":"6","published-print":{"date-parts":[[2022,11,30]]}},"alternative-id":["10.1145\/3524069"],"URL":"https:\/\/doi.org\/10.1145\/3524069","relation":{},"ISSN":["1539-9087","1558-3465"],"issn-type":[{"value":"1539-9087","type":"print"},{"value":"1558-3465","type":"electronic"}],"subject":[],"published":{"date-parts":[[2022,11,30]]},"assertion":[{"value":"
2021-07-16","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2022-03-03","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2022-12-12","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}