{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T04:08:20Z","timestamp":1750219700765,"version":"3.41.0"},"reference-count":19,"publisher":"Association for Computing Machinery (ACM)","issue":"10","license":[{"start":{"date-parts":[[2024,9,26]],"date-time":"2024-09-26T00:00:00Z","timestamp":1727308800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["Commun. ACM"],"published-print":{"date-parts":[[2024,10]]},"abstract":"<jats:p>Improving the performance of deep neural networks (DNNs) is important to both the compiler and neural architecture search (NAS) communities. Compilers apply program transformations in order to exploit hardware parallelism and memory hierarchy. However, legality concerns mean they fail to exploit the natural robustness of neural networks. In contrast, NAS techniques mutate networks by operations such as the grouping or bottlenecking of convolutions, exploiting the resilience of DNNs. In this work, we express such neural architecture operations as program transformations whose legality depends on a notion of representational capacity. This allows them to be combined with existing transformations into a unified optimization framework. This unification allows us to express existing NAS operations as combinations of simpler transformations. Crucially, it allows us to generate and explore new tensor convolutions. We prototyped the combined framework in TVM and were able to significantly reduce inference time and NAS search time.<\/jats:p>","DOI":"10.1145\/3624775","type":"journal-article","created":{"date-parts":[[2024,9,25]],"date-time":"2024-09-25T18:22:09Z","timestamp":1727288529000},"page":"92-100","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":0,"title":["Neural Architecture Search as Program Transformation Exploration"],"prefix":"10.1145","volume":"67","author":[{"given":"Jack","family":"Turner","sequence":"first","affiliation":[{"name":"University of Edinburgh, Edinburgh, Scotland Uk"}]},{"given":"Elliot J.","family":"Crowley","sequence":"additional","affiliation":[{"name":"University of Edinburgh, Edinburgh, Scotland Uk"}]},{"given":"Michael F.P.","family":"O'Boyle","sequence":"additional","affiliation":[{"name":"University of Edinburgh, Edinburgh, Scotland Uk"}]}],"member":"320","published-online":{"date-parts":[[2024,9,26]]},"reference":[{"key":"e_1_3_1_2_2","doi-asserted-by":"crossref","unstructured":"Chen T. et al. Benchnn: On the broad potential application scope of hardware neural network accelerators. In Intern. Symp. on Workload Characterization 2012.","DOI":"10.1109\/IISWC.2012.6402898"},{"key":"e_1_3_1_3_2","doi-asserted-by":"crossref","unstructured":"Chen T. et al. Diannao: A small-footprint high-throughput accelerator for ubiquitous machine-learning. In Proceedings of the Intern. Conf. on Architectural Support for Programming Languages and Operating Systems 2014.","DOI":"10.1145\/2541940.2541967"},{"key":"e_1_3_1_4_2","unstructured":"Chen T. et al. {TVM}: An automated end-to-end optimizing compiler for deep learning. In USENIX Symp. on Operating Systems Design and Implementation 2018."},{"key":"e_1_3_1_5_2","unstructured":"Chowdhery A. et al. Visual wake words dataset. arXiv preprint arXiv:1906.05721 2019."},{"key":"e_1_3_1_6_2","unstructured":"Dong X. and Yang Y. Nas-bench-201: Extending the scope of reproducible neural architecture search. In Intern. Conf. on Learning Representations 2020."},{"key":"e_1_3_1_7_2","unstructured":"Grosser T. et al. Polly-polyhedral optimization in llvm. In Intern. Workshop on Polyhedral Compilation Techniques 2011."},{"key":"e_1_3_1_8_2","doi-asserted-by":"crossref","unstructured":"He K. Zhang X. Ren S. and Sun J. Deep residual learning for image recognition. In Proceedings of the IEEE Conf. on Computer Vision and Pattern Recognition 2016.","DOI":"10.1109\/CVPR.2016.90"},{"key":"e_1_3_1_9_2","unstructured":"Hinton G. Vinyals O. and Dean J. Distilling the knowledge in a neural network. arXiv preprint arXiv:1503.02531 2015."},{"key":"e_1_3_1_10_2","doi-asserted-by":"crossref","unstructured":"Huang G. Liu Z. Van Der Maaten L. and Weinberger K.Q. Densely connected convolutional networks. In Proceedings of the IEEE conf. on computer vision and pattern recognition 2017 4700\u20134708.","DOI":"10.1109\/CVPR.2017.243"},{"key":"e_1_3_1_11_2","unstructured":"Huang Y. et al. Gpipe: Efficient training of giant neural networks using pipeline parallelism. In Advances in Neural Information Processing Systems 2019."},{"key":"e_1_3_1_12_2","unstructured":"Peng J. et al. Accelerating deep neural networks with spatial bottleneck modules. arXiv preprint arXiv:1809.02601 2018."},{"key":"e_1_3_1_13_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.neunet.2014.09.003"},{"key":"e_1_3_1_14_2","doi-asserted-by":"crossref","unstructured":"Tan M. et al. MnasNet: Platform-aware neural architecture search for mobile. In Proceedings of the IEEE Conf. on Computer Vision and Pattern Recognition 2019.","DOI":"10.1109\/CVPR.2019.00293"},{"key":"e_1_3_1_15_2","unstructured":"Turner J. et al. Blockswap: Fisher-guided block substitution for network compression on a budget. In Intern. Conf. on Learning Representations 2020."},{"key":"e_1_3_1_16_2","doi-asserted-by":"crossref","unstructured":"Vasilache N. Bastoul C. and Cohen A. Polyhedral code generation in the real world. In Intern. Conf. on Compiler Construction 2006.","DOI":"10.1007\/11688839_16"},{"key":"e_1_3_1_17_2","doi-asserted-by":"crossref","unstructured":"Wu B. et al. FBNet: Hardware-aware efficient convnet design via differentiable neural architecture search. In Proceedings of the IEEE Conf. on Computer Vision and Pattern Recognition 2019.","DOI":"10.1109\/CVPR.2019.01099"},{"key":"e_1_3_1_18_2","doi-asserted-by":"crossref","unstructured":"Xie S. et al. Aggregated residual transformations for deep neural networks. In Proceedings of the IEEE Conf. on Computer Vision and Pattern Recognition 2017.","DOI":"10.1109\/CVPR.2017.634"},{"key":"e_1_3_1_19_2","unstructured":"Ying C. et al. Nas-bench-101: Towards reproducible neural architecture search. arXiv preprint arXiv:1902.09635 2019."},{"key":"e_1_3_1_20_2","unstructured":"Zheng L. et al. Ansor: Generating high-performance tensor programs for deep learning. arXiv preprint arXiv:2006.06762 2020."}],"container-title":["Communications of the ACM"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3624775","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3624775","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T16:35:44Z","timestamp":1750178144000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3624775"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,9,26]]},"references-count":19,"journal-issue":{"issue":"10","published-print":{"date-parts":[[2024,10]]}},"alternative-id":["10.1145\/3624775"],"URL":"https:\/\/doi.org\/10.1145\/3624775","relation":{},"ISSN":["0001-0782","1557-7317"],"issn-type":[{"type":"print","value":"0001-0782"},{"type":"electronic","value":"1557-7317"}],"subject":[],"published":{"date-parts":[[2024,9,26]]},"assertion":[{"value":"2024-09-26","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}