{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,10]],"date-time":"2026-04-10T00:09:09Z","timestamp":1775779749252,"version":"3.50.1"},"reference-count":92,"publisher":"Association for Computing Machinery (ACM)","issue":"3","license":[{"start":{"date-parts":[[2023,4,20]],"date-time":"2023-04-20T00:00:00Z","timestamp":1681948800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"name":"NSF","award":["CCF-1811109"],"award-info":[{"award-number":["CCF-1811109"]}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Embed. Comput. Syst."],"published-print":{"date-parts":[[2023,5,31]]},"abstract":"<jats:p>Recently, automated co-design of machine learning (ML) models and accelerator architectures has attracted significant attention from both the industry and academia. However, most co-design frameworks either explore a limited search space or employ suboptimal exploration techniques for simultaneous design decision investigations of the ML model and the accelerator. Furthermore, training the ML model and simulating the accelerator performance is computationally expensive. To address these limitations, this work proposes a novel neural architecture and hardware accelerator co-design framework, called CODEBench. It comprises two new benchmarking sub-frameworks, CNNBench and AccelBench, which explore expanded design spaces of convolutional neural networks (CNNs) and CNN accelerators. CNNBench leverages an advanced search technique, Bayesian Optimization using Second-order Gradients and Heteroscedastic Surrogate Model for Neural Architecture Search, to efficiently train a neural heteroscedastic surrogate model to converge to an optimal CNN architecture by employing second-order gradients. 
AccelBench performs cycle-accurate simulations for diverse accelerator architectures in a vast design space. With the proposed co-design method, called Bayesian Optimization using Second-order Gradients and Heteroscedastic Surrogate Model for Co-Design of CNNs and Accelerators, our best CNN\u2013accelerator pair achieves 1.4% higher accuracy on the CIFAR-10 dataset compared to the state-of-the-art pair while enabling 59.1% lower latency and 60.8% lower energy consumption. On the ImageNet dataset, it achieves 3.7% higher Top1 accuracy at 43.8% lower latency and 11.2% lower energy consumption. CODEBench outperforms the state-of-the-art framework, i.e., Auto-NBA, by achieving 1.5% higher accuracy and 34.7\u00d7 higher throughput while enabling 11.0\u00d7 lower energy-delay product and 4.0\u00d7 lower chip area on CIFAR-10.<\/jats:p>","DOI":"10.1145\/3575798","type":"journal-article","created":{"date-parts":[[2022,12,8]],"date-time":"2022-12-08T14:26:11Z","timestamp":1670509571000},"page":"1-30","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":20,"title":["CODEBench: A Neural Architecture and Hardware Accelerator Co-Design Framework"],"prefix":"10.1145","volume":"22","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-9230-5877","authenticated-orcid":false,"given":"Shikhar","family":"Tuli","sequence":"first","affiliation":[{"name":"Princeton University, Princeton, USA"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-9557-6050","authenticated-orcid":false,"given":"Chia-Hao","family":"Li","sequence":"additional","affiliation":[{"name":"Princeton University, Princeton, USA"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-5809-7031","authenticated-orcid":false,"given":"Ritvik","family":"Sharma","sequence":"additional","affiliation":[{"name":"Stanford University, Stanford, USA"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-1539-0369","authenticated-orcid":false,"given":"Niraj 
K.","family":"Jha","sequence":"additional","affiliation":[{"name":"Princeton University, USA"}]}],"member":"320","published-online":{"date-parts":[[2023,4,20]]},"reference":[{"key":"e_1_3_2_2_2","volume-title":"CHaiDNNv2\u2014HLS based DNN Accelerator Library for Xilinx Ultrascale + MPSoCs","year":"2022","unstructured":"2022. CHaiDNNv2\u2014HLS based DNN Accelerator Library for Xilinx Ultrascale + MPSoCs. Retrieved from https:\/\/github.com\/Xilinx\/CHaiDNN."},{"key":"e_1_3_2_3_2","volume-title":"NVIDIA Deep Learning Accelerator","year":"2022","unstructured":"2022. NVIDIA Deep Learning Accelerator. Retrieved from http:\/\/nvdla.org."},{"key":"e_1_3_2_4_2","volume-title":"Synopsys Design Compiler","year":"2022","unstructured":"2022. Synopsys Design Compiler. Retrieved from https:\/\/www.synopsys.com\/implementation-and-signoff\/rtl-synthesis-test\/dc-ultra.html."},{"key":"e_1_3_2_5_2","volume-title":"Proceedings of the ACM\/EDAC\/IEEE Design Automation Conference","author":"Abdelfattah Mohamed S.","year":"2020","unstructured":"Mohamed S. Abdelfattah, \u0141ukasz Dudziak, Thomas Chau, Royson Lee, Hyeji Kim, and Nicholas D. Lane. 2020. Best of both worlds: AutoML codesign of a CNN and its hardware accelerator. In Proceedings of the ACM\/EDAC\/IEEE Design Automation Conference. Article 192, 6 pages."},{"key":"e_1_3_2_6_2","first-page":"271","volume-title":"Proceedings of the International Conference on Pattern Recognition Applications and Methods","volume":"1","author":"Abu-Aisheh Zeina","year":"2015","unstructured":"Zeina Abu-Aisheh, Romain Raveaux, Jean-Yves Ramel, and Patrick Martineau. 2015. An exact graph edit distance algorithm for solving pattern recognition problems. In Proceedings of the International Conference on Pattern Recognition Applications and Methods, Vol. 1. 
271\u2013278."},{"key":"e_1_3_2_7_2","first-page":"1","volume-title":"Proceedings of the ACM\/IEEE International Symposium of Computer Architecture","volume":"44","author":"Albericio Jorge","year":"2016","unstructured":"Jorge Albericio, Patrick Judd, Tayler Hetherington, Tor Aamodt, Natalie Enright Jerger, and Andreas Moshovos. 2016. Cnvlutin: Ineffectual-neuron-free deep neural network computing. In Proceedings of the ACM\/IEEE International Symposium of Computer Architecture, Vol. 44. 1\u201313."},{"key":"e_1_3_2_8_2","first-page":"4322","volume-title":"Proceedings of the International Joint Conference on Artificial Intelligence","author":"Benmeziane Hadjer","year":"2021","unstructured":"Hadjer Benmeziane, Kaoutar El Maghraoui, Hamza Ouarnoughi, Smail Niar, Martin Wistuba, and Naigang Wang. 2021. Hardware-aware neural architecture search: Survey and taxonomy. In Proceedings of the International Joint Conference on Artificial Intelligence. 4322\u20134329."},{"key":"e_1_3_2_9_2","volume-title":"Proceedings of the AAAI Conference on Artificial Intelligence and Innovative Applications of Artificial Intelligence Conference and AAAI Symposium on Educational Advances in Artificial Intelligence","author":"Cai Han","year":"2018","unstructured":"Han Cai, Tianyao Chen, Weinan Zhang, Yong Yu, and Jun Wang. 2018. Efficient architecture search by network transformation. In Proceedings of the AAAI Conference on Artificial Intelligence and Innovative Applications of Artificial Intelligence Conference and AAAI Symposium on Educational Advances in Artificial Intelligence. Article 340, 2787\u20132794."},{"key":"e_1_3_2_10_2","volume-title":"Proceedings of the International Conference on Learning Representations","author":"Cai Han","year":"2020","unstructured":"Han Cai, Chuang Gan, Tianzhe Wang, Zhekai Zhang, and Song Han. 2020. Once-for-all: Train one network and specialize it for efficient deployment. 
In Proceedings of the International Conference on Learning Representations."},{"key":"e_1_3_2_11_2","volume-title":"Proceedings of the International Conference on Learning Representations","author":"Cai Han","year":"2019","unstructured":"Han Cai, Ligeng Zhu, and Song Han. 2019. ProxylessNAS: Direct neural architecture search on target task and hardware. In Proceedings of the International Conference on Learning Representations."},{"key":"e_1_3_2_12_2","doi-asserted-by":"crossref","first-page":"269","DOI":"10.1145\/2541940.2541967","volume-title":"Proceedings of the International Conference on Architectural Support for Programming Languages and Operating Systems","volume":"42","author":"Chen Tianshi","year":"2014","unstructured":"Tianshi Chen, Zidong Du, Ninghui Sun, Jia Wang, Chengyong Wu, Yunji Chen, and Olivier Temam. 2014. DianNao: A small-footprint high-throughput accelerator for ubiquitous machine-learning. In Proceedings of the International Conference on Architectural Support for Programming Languages and Operating Systems, Vol. 42. 269\u2013284."},{"key":"e_1_3_2_13_2","first-page":"609","volume-title":"Proceedings of the IEEE\/ACM International Symposium on Microarchitecture","author":"Chen Yunji","year":"2014","unstructured":"Yunji Chen, Tao Luo, Shaoli Liu, Shijin Zhang, Liqiang He, Jia Wang, Ling Li, Tianshi Chen, Zhiwei Xu, Ninghui Sun, and Olivier Temam. 2014. DaDianNao: A machine-learning supercomputer. In Proceedings of the IEEE\/ACM International Symposium on Microarchitecture. 609\u2013622."},{"key":"e_1_3_2_14_2","doi-asserted-by":"publisher","DOI":"10.1109\/JETCAS.2019.2910232"},{"key":"e_1_3_2_15_2","first-page":"7090","volume-title":"Proceedings of the AAAI Conference on Artificial Intelligence","volume":"35","author":"Cheng Hsin-Pai","year":"2021","unstructured":"Hsin-Pai Cheng, Tunhou Zhang, Yixing Zhang, Shiyu Li, Feng Liang, Feng Yan, Meng Li, Vikas Chandra, Hai Li, and Yiran Chen. 2021. 
NASGEM: Neural architecture search via graph embedding method. In Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 35. 7090\u20137098."},{"key":"e_1_3_2_16_2","article-title":"PACT: Parameterized clipping activation for quantized neural networks","volume":"1805","author":"Choi Jungwook","unstructured":"Jungwook Choi, Zhuo Wang, Swagath Venkataramani, Pierce I-Jen Chuang, Vijayalakshmi Srinivasan, and Kailash Gopalakrishnan. 2018. PACT: Parameterized clipping activation for quantized neural networks. CoRR abs\/1805.06085.","journal-title":"CoRR"},{"key":"e_1_3_2_17_2","first-page":"337","volume-title":"Proceedings of the ACM\/IEEE Design Automation Conference","author":"Choi Kanghyun","year":"2021","unstructured":"Kanghyun Choi, Deokki Hong, Hojae Yoon, Joonsang Yu, Youngsok Kim, and Jinho Lee. 2021. DANCE: Differentiable accelerator\/network co-exploration. In Proceedings of the ACM\/IEEE Design Automation Conference. 337\u2013342."},{"key":"e_1_3_2_18_2","first-page":"1800","volume-title":"Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition","author":"Chollet Fran\u00e7ois","year":"2017","unstructured":"Fran\u00e7ois Chollet. 2017. Xception: Deep learning with depthwise separable convolutions. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 1800\u20131807."},{"key":"e_1_3_2_19_2","doi-asserted-by":"publisher","DOI":"10.1109\/TC.2019.2914438"},{"key":"e_1_3_2_20_2","first-page":"11390","volume-title":"Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition","volume":"10","author":"Dai Xiaoliang","year":"2019","unstructured":"Xiaoliang Dai, Peizhao Zhang, Bichen Wu, Hongxu Yin, Fei Sun, Yanghan Wang, Marat Dukhan, Yunqing Hu, Yiming Wu, Yangqing Jia, Peter Vajda, Matt Uyttendaele, and Niraj K. Jha. 2019. ChamNet: Towards efficient network design through platform-aware model adaptation. 
In Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Vol. 10. 11390\u201311399."},{"key":"e_1_3_2_21_2","first-page":"12275","volume-title":"Proceedings of the AAAI Conference on Artificial Intelligence","volume":"35","author":"Dang Duc-Cuong","year":"2021","unstructured":"Duc-Cuong Dang, Anton Eremeev, and Per Kristian Lehre. 2021. Escaping local optima with non-elitist evolutionary algorithms. In Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 35. 12275\u201312283."},{"key":"e_1_3_2_22_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2009.5206848"},{"key":"e_1_3_2_23_2","doi-asserted-by":"publisher","DOI":"10.1109\/TCAD.2012.2185930"},{"key":"e_1_3_2_24_2","first-page":"92","volume-title":"Proceedings of the ACM\/IEEE International Symposium on Computer Architecture","author":"Du Zidong","year":"2015","unstructured":"Zidong Du, Robert Fasthuber, Tianshi Chen, Paolo Ienne, Ling Li, Tao Luo, Xiaobing Feng, Yunji Chen, and Olivier Temam. 2015. ShiDianNao: Shifting vision processing closer to the sensor. In Proceedings of the ACM\/IEEE International Symposium on Computer Architecture. 92\u2013104."},{"key":"e_1_3_2_25_2","volume-title":"Proceedings of the International Conference on Learning Representations","author":"Elsken Thomas","year":"2019","unstructured":"Thomas Elsken, Jan Hendrik Metzen, and Frank Hutter. 2019. Efficient multi-objective neural architecture search via Lamarckian evolution. In Proceedings of the International Conference on Learning Representations."},{"issue":"55","key":"e_1_3_2_26_2","first-page":"1","article-title":"Neural architecture search: A survey","volume":"20","author":"Elsken Thomas","year":"2019","unstructured":"Thomas Elsken, Jan Hendrik Metzen, and Frank Hutter. 2019. Neural architecture search: A survey. J. Mach. Learn. Res. 20, 55 (2019), 1\u201321.","journal-title":"J. Mach. Learn. 
Res."},{"key":"e_1_3_2_27_2","first-page":"3505","volume-title":"Proceedings of the International Conference on Machine Learning","volume":"139","author":"Fu Yonggan","year":"2021","unstructured":"Yonggan Fu, Yongan Zhang, Yang Zhang, David Cox, and Yingyan Lin. 2021. Auto-NBA: Efficient and effective search over the joint space of networks, bitwidths, and accelerators. In Proceedings of the International Conference on Machine Learning, Vol. 139. 3505\u20133517."},{"key":"e_1_3_2_28_2","first-page":"1050","volume-title":"Proceedings of the International Conference on Machine Learning","volume":"48","author":"Gal Yarin","year":"2016","unstructured":"Yarin Gal and Zoubin Ghahramani. 2016. Dropout as a Bayesian approximation: Representing model uncertainty in deep learning. In Proceedings of the International Conference on Machine Learning, Vol. 48. 1050\u20131059."},{"issue":"10","key":"e_1_3_2_29_2","doi-asserted-by":"crossref","first-page":"1868","DOI":"10.1109\/TVLSI.2018.2832607","article-title":"Hybrid monolithic 3-D IC floorplanner","volume":"26","author":"Guler Abdullah","year":"2018","unstructured":"Abdullah Guler and Niraj K. Jha. 2018. Hybrid monolithic 3-D IC floorplanner. IEEE Trans. VLSI Syst. 26, 10 (2018), 1868\u20131880.","journal-title":"IEEE Trans. VLSI Syst."},{"key":"e_1_3_2_30_2","first-page":"1737","volume-title":"Proceedings of the International Conference on Machine Learning","author":"Gupta Suyog","year":"2015","unstructured":"Suyog Gupta, Ankur Agrawal, Kailash Gopalakrishnan, and Pritish Narayanan. 2015. Deep learning with limited numerical precision. In Proceedings of the International Conference on Machine Learning. 
1737\u20131746."},{"key":"e_1_3_2_31_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2016.90"},{"key":"e_1_3_2_32_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2017.243"},{"key":"e_1_3_2_33_2","article-title":"SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <1MB model size","volume":"1602","author":"Iandola Forrest N.","unstructured":"Forrest N. Iandola, Matthew W. Moskewicz, Khalid Ashraf, Song Han, William J. Dally, and Kurt Keutzer. 2016. SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <1MB model size. CoRR abs\/1602.07360.","journal-title":"CoRR"},{"key":"e_1_3_2_34_2","first-page":"448","volume-title":"Proceedings of the International Conference on Machine Learning","volume":"37","author":"Ioffe Sergey","year":"2015","unstructured":"Sergey Ioffe and Christian Szegedy. 2015. Batch normalization: Accelerating deep network training by reducing internal covariate shift. In Proceedings of the International Conference on Machine Learning, Vol. 37. 448\u2013456."},{"key":"e_1_3_2_35_2","doi-asserted-by":"publisher","DOI":"10.1109\/TCAD.2020.2986127"},{"key":"e_1_3_2_36_2","doi-asserted-by":"publisher","DOI":"10.1007\/s10462-020-09825-6"},{"key":"e_1_3_2_37_2","article-title":"Learning Multiple Layers of Features from Tiny Images","author":"Krizhevsky Alex","unstructured":"Alex Krizhevsky. 2009. Learning Multiple Layers of Features from Tiny Images. Master\u2019s thesis, Department of Computer Science, University of Toronto.","journal-title":"Master\u2019s thesis, Department of Computer Science, University of Toronto"},{"key":"e_1_3_2_38_2","first-page":"1097","volume-title":"Proceedings of the International Conference on Neural Information Processing Systems","volume":"1","author":"Krizhevsky Alex","year":"2012","unstructured":"Alex Krizhevsky, Ilya Sutskever, and Geoffrey E. Hinton. 2012. ImageNet classification with deep convolutional neural networks. 
In Proceedings of the International Conference on Neural Information Processing Systems, Vol. 1. 1097\u20131105."},{"key":"e_1_3_2_39_2","doi-asserted-by":"publisher","DOI":"10.1109\/5.726791"},{"key":"e_1_3_2_40_2","volume-title":"Proceedings of the International Conference on Learning Representations","author":"Li Chaojian","year":"2021","unstructured":"Chaojian Li, Zhongzhi Yu, Yonggan Fu, Yongan Zhang, Yang Zhao, Haoran You, Qixuan Yu, Yue Wang, Cong Hao, and Yingyan Lin. 2021. HW-NAS-Bench: Hardware-aware neural architecture search benchmark. In Proceedings of the International Conference on Learning Representations."},{"key":"e_1_3_2_41_2","first-page":"230","volume-title":"Proceedings of the Machine Learning and Systems","volume":"2","author":"Li Liam","year":"2020","unstructured":"Liam Li, Kevin Jamieson, Afshin Rostamizadeh, Ekaterina Gonina, Jonathan Ben-tzur, Moritz Hardt, Benjamin Recht, and Ameet Talwalkar. 2020. A system for massively parallel hyperparameter tuning. In Proceedings of the Machine Learning and Systems, Vol. 2. 230\u2013246."},{"key":"e_1_3_2_42_2","volume-title":"Proceedings of the ACM\/EDAC\/IEEE Design Automation Conference","author":"Li Yuhong","year":"2020","unstructured":"Yuhong Li, Cong Hao, Xiaofan Zhang, Xinheng Liu, Yao Chen, Jinjun Xiong, Wen-mei Hwu, and Deming Chen. 2020. EDD: Efficient differentiable DNN architecture and implementation co-search for embedded AI solutions. In Proceedings of the ACM\/EDAC\/IEEE Design Automation Conference. Article 130, 6 pages."},{"key":"e_1_3_2_43_2","first-page":"11711","volume-title":"Proceedings of the International Conference on Neural Information Processing Systems","volume":"33","author":"Lin Ji","year":"2020","unstructured":"Ji Lin, Wei-Ming Chen, Yujun Lin, John Cohn, Chuang Gan, and Song Han. 2020. MCUNet: Tiny deep learning on IoT devices. In Proceedings of the International Conference on Neural Information Processing Systems, Vol. 33. 
11711\u201311722."},{"key":"e_1_3_2_44_2","article-title":"Network in network","volume":"1312","author":"Lin Min","unstructured":"Min Lin, Qiang Chen, and Shuicheng Yan. 2014. Network in network. CoRR abs\/1312.4400.","journal-title":"CoRR"},{"key":"e_1_3_2_45_2","first-page":"1051","volume-title":"Proceedings of the ACM\/IEEE Design Automation Conference","author":"Lin Yujun","year":"2021","unstructured":"Yujun Lin, Mengtian Yang, and Song Han. 2021. NAAS: Neural accelerator architecture search. In Proceedings of the ACM\/IEEE Design Automation Conference. 1051\u20131056."},{"key":"e_1_3_2_46_2","volume-title":"Proceedings of the International Conference on Learning Representations","author":"Liu Hanxiao","year":"2018","unstructured":"Hanxiao Liu, Karen Simonyan, and Yiming Yang. 2018. DARTS: Differentiable architecture search. In Proceedings of the International Conference on Learning Representations."},{"key":"e_1_3_2_47_2","volume-title":"Proceedings of the International Conference on Learning Representations","author":"Loshchilov Ilya","year":"2019","unstructured":"Ilya Loshchilov and Frank Hutter. 2019. Decoupled weight decay regularization. In Proceedings of the International Conference on Learning Representations."},{"issue":"9","key":"e_1_3_2_48_2","doi-asserted-by":"crossref","first-page":"2970","DOI":"10.1109\/TCAD.2021.3118963","article-title":"Fast design space exploration of nonlinear systems: Part I","volume":"41","author":"Narain Sanjai","year":"2022","unstructured":"Sanjai Narain, Emily Mak, Dana Chee, Brendan Englot, Kishore Pochiraju, Niraj K. Jha, and Karthik Narayan. 2022. Fast design space exploration of nonlinear systems: Part I. IEEE Trans. Comput.-Aid. Des. Integr. Circ. Syst. 41, 9 (2022), 2970\u20132983.","journal-title":"IEEE Trans. Comput.-Aid. Des. Integr. Circ. 
Syst."},{"key":"e_1_3_2_49_2","first-page":"189","volume-title":"Proceedings of the European Conference on Computer Vision","author":"Ning Xuefei","year":"2020","unstructured":"Xuefei Ning, Yin Zheng, Tianchen Zhao, Yu Wang, and Huazhong Yang. 2020. A generic graph-based neural architecture encoding scheme for predictor-based NAS. In Proceedings of the European Conference on Computer Vision. 189\u2013204."},{"key":"e_1_3_2_50_2","doi-asserted-by":"publisher","DOI":"10.1109\/LCA.2015.2402435"},{"key":"e_1_3_2_51_2","article-title":"Searching for activation functions","volume":"1710","author":"Ramachandran Prajit","unstructured":"Prajit Ramachandran, Barret Zoph, and Quoc V. Le. 2017. Searching for activation functions. CoRR abs\/1710.05941.","journal-title":"CoRR"},{"key":"e_1_3_2_52_2","first-page":"1","volume-title":"Proceedings of the IEEE\/ACM International Symposium on Low Power Electronics and Design","author":"Reagen Brandon","year":"2017","unstructured":"Brandon Reagen, Jos\u00e9 Miguel Hern\u00e1ndez-Lobato, Robert Adolf, Michael Gelbart, Paul Whatmough, Gu-Yeon Wei, and David Brooks. 2017. A case for efficient accelerator design space exploration via Bayesian optimization. In Proceedings of the IEEE\/ACM International Symposium on Low Power Electronics and Design. 1\u20136."},{"key":"e_1_3_2_53_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2018.00474"},{"key":"e_1_3_2_54_2","unstructured":"David Schor. 2018. Hot Chips 30: Nvidia Xavier SoC. Retrieved from https:\/\/hc30.hotchips.org\/."},{"key":"e_1_3_2_55_2","first-page":"290","volume-title":"Proceedings of the IEEE Annual Symposium on VLSI","author":"Shafaei Alireza","year":"2014","unstructured":"Alireza Shafaei, Yanzhi Wang, Xue Lin, and Massoud Pedram. 2014. FinCACTI: Architectural analysis and modeling of caches with deeply-scaled FinFET devices. In Proceedings of the IEEE Annual Symposium on VLSI. 
290\u2013295."},{"key":"e_1_3_2_56_2","first-page":"97","volume-title":"Proceedings of the ACM\/IEEE International Symposium on Computer Architecture","author":"Shao Yakun Sophia","year":"2014","unstructured":"Yakun Sophia Shao, Brandon Reagen, Gu-Yeon Wei, and David Brooks. 2014. Aladdin: A pre-RTL, power-performance accelerator simulator enabling large design space exploration of customized architectures. In Proceedings of the ACM\/IEEE International Symposium on Computer Architecture. 97\u2013108."},{"key":"e_1_3_2_57_2","doi-asserted-by":"publisher","DOI":"10.1145\/3140659.3080221"},{"key":"e_1_3_2_58_2","volume-title":"Proceedings of the International Conference on Learning Representations","author":"Siems Julien","year":"2022","unstructured":"Julien Siems, Lucas Zimmer, Arber Zela, Jovita Lukasik, Margret Keuper, and Frank Hutter. 2022. Surrogate NAS benchmarks: Going beyond the limited search spaces of tabular NAS benchmarks. In Proceedings of the International Conference on Learning Representations."},{"key":"e_1_3_2_59_2","volume-title":"Proceedings of the International Conference on Learning Representations","author":"Simonyan Karen","year":"2015","unstructured":"Karen Simonyan and Andrew Zisserman. 2015. Very deep convolutional networks for large-scale image recognition. In Proceedings of the International Conference on Learning Representations."},{"key":"e_1_3_2_60_2","first-page":"2951","volume-title":"Proceedings of the International Conference on Neural Information Processing Systems","volume":"25","author":"Snoek Jasper","year":"2012","unstructured":"Jasper Snoek, Hugo Larochelle, and Ryan P. Adams. 2012. Practical Bayesian optimization of machine learning algorithms. In Proceedings of the International Conference on Neural Information Processing Systems, Vol. 25. 
2951\u20132959."},{"issue":"56","key":"e_1_3_2_61_2","first-page":"1929","article-title":"Dropout: A simple way to prevent neural networks from overfitting","volume":"15","author":"Srivastava Nitish","year":"2014","unstructured":"Nitish Srivastava, Geoffrey Hinton, Alex Krizhevsky, Ilya Sutskever, and Ruslan Salakhutdinov. 2014. Dropout: A simple way to prevent neural networks from overfitting. J. Mach. Learn. Res. 15, 56 (2014), 1929\u20131958.","journal-title":"J. Mach. Learn. Res."},{"key":"e_1_3_2_62_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.vlsi.2017.02.002"},{"key":"e_1_3_2_63_2","first-page":"867","volume-title":"Proceedings of the Design, Automation & Test in Europe Conference & Exhibition","author":"Sun Hanbo","year":"2022","unstructured":"Hanbo Sun, Chenyu Wang, Zhenhua Zhu, Xuefei Ning, Guohao Dai, Huazhong Yang, and Yu Wang. 2022. Gibbon: Efficient co-exploration of NN model and processing-in-memory architecture. In Proceedings of the Design, Automation & Test in Europe Conference & Exhibition. 867\u2013872."},{"key":"e_1_3_2_64_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-031-01766-7"},{"key":"e_1_3_2_65_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2015.7298594"},{"key":"e_1_3_2_66_2","first-page":"6105","volume-title":"Proceedings of the International Conference on Machine Learning","volume":"97","author":"Tan Mingxing","year":"2019","unstructured":"Mingxing Tan and Quoc Le. 2019. EfficientNet: Rethinking model scaling for convolutional neural networks. In Proceedings of the International Conference on Machine Learning, Vol. 97. 6105\u20136114."},{"key":"e_1_3_2_67_2","volume-title":"Proceedings of the International Conference on Neuromorphic Systems 2020","author":"Tuli Shikhar","year":"2020","unstructured":"Shikhar Tuli and Debanjan Bhowmik. 2020. Design of a conventional-transistor-based analog integrated circuit for on-chip learning in a spiking neural network. 
In Proceedings of the International Conference on Neuromorphic Systems 2020. Article 14, 8 pages."},{"key":"e_1_3_2_68_2","article-title":"FlexiBERT: Are current transformer architectures too homogeneous and rigid?","volume":"2205","author":"Tuli Shikhar","unstructured":"Shikhar Tuli, Bhishma Dedhia, Shreshth Tuli, and Niraj K. Jha. 2022. FlexiBERT: Are current transformer architectures too homogeneous and rigid? CoRR abs\/2205.11656.","journal-title":"CoRR"},{"key":"e_1_3_2_69_2","doi-asserted-by":"publisher","DOI":"10.1109\/TPDS.2021.3087349"},{"key":"e_1_3_2_70_2","first-page":"325","volume-title":"Proceedings of the IEEE\/ACM International Symposium on Code Generation and Optimization","author":"Vaidya Miheer","year":"2022","unstructured":"Miheer Vaidya, Aravind Sukumaran-Rajam, Atanas Rountev, and P. Sadayappan. 2022. Comprehensive accelerator-dataflow co-design optimization for convolutional neural networks. In Proceedings of the IEEE\/ACM International Symposium on Code Generation and Optimization. 325\u2013335."},{"key":"e_1_3_2_71_2","first-page":"5998","volume-title":"Proceedings of the International Conference on Neural Information Processing Systems","volume":"30","author":"Vaswani Ashish","year":"2017","unstructured":"Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, \u0141ukasz Kaiser, and Illia Polosukhin. 2017. Attention is all you need. In Proceedings of the International Conference on Neural Information Processing Systems, Vol. 30. 5998\u20136008."},{"key":"e_1_3_2_72_2","first-page":"118","volume-title":"Proceedings of the International Conference on Neural Information Processing Systems","author":"Wang Hao","year":"2016","unstructured":"Hao Wang, Xingjian Shi, and Dit-Yan Yeung. 2016. Natural-parameter networks: A class of probabilistic neural networks. In Proceedings of the International Conference on Neural Information Processing Systems. 
118\u2013126."},{"key":"e_1_3_2_73_2","first-page":"10293","volume-title":"Proceedings of the AAAI Conference on Artificial Intelligence","volume":"35","author":"White Colin","year":"2021","unstructured":"Colin White, Willie Neiswanger, and Yash Savani. 2021. BANANAS: Bayesian optimization with neural architectures for neural architecture search. In Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 35. 10293\u201310301."},{"key":"e_1_3_2_74_2","article-title":"Local search is state of the art for NAS benchmarks","volume":"2005","author":"White Colin","unstructured":"Colin White, Sam Nolen, and Yash Savani. 2020. Local search is state of the art for NAS benchmarks. CoRR abs\/2005.02960.","journal-title":"CoRR"},{"key":"e_1_3_2_75_2","doi-asserted-by":"publisher","DOI":"10.1007\/BF00992696"},{"key":"e_1_3_2_76_2","first-page":"1","volume-title":"Proceedings of the IEEE\/ACM International Conference on Computer-Aided Design","author":"Wu Yannan Nellie","year":"2019","unstructured":"Yannan Nellie Wu, Joel S. Emer, and Vivienne Sze. 2019. Accelergy: An architecture-level energy estimation methodology for accelerator designs. In Proceedings of the IEEE\/ACM International Conference on Computer-Aided Design. 1\u20138."},{"issue":"2","key":"e_1_3_2_77_2","first-page":"962","article-title":"Fully dynamic inference with deep neural networks","volume":"10","author":"Xia Wenhan","year":"2022","unstructured":"Wenhan Xia, Hongxu Yin, Xiaoliang Dai, and Niraj K. Jha. 2022. Fully dynamic inference with deep neural networks. IEEE Trans. Emerg. Top. Comput. 10, 2 (2022), 962\u2013972.","journal-title":"IEEE Trans. Emerg. Top. Comput."},{"key":"e_1_3_2_78_2","volume-title":"Proceedings of the International Conference on Learning Representations","author":"Xu Keyulu","year":"2019","unstructured":"Keyulu Xu, Weihua Hu, Jure Leskovec, and Stefanie Jegelka. 2019. How powerful are graph neural networks? 
In Proceedings of the International Conference on Learning Representations."},{"key":"e_1_3_2_79_2","volume-title":"Proceedings of the ACM\/EDAC\/IEEE Design Automation Conference","author":"Yang Lei","year":"2020","unstructured":"Lei Yang, Zheyu Yan, Meng Li, Hyoukjun Kwon, Weiwen Jiang, Liangzhen Lai, Yiyu Shi, Tushar Krishna, and Vikas Chandra. 2020. Co-exploration of neural architectures and heterogeneous ASIC accelerator designs targeting multiple tasks. In Proceedings of the ACM\/EDAC\/IEEE Design Automation Conference. Article 163, 6 pages."},{"key":"e_1_3_2_80_2","first-page":"10665","volume-title":"Proceedings of the AAAI Conference on Artificial Intelligence","volume":"35","author":"Yao Zhewei","year":"2021","unstructured":"Zhewei Yao, Amir Gholami, Sheng Shen, Mustafa Mustafa, Kurt Keutzer, and Michael Mahoney. 2021. ADAHESSIAN: An adaptive second order optimizer for machine learning. In Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 35. 10665\u201310673."},{"key":"e_1_3_2_81_2","first-page":"7105","volume-title":"Proceedings of the International Conference on Machine Learning","volume":"97","author":"Ying Chris","year":"2019","unstructured":"Chris Ying, Aaron Klein, Eric Christiansen, Esteban Real, Kevin Murphy, and Frank Hutter. 2019. NAS-Bench-101: Towards reproducible neural architecture search. In Proceedings of the International Conference on Machine Learning, Vol. 97. 7105\u20137114."},{"key":"e_1_3_2_82_2","article-title":"Image classification at supercomputer scale","volume":"1811","author":"Ying Chris","unstructured":"Chris Ying, Sameer Kumar, Dehao Chen, Tao Wang, and Youlong Cheng. 2018. Image classification at supercomputer scale. 
CoRR abs\/1811.06992.","journal-title":"CoRR"},{"key":"e_1_3_2_83_2","doi-asserted-by":"publisher","DOI":"10.1109\/TNANO.2017.2731871"},{"key":"e_1_3_2_84_2","doi-asserted-by":"publisher","DOI":"10.1109\/TETC.2020.3003328"},{"key":"e_1_3_2_85_2","doi-asserted-by":"publisher","DOI":"10.1109\/TC.2020.2983694"},{"key":"e_1_3_2_86_2","doi-asserted-by":"publisher","DOI":"10.1145\/2684746.2689060"},{"key":"e_1_3_2_87_2","first-page":"1588","volume-title":"Proceedings of the International Conference on Neural Information Processing Systems","volume":"32","author":"Zhang Muhan","year":"2019","unstructured":"Muhan Zhang, Shali Jiang, Zhicheng Cui, Roman Garnett, and Yixin Chen. 2019. D-VAE: A variational autoencoder for directed acyclic graphs. In Proceedings of the International Conference on Neural Information Processing Systems, Vol. 32. 1588\u20131600."},{"key":"e_1_3_2_88_2","first-page":"1","volume-title":"Proceedings of the IEEE\/ACM International Symposium on Microarchitecture","author":"Zhang Shijin","year":"2016","unstructured":"Shijin Zhang, Zidong Du, Lei Zhang, Huiying Lan, Shaoli Liu, Ling Li, Qi Guo, Tianshi Chen, and Yunji Chen. 2016. Cambricon-X: An accelerator for sparse neural networks. In Proceedings of the IEEE\/ACM International Symposium on Microarchitecture. 1\u201312."},{"key":"e_1_3_2_89_2","first-page":"1","volume-title":"Proceedings of the IEEE\/ACM International Conference on Computer-Aided Design","author":"Zhang Xiaofan","year":"2018","unstructured":"Xiaofan Zhang, Junsong Wang, Chao Zhu, Yonghua Lin, Jinjun Xiong, Wen-Mei Hwu, and Deming Chen. 2018. DNNBuilder: An automated tool for building high-performance DNN hardware accelerators for FPGAs. In Proceedings of the IEEE\/ACM International Conference on Computer-Aided Design. 
1\u20138."},{"key":"e_1_3_2_90_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2018.00716"},{"key":"e_1_3_2_91_2","first-page":"15","volume-title":"Proceedings of the IEEE\/ACM International Symposium on Microarchitecture","author":"Zhou Xuda","year":"2018","unstructured":"Xuda Zhou, Zidong Du, Qi Guo, Shaoli Liu, Chengsi Liu, Chao Wang, Xuehai Zhou, Ling Li, Tianshi Chen, and Yunji Chen. 2018. Cambricon-S: Addressing irregularity in sparse neural networks through a cooperative software\/hardware approach. In Proceedings of the IEEE\/ACM International Symposium on Microarchitecture. 15\u201328."},{"key":"e_1_3_2_92_2","unstructured":"Yanqi Zhou, Xuanyi Dong, Daiyi Peng, Ethan Zhu, Amir Yazdanbakhsh, Berkin Akin, Mingxing Tan, and James Laudon. 2021. NAHAS: Neural Architecture and Hardware Accelerator Search. Retrieved from https:\/\/openreview.net\/forum?id=fgpXAu8puGj."},{"key":"e_1_3_2_93_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2018.00907"}],"container-title":["ACM Transactions on Embedded Computing 
Systems"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3575798","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3575798","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T17:51:21Z","timestamp":1750182681000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3575798"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,4,20]]},"references-count":92,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2023,5,31]]}},"alternative-id":["10.1145\/3575798"],"URL":"https:\/\/doi.org\/10.1145\/3575798","relation":{},"ISSN":["1539-9087","1558-3465"],"issn-type":[{"value":"1539-9087","type":"print"},{"value":"1558-3465","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,4,20]]},"assertion":[{"value":"2022-08-25","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2022-11-29","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2023-04-20","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}