{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,14]],"date-time":"2026-03-14T22:07:27Z","timestamp":1773526047525,"version":"3.50.1"},"reference-count":651,"publisher":"Association for Computing Machinery (ACM)","issue":"1","license":[{"start":{"date-parts":[[2024,12,10]],"date-time":"2024-12-10T00:00:00Z","timestamp":1733788800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"name":"Ministry of Education, Singapore, under its Academic Research Fund Tier\u00a01","award":["RG94\/23"],"award-info":[{"award-number":["RG94\/23"]}]},{"DOI":"10.13039\/501100001475","name":"Nanyang Technological University, Singapore","doi-asserted-by":"crossref","award":["M4082282\/ 04INS000515C130"],"award-info":[{"award-number":["M4082282\/ 04INS000515C130"]}],"id":[{"id":"10.13039\/501100001475","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Embed. Comput. Syst."],"published-print":{"date-parts":[[2025,1,31]]},"abstract":"<jats:p>\n            Deep neural networks (DNNs) have recently achieved impressive success across a wide range of real-world vision and language processing tasks, spanning from image classification to many other downstream vision tasks, such as object detection, tracking, and segmentation. However, previous well-established DNNs, despite being able to maintain superior accuracy, have also been evolving to be deeper and wider and thus inevitably necessitate prohibitive computational resources for both training and inference. This trend further enlarges the computational gap between computation-intensive DNNs and resource-constrained embedded computing systems, making it challenging to deploy powerful DNNs in real-world embedded computing systems towards ubiquitous embedded intelligence. 
To alleviate this computational gap and enable ubiquitous embedded intelligence, we focus in this survey on discussing recent efficient deep learning infrastructures for embedded computing systems, spanning\n            <jats:bold>from training to inference<\/jats:bold>\n            ,\n            <jats:bold>from manual to automated<\/jats:bold>\n            ,\n            <jats:bold>from convolutional neural networks to transformers<\/jats:bold>\n            ,\n            <jats:bold>from transformers to vision transformers<\/jats:bold>\n            ,\n            <jats:bold>from vision models to large language models<\/jats:bold>\n            ,\n            <jats:bold>from software to hardware<\/jats:bold>\n            , and\n            <jats:bold>from algorithms to applications<\/jats:bold>\n            . Specifically, we discuss recent efficient deep learning infrastructures for embedded computing systems from the lens of (1) efficient manual network design for embedded computing systems, (2) efficient automated network design for embedded computing systems, (3) efficient network compression for embedded computing systems, (4) efficient on-device learning for embedded computing systems, (5) efficient large language models for embedded computing systems, (6) efficient deep learning software and hardware for embedded computing systems, and (7) efficient intelligent applications for embedded computing systems. We also envision promising future directions and trends, which have the potential to deliver more ubiquitous embedded intelligence. 
We believe this survey has its merits and can shed light on future research, which can largely help researchers to quickly and smoothly get started in this emerging field.\n          <\/jats:p>","DOI":"10.1145\/3701728","type":"journal-article","created":{"date-parts":[[2024,10,24]],"date-time":"2024-10-24T09:40:38Z","timestamp":1729762838000},"page":"1-100","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":12,"title":["Efficient Deep Learning Infrastructures for Embedded Computing Systems: A Comprehensive Survey and Future Envision"],"prefix":"10.1145","volume":"24","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-0758-2248","authenticated-orcid":false,"given":"Xiangzhong","family":"Luo","sequence":"first","affiliation":[{"name":"Nanyang Technological University, Singapore, Singapore"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-4365-2768","authenticated-orcid":false,"given":"Di","family":"Liu","sequence":"additional","affiliation":[{"name":"Department of Computer Science, Norwegian University of Science and Technology, Trondheim, Norway"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-1378-0056","authenticated-orcid":false,"given":"Hao","family":"Kong","sequence":"additional","affiliation":[{"name":"Nanyang Technological University, Singapore, Singapore"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-4744-304X","authenticated-orcid":false,"given":"Shuo","family":"Huai","sequence":"additional","affiliation":[{"name":"Nanyang Technological University, Singapore, Singapore"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-1614-9929","authenticated-orcid":false,"given":"Hui","family":"Chen","sequence":"additional","affiliation":[{"name":"Nanyang Technological University, Singapore, Singapore"}]},{"ORCID":"https:\/\/orcid.org\/0009-0002-7485-6787","authenticated-orcid":false,"given":"Guochu","family":"Xiong","sequence":"additional","affiliation":[{"name":"Nanyang Technological University, Singapore, 
Singapore"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-9348-4662","authenticated-orcid":false,"given":"Weichen","family":"Liu","sequence":"additional","affiliation":[{"name":"School of Computer Science and Engineering, Nanyang Technological University, Singapore, Singapore"}]}],"member":"320","published-online":{"date-parts":[[2024,12,10]]},"reference":[{"key":"e_1_3_2_2_2","article-title":"Very deep convolutional networks for large-scale image recognition","author":"Simonyan Karen","year":"2014","unstructured":"Karen Simonyan and Andrew Zisserman. 2014. Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014).","journal-title":"arXiv preprint arXiv:1409.1556"},{"key":"e_1_3_2_3_2","first-page":"770","volume-title":"Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition","author":"He Kaiming","year":"2016","unstructured":"Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. 2016. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 770\u2013778."},{"key":"e_1_3_2_4_2","first-page":"4700","volume-title":"Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition","author":"Huang Gao","year":"2017","unstructured":"Gao Huang, Zhuang Liu, Laurens Van Der Maaten, and Kilian Q. Weinberger. 2017. Densely connected convolutional networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 4700\u20134708."},{"key":"e_1_3_2_5_2","doi-asserted-by":"crossref","first-page":"21","DOI":"10.1007\/978-3-319-46448-0_2","volume-title":"Computer Vision\u2013ECCV 2016: 14th European Conference, Amsterdam, The Netherlands, October 11\u201314, 2016, Proceedings, Part I 14","author":"Liu Wei","year":"2016","unstructured":"Wei Liu, Dragomir Anguelov, Dumitru Erhan, Christian Szegedy, Scott Reed, Cheng-Yang Fu, and Alexander C. Berg. 2016. SSD: Single shot multibox detector. 
In Computer Vision\u2013ECCV 2016: 14th European Conference, Amsterdam, The Netherlands, October 11\u201314, 2016, Proceedings, Part I 14. Springer, 21\u201337."},{"key":"e_1_3_2_6_2","first-page":"1","volume-title":"Proceedings of the IEEE International Conference on Computer Vision Workshops","author":"Kristan Matej","year":"2015","unstructured":"Matej Kristan, Jiri Matas, Ales Leonardis, Michael Felsberg, Luka Cehovin, Gustavo Fernandez, Tomas Vojir, Gustav Hager, Georg Nebehay, and Roman Pflugfelder. 2015. The Visual Object Tracking VOT2015 challenge results. In Proceedings of the IEEE International Conference on Computer Vision Workshops. 1\u201323."},{"key":"e_1_3_2_7_2","first-page":"280","volume-title":"Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition","author":"Li Yin","year":"2014","unstructured":"Yin Li, Xiaodi Hou, Christof Koch, James M. Rehg, and Alan L. Yuille. 2014. The secrets of salient object segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 280\u2013287."},{"key":"e_1_3_2_8_2","volume-title":"Automatic Speech Recognition","author":"Yu Dong","year":"2016","unstructured":"Dong Yu and Lin Deng. 2016. Automatic Speech Recognition. Vol. 1. Springer."},{"key":"e_1_3_2_9_2","article-title":"Google\u2019s neural machine translation system: Bridging the gap between human and machine translation","author":"Wu Yonghui","year":"2016","unstructured":"Yonghui Wu, Mike Schuster, Zhifeng Chen, Quoc V. Le, Mohammad Norouzi, Wolfgang Macherey, Maxim Krikun, Yuan Cao, Qin Gao, Klaus Macherey, et\u00a0al. 2016. Google\u2019s neural machine translation system: Bridging the gap between human and machine translation. 
arXiv preprint arXiv:1609.08144 (2016).","journal-title":"arXiv preprint arXiv:1609.08144"},{"key":"e_1_3_2_10_2","article-title":"QuAC: Question answering in context","author":"Choi Eunsol","year":"2018","unstructured":"Eunsol Choi, He He, Mohit Iyyer, Mark Yatskar, Wen-tau Yih, Yejin Choi, Percy Liang, and Luke Zettlemoyer. 2018. QuAC: Question answering in context. arXiv preprint arXiv:1808.07036 (2018).","journal-title":"arXiv preprint arXiv:1808.07036"},{"key":"e_1_3_2_11_2","first-page":"1492","volume-title":"Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition","author":"Xie Saining","year":"2017","unstructured":"Saining Xie, Ross Girshick, Piotr Doll\u00e1r, Zhuowen Tu, and Kaiming He. 2017. Aggregated residual transformations for deep neural networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 1492\u20131500."},{"key":"e_1_3_2_12_2","article-title":"Distilling the knowledge in a neural network","author":"Hinton Geoffrey","year":"2015","unstructured":"Geoffrey Hinton, Oriol Vinyals, and Jeff Dean. 2015. Distilling the knowledge in a neural network. arXiv preprint arXiv:1503.02531 (2015).","journal-title":"arXiv preprint arXiv:1503.02531"},{"key":"e_1_3_2_13_2","article-title":"mixup: Beyond empirical risk minimization","author":"Zhang Hongyi","year":"2017","unstructured":"Hongyi Zhang, Moustapha Cisse, Yann N. Dauphin, and David Lopez-Paz. 2017. mixup: Beyond empirical risk minimization. arXiv preprint arXiv:1710.09412 (2017).","journal-title":"arXiv preprint arXiv:1710.09412"},{"key":"e_1_3_2_14_2","first-page":"1","volume-title":"Proceedings of the European Conference on Computer Vision (ECCV) Workshops","author":"Ignatov Andrey","year":"2018","unstructured":"Andrey Ignatov, Radu Timofte, William Chou, Ke Wang, Max Wu, Tim Hartley, and Luc Van Gool. 2018. AI benchmark: Running deep neural networks on Android smartphones. 
In Proceedings of the European Conference on Computer Vision (ECCV) Workshops. 1\u201327."},{"key":"e_1_3_2_15_2","doi-asserted-by":"crossref","first-page":"1853","DOI":"10.1109\/TASLP.2021.3082318","article-title":"Deep learning based real-time speech enhancement for dual-microphone mobile phones","volume":"29","author":"Tan Ke","year":"2021","unstructured":"Ke Tan, Xueliang Zhang, and DeLiang Wang. 2021. Deep learning based real-time speech enhancement for dual-microphone mobile phones. IEEE\/ACM Transactions on Audio, Speech, and Language Processing 29 (2021), 1853\u20131863.","journal-title":"IEEE\/ACM Transactions on Audio, Speech, and Language Processing"},{"key":"e_1_3_2_16_2","doi-asserted-by":"crossref","first-page":"142","DOI":"10.1109\/ISMVL.2017.49","volume-title":"2017 IEEE 47th International Symposium on Multiple-Valued Logic (ISMVL\u201917)","author":"Kisa\u010danin Branislav","year":"2017","unstructured":"Branislav Kisa\u010danin. 2017. Deep learning for autonomous vehicles. In 2017 IEEE 47th International Symposium on Multiple-Valued Logic (ISMVL\u201917). IEEE, 142\u2013142."},{"issue":"15","key":"e_1_3_2_17_2","doi-asserted-by":"crossref","first-page":"4220","DOI":"10.3390\/s20154220","article-title":"Deep learning sensor fusion for autonomous vehicle perception and localization: A review","volume":"20","author":"Fayyad Jamil","year":"2020","unstructured":"Jamil Fayyad, Mohammad A. Jaradat, Dominique Gruyer, and Homayoun Najjaran. 2020. Deep learning sensor fusion for autonomous vehicle perception and localization: A review. Sensors 20, 15 (2020), 4220.","journal-title":"Sensors"},{"issue":"1","key":"e_1_3_2_18_2","doi-asserted-by":"crossref","first-page":"14","DOI":"10.1038\/s41591-018-0320-3","article-title":"A call for deep-learning healthcare","volume":"25","author":"Norgeot Beau","year":"2019","unstructured":"Beau Norgeot, Benjamin S. Glicksberg, and Atul J. Butte. 2019. A call for deep-learning healthcare. 
Nature Medicine 25, 1 (2019), 14\u201315.","journal-title":"Nature Medicine"},{"issue":"1","key":"e_1_3_2_19_2","doi-asserted-by":"crossref","first-page":"24","DOI":"10.1038\/s41591-018-0316-z","article-title":"A guide to deep learning in healthcare","volume":"25","author":"Esteva Andre","year":"2019","unstructured":"Andre Esteva, Alexandre Robicquet, Bharath Ramsundar, Volodymyr Kuleshov, Mark DePristo, Katherine Chou, Claire Cui, Greg Corrado, Sebastian Thrun, and Jeff Dean. 2019. A guide to deep learning in healthcare. Nature Medicine 25, 1 (2019), 24\u201329.","journal-title":"Nature Medicine"},{"key":"e_1_3_2_20_2","first-page":"331","volume-title":"2019 IEEE International Symposium on High Performance Computer Architecture (HPCA\u201919)","author":"Wu Carole-Jean","year":"2019","unstructured":"Carole-Jean Wu, David Brooks, Kevin Chen, Douglas Chen, Sy Choudhury, Marat Dukhan, Kim Hazelwood, Eldad Isaac, Yangqing Jia, Bill Jia, et\u00a0al. 2019. Machine learning at Facebook: Understanding inference at the edge. In 2019 IEEE International Symposium on High Performance Computer Architecture (HPCA\u201919). IEEE, 331\u2013344."},{"key":"e_1_3_2_21_2","doi-asserted-by":"crossref","first-page":"297","DOI":"10.1016\/j.neucom.2021.04.141","article-title":"Bringing AI to edge: From deep learning\u2019s perspective","volume":"485","author":"Liu Di","year":"2022","unstructured":"Di Liu, Hao Kong, Xiangzhong Luo, Weichen Liu, and Ravi Subramaniam. 2022. Bringing AI to edge: From deep learning\u2019s perspective. Neurocomputing 485 (2022), 297\u2013320.","journal-title":"Neurocomputing"},{"key":"e_1_3_2_22_2","volume-title":"International Conference on Learning Representations","author":"Li Hao","year":"2017","unstructured":"Hao Li, Asim Kadav, Igor Durdanovic, Hanan Samet, and Hans Peter Graf. 2017. Pruning filters for efficient ConvNets. 
In International Conference on Learning Representations."},{"key":"e_1_3_2_23_2","volume-title":"International Joint Conference on Artificial Intelligence","author":"He Yang","year":"2018","unstructured":"Yang He, Guoliang Kang, Xuanyi Dong, Yanwei Fu, and Yi Yang. 2018. Soft filter pruning for accelerating deep convolutional neural networks. In International Joint Conference on Artificial Intelligence."},{"key":"e_1_3_2_24_2","first-page":"4340","volume-title":"Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition","author":"He Yang","year":"2019","unstructured":"Yang He, Ping Liu, Ziwei Wang, Zhilan Hu, and Yi Yang. 2019. Filter pruning via geometric median for deep convolutional neural networks acceleration. In Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition. 4340\u20134349."},{"key":"e_1_3_2_25_2","article-title":"BinaryConnect: Training deep neural networks with binary weights during propagations","volume":"28","author":"Courbariaux Matthieu","year":"2015","unstructured":"Matthieu Courbariaux, Yoshua Bengio, and Jean-Pierre David. 2015. BinaryConnect: Training deep neural networks with binary weights during propagations. Advances in Neural Information Processing Systems 28 (2015).","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_2_26_2","article-title":"Binarized neural networks","volume":"29","author":"Hubara Itay","year":"2016","unstructured":"Itay Hubara, Matthieu Courbariaux, Daniel Soudry, Ran El-Yaniv, and Yoshua Bengio. 2016. Binarized neural networks. 
Advances in Neural Information Processing Systems 29 (2016).","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_2_27_2","doi-asserted-by":"crossref","first-page":"525","DOI":"10.1007\/978-3-319-46493-0_32","volume-title":"Computer Vision\u2013ECCV 2016: 14th European Conference, Amsterdam, The Netherlands, October 11\u201314, 2016, Proceedings, Part IV","author":"Rastegari Mohammad","year":"2016","unstructured":"Mohammad Rastegari, Vicente Ordonez, Joseph Redmon, and Ali Farhadi. 2016. XNOR-Net: ImageNet classification using binary convolutional neural networks. In Computer Vision\u2013ECCV 2016: 14th European Conference, Amsterdam, The Netherlands, October 11\u201314, 2016, Proceedings, Part IV. Springer, 525\u2013542."},{"key":"e_1_3_2_28_2","article-title":"Do deep nets really need to be deep?","volume":"27","author":"Ba Jimmy","year":"2014","unstructured":"Jimmy Ba and Rich Caruana. 2014. Do deep nets really need to be deep? Advances in Neural Information Processing Systems 27 (2014).","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_2_29_2","article-title":"FitNets: Hints for thin deep nets","author":"Romero Adriana","year":"2014","unstructured":"Adriana Romero, Nicolas Ballas, Samira Ebrahimi Kahou, Antoine Chassang, Carlo Gatta, and Yoshua Bengio. 2014. FitNets: Hints for thin deep nets. arXiv preprint arXiv:1412.6550 (2014).","journal-title":"arXiv preprint arXiv:1412.6550"},{"key":"e_1_3_2_30_2","article-title":"Learning both weights and connections for efficient neural network","volume":"28","author":"Han Song","year":"2015","unstructured":"Song Han, Jeff Pool, John Tran, and William Dally. 2015. Learning both weights and connections for efficient neural network. 
Advances in Neural Information Processing Systems 28 (2015).","journal-title":"Advances in Neural Information Processing Systems"},{"issue":"4","key":"e_1_3_2_31_2","doi-asserted-by":"crossref","first-page":"828","DOI":"10.1109\/JSTSP.2020.2975987","article-title":"Discriminative layer pruning for convolutional neural networks","volume":"14","author":"Jordao Artur","year":"2020","unstructured":"Artur Jordao, Maiko Lie, and William Robson Schwartz. 2020. Discriminative layer pruning for convolutional neural networks. IEEE Journal of Selected Topics in Signal Processing 14, 4 (2020), 828\u2013837.","journal-title":"IEEE Journal of Selected Topics in Signal Processing"},{"key":"e_1_3_2_32_2","first-page":"7132","volume-title":"Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition","author":"Hu Jie","year":"2018","unstructured":"Jie Hu, Li Shen, and Gang Sun. 2018. Squeeze-and-excitation networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 7132\u20137141."},{"key":"e_1_3_2_33_2","article-title":"MobileNets: Efficient convolutional neural networks for mobile vision applications","author":"Howard Andrew G.","year":"2017","unstructured":"Andrew G. Howard, Menglong Zhu, Bo Chen, Dmitry Kalenichenko, Weijun Wang, Tobias Weyand, Marco Andreetto, and Hartwig Adam. 2017. MobileNets: Efficient convolutional neural networks for mobile vision applications. arXiv preprint arXiv:1704.04861 (2017).","journal-title":"arXiv preprint arXiv:1704.04861"},{"key":"e_1_3_2_34_2","doi-asserted-by":"crossref","first-page":"137","DOI":"10.1007\/978-3-031-19775-8_9","volume-title":"Computer Vision\u2013ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23\u201327, 2022, Proceedings, Part XII","author":"Molchanov Pavlo","year":"2022","unstructured":"Pavlo Molchanov, Jimmy Hall, Hongxu Yin, Jan Kautz, Nicolo Fusi, and Arash Vahdat. 2022. LANA: Latency aware network acceleration. 
In Computer Vision\u2013ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23\u201327, 2022, Proceedings, Part XII. Springer, 137\u2013156."},{"key":"e_1_3_2_35_2","first-page":"116","volume-title":"Proceedings of the European Conference on Computer Vision (ECCV\u201918)","author":"Ma Ningning","year":"2018","unstructured":"Ningning Ma, Xiangyu Zhang, Hai-Tao Zheng, and Jian Sun. 2018. ShuffleNet v2: Practical guidelines for efficient CNN architecture design. In Proceedings of the European Conference on Computer Vision (ECCV\u201918). 116\u2013131."},{"key":"e_1_3_2_36_2","first-page":"6848","volume-title":"Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition","author":"Zhang Xiangyu","year":"2018","unstructured":"Xiangyu Zhang, Xinyu Zhou, Mengxiao Lin, and Jian Sun. 2018. ShuffleNet: An extremely efficient convolutional neural network for mobile devices. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 6848\u20136856."},{"key":"e_1_3_2_37_2","first-page":"1580","volume-title":"Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition","author":"Han Kai","year":"2020","unstructured":"Kai Han, Yunhe Wang, Qi Tian, Jianyuan Guo, Chunjing Xu, and Chang Xu. 2020. GhostNet: More features from cheap operations. In Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition. 1580\u20131589."},{"key":"e_1_3_2_38_2","article-title":"GhostNetV2: Enhance cheap operation with long-range attention","author":"Tang Yehui","year":"2022","unstructured":"Yehui Tang, Kai Han, Jianyuan Guo, Chang Xu, Chao Xu, and Yunhe Wang. 2022. GhostNetV2: Enhance cheap operation with long-range attention. 
arXiv preprint arXiv:2211.12905 (2022).","journal-title":"arXiv preprint arXiv:2211.12905"},{"key":"e_1_3_2_39_2","first-page":"2820","volume-title":"Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition","author":"Tan Mingxing","year":"2019","unstructured":"Mingxing Tan, Bo Chen, Ruoming Pang, Vijay Vasudevan, Mark Sandler, Andrew Howard, and Quoc V. Le. 2019. MnasNet: Platform-aware neural architecture search for mobile. In Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition. 2820\u20132828."},{"key":"e_1_3_2_40_2","first-page":"10734","volume-title":"Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition","author":"Wu Bichen","year":"2019","unstructured":"Bichen Wu, Xiaoliang Dai, Peizhao Zhang, Yanghan Wang, Fei Sun, Yiming Wu, Yuandong Tian, Peter Vajda, Yangqing Jia, and Kurt Keutzer. 2019. FBNet: Hardware-aware efficient ConvNet design via differentiable neural architecture search. In Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition. 10734\u201310742."},{"key":"e_1_3_2_41_2","volume-title":"International Conference on Learning Representations","author":"Cai Han","year":"2019","unstructured":"Han Cai, Ligeng Zhu, and Song Han. 2019. ProxylessNAS: Direct neural architecture search on target task and hardware. In International Conference on Learning Representations."},{"key":"e_1_3_2_42_2","article-title":"Neural architecture search: Insights from 1000 papers","author":"White Colin","year":"2023","unstructured":"Colin White, Mahmoud Safari, Rhea Sukthanker, Binxin Ru, Thomas Elsken, Arber Zela, Debadeepta Dey, and Frank Hutter. 2023. Neural architecture search: Insights from 1000 papers. 
arXiv preprint arXiv:2301.08727 (2023).","journal-title":"arXiv preprint arXiv:2301.08727"},{"key":"e_1_3_2_43_2","volume-title":"International Conference on Learning Representations","author":"Cai Han","year":"2020","unstructured":"Han Cai, Chuang Gan, Tianzhe Wang, Zhekai Zhang, and Song Han. 2020. Once for all: Train one network and specialize it for efficient deployment. In International Conference on Learning Representations."},{"key":"e_1_3_2_44_2","article-title":"A comprehensive survey on hardware-aware neural architecture search","author":"Benmeziane Hadjer","year":"2021","unstructured":"Hadjer Benmeziane, Kaoutar El Maghraoui, Hamza Ouarnoughi, Smail Niar, Martin Wistuba, and Naigang Wang. 2021. A comprehensive survey on hardware-aware neural architecture search. arXiv preprint arXiv:2101.09336 (2021).","journal-title":"arXiv preprint arXiv:2101.09336"},{"key":"e_1_3_2_45_2","first-page":"11285","article-title":"Tinytl: Reduce memory, not parameters for efficient on-device learning","volume":"33","author":"Cai Han","year":"2020","unstructured":"Han Cai, Chuang Gan, Ligeng Zhu, and Song Han. 2020. Tinytl: Reduce memory, not parameters for efficient on-device learning. Advances in Neural Information Processing Systems 33 (2020), 11285\u201311297.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_2_46_2","article-title":"On-device training under 256kb memory","author":"Lin Ji","year":"2022","unstructured":"Ji Lin, Ligeng Zhu, Wei-Ming Chen, Wei-Chen Wang, Chuang Gan, and Song Han. 2022. On-device training under 256kb memory. Advances in Neural Information Processing Systems (2022).","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_2_47_2","article-title":"Three scenarios for continual learning","author":"Ven Gido M. Van de","year":"2019","unstructured":"Gido M. Van de Ven and Andreas S. Tolias. 2019. Three scenarios for continual learning. 
arXiv preprint arXiv:1904.07734 (2019).","journal-title":"arXiv preprint arXiv:1904.07734"},{"key":"e_1_3_2_48_2","article-title":"ZeroFL: Efficient on-device training for federated learning with local sparsity","author":"Qiu Xinchi","year":"2022","unstructured":"Xinchi Qiu, Javier Fernandez-Marques, Pedro P. B. Gusmao, Yan Gao, Titouan Parcollet, and Nicholas Donald Lane. 2022. ZeroFL: Efficient on-device training for federated learning with local sparsity. International Conference on Learning Representations (2022).","journal-title":"International Conference on Learning Representations"},{"key":"e_1_3_2_49_2","first-page":"1877","article-title":"Language models are few-shot learners","volume":"33","author":"Brown Tom","year":"2020","unstructured":"Tom Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared D. Kaplan, Prafulla Dhariwal, Arvind Neelakantan, Pranav Shyam, Girish Sastry, Amanda Askell, et\u00a0al. 2020. Language models are few-shot learners. Advances in Neural Information Processing Systems 33 (2020), 1877\u20131901.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_2_50_2","article-title":"GPT-4 technical report","year":"2023","unstructured":"OpenAI. 2023. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023).","journal-title":"arXiv preprint arXiv:2303.08774"},{"key":"e_1_3_2_51_2","article-title":"Beyond efficiency: A systematic survey of resource-efficient large language models","author":"Bai Guangji","year":"2024","unstructured":"Guangji Bai, Zheng Chai, Chen Ling, Shiyu Wang, Jiaying Lu, Nan Zhang, Tingwei Shi, Ziyang Yu, Mengdan Zhu, Yifei Zhang, et\u00a0al. 2024. Beyond efficiency: A systematic survey of resource-efficient large language models. 
arXiv preprint arXiv:2401.00625 (2024).","journal-title":"arXiv preprint arXiv:2401.00625"},{"issue":"240","key":"e_1_3_2_52_2","first-page":"1","article-title":"PaLM: Scaling language modeling with pathways","volume":"24","author":"Chowdhery Aakanksha","year":"2023","unstructured":"Aakanksha Chowdhery, Sharan Narang, Jacob Devlin, Maarten Bosma, Gaurav Mishra, Adam Roberts, Paul Barham, Hyung Won Chung, Charles Sutton, Sebastian Gehrmann, et\u00a0al. 2023. PaLM: Scaling language modeling with pathways. Journal of Machine Learning Research 24, 240 (2023), 1\u2013113.","journal-title":"Journal of Machine Learning Research"},{"key":"e_1_3_2_53_2","unstructured":"Teven Le Scao Angela Fan Christopher Akiki Ellie Pavlick Suzana Ili\u0107 Daniel Hesslow Roman Castagn\u00e9 Alexandra Sasha Luccioni Fran\u00e7ois Yvon Matthias Gall\u00e9 et\u00a0al. 2023. BLOOM: A 176B-parameter open-access multilingual language model. arXiv preprint arXiv:2211.05100 (2023)."},{"key":"e_1_3_2_54_2","doi-asserted-by":"crossref","first-page":"611","DOI":"10.1145\/3600006.3613165","volume-title":"Proceedings of the 29th Symposium on Operating Systems Principles","author":"Kwon Woosuk","year":"2023","unstructured":"Woosuk Kwon, Zhuohan Li, Siyuan Zhuang, Ying Sheng, Lianmin Zheng, Cody Hao Yu, Joseph Gonzalez, Hao Zhang, and Ion Stoica. 2023. Efficient memory management for large language model serving with PagedAttention. In Proceedings of the 29th Symposium on Operating Systems Principles. 611\u2013626."},{"key":"e_1_3_2_55_2","first-page":"16344","article-title":"FlashAttention: Fast and memory-efficient exact attention with IO-awareness","volume":"35","author":"Dao Tri","year":"2022","unstructured":"Tri Dao, Dan Fu, Stefano Ermon, Atri Rudra, and Christopher R\u00e9. 2022. FlashAttention: Fast and memory-efficient exact attention with IO-awareness. 
Advances in Neural Information Processing Systems 35 (2022), 16344\u201316359.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_2_56_2","article-title":"FlashAttention-2: Faster attention with better parallelism and work partitioning","author":"Dao Tri","year":"2023","unstructured":"Tri Dao. 2023. FlashAttention-2: Faster attention with better parallelism and work partitioning. arXiv preprint arXiv:2307.08691 (2023).","journal-title":"arXiv preprint arXiv:2307.08691"},{"key":"e_1_3_2_57_2","article-title":"Efficient streaming language models with attention sinks","author":"Xiao Guangxuan","year":"2023","unstructured":"Guangxuan Xiao, Yuandong Tian, Beidi Chen, Song Han, and Mike Lewis. 2023. Efficient streaming language models with attention sinks. arXiv preprint arXiv:2309.17453 (2023).","journal-title":"arXiv preprint arXiv:2309.17453"},{"key":"e_1_3_2_58_2","first-page":"21702","article-title":"LLM-Pruner: On the structural pruning of large language models","volume":"36","author":"Ma Xinyin","year":"2023","unstructured":"Xinyin Ma, Gongfan Fang, and Xinchao Wang. 2023. LLM-Pruner: On the structural pruning of large language models. Advances in Neural Information Processing Systems 36 (2023), 21702\u201321720.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_2_59_2","article-title":"A simple and effective pruning approach for large language models","author":"Sun Mingjie","year":"2023","unstructured":"Mingjie Sun, Zhuang Liu, Anna Bair, and J. Zico Kolter. 2023. A simple and effective pruning approach for large language models. arXiv preprint arXiv:2306.11695 (2023).","journal-title":"arXiv preprint arXiv:2306.11695"},{"key":"e_1_3_2_60_2","first-page":"38087","volume-title":"International Conference on Machine Learning","author":"Xiao Guangxuan","year":"2023","unstructured":"Guangxuan Xiao, Ji Lin, Mickael Seznec, Hao Wu, Julien Demouth, and Song Han. 2023. 
SmoothQuant: Accurate and efficient post-training quantization for large language models. In International Conference on Machine Learning. PMLR, 38087\u201338099."},{"key":"e_1_3_2_61_2","article-title":"AWQ: Activation-aware weight quantization for LLM compression and acceleration","author":"Lin Ji","year":"2023","unstructured":"Ji Lin, Jiaming Tang, Haotian Tang, Shang Yang, Xingyu Dang, and Song Han. 2023. AWQ: Activation-aware weight quantization for LLM compression and acceleration. arXiv preprint arXiv:2306.00978 (2023).","journal-title":"arXiv preprint arXiv:2306.00978"},{"key":"e_1_3_2_62_2","article-title":"Self-Instruct: Aligning language models with self-generated instructions","author":"Wang Yizhong","year":"2022","unstructured":"Yizhong Wang, Yeganeh Kordi, Swaroop Mishra, Alisa Liu, Noah A. Smith, Daniel Khashabi, and Hannaneh Hajishirzi. 2022. Self-Instruct: Aligning language models with self-generated instructions. arXiv preprint arXiv:2212.10560 (2022).","journal-title":"arXiv preprint arXiv:2212.10560"},{"key":"e_1_3_2_63_2","volume-title":"12th International Conference on Learning Representations","author":"Gu Yuxian","year":"2023","unstructured":"Yuxian Gu, Li Dong, Furu Wei, and Minlie Huang. 2023. MiniLLM: Knowledge distillation of large language models. In 12th International Conference on Learning Representations."},{"key":"e_1_3_2_64_2","first-page":"31094","volume-title":"International Conference on Machine Learning","author":"Sheng Ying","year":"2023","unstructured":"Ying Sheng, Lianmin Zheng, Binhang Yuan, Zhuohan Li, Max Ryabinin, Beidi Chen, Percy Liang, Christopher R\u00e9, Ion Stoica, and Ce Zhang. 2023. FlexGen: High-throughput generative inference of large language models with a single GPU. In International Conference on Machine Learning. 
PMLR, 31094\u201331116."},{"key":"e_1_3_2_65_2","article-title":"Petals: Collaborative inference and fine-tuning of large models","author":"Borzunov Alexander","year":"2022","unstructured":"Alexander Borzunov, Dmitry Baranchuk, Tim Dettmers, Max Ryabinin, Younes Belkada, Artem Chumachenko, Pavel Samygin, and Colin Raffel. 2022. Petals: Collaborative inference and fine-tuning of large models. arXiv preprint arXiv:2209.01188 (2022).","journal-title":"arXiv preprint arXiv:2209.01188"},{"key":"e_1_3_2_66_2","first-page":"233","volume-title":"Proceedings of the 18th European Conference on Computer Systems","author":"Wang Yiding","year":"2023","unstructured":"Yiding Wang, Kai Chen, Haisheng Tan, and Kun Guo. 2023. Tabi: An efficient multi-level inference system for large language models. In Proceedings of the 18th European Conference on Computer Systems. 233\u2013248."},{"key":"e_1_3_2_67_2","unstructured":"Mart\u00edn Abadi Ashish Agarwal Paul Barham Eugene Brevdo et\u00a0al. 2015. TensorFlow: Large-Scale Machine Learning on Heterogeneous Systems. (2015). https:\/\/www.tensorflow.org\/Software available from tensorflow.org."},{"key":"e_1_3_2_68_2","article-title":"PyTorch: An imperative style, high-performance deep learning library","volume":"32","author":"Paszke Adam","year":"2019","unstructured":"Adam Paszke, Sam Gross, Francisco Massa, Adam Lerer, James Bradbury, Gregory Chanan, Trevor Killeen, Zeming Lin, Natalia Gimelshein, Luca Antiga, et\u00a0al. 2019. PyTorch: An imperative style, high-performance deep learning library. Advances in Neural Information Processing Systems 32 (2019).","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_2_69_2","unstructured":"Google. Google Edge TPU. Retrieved from https:\/\/cloud.google.com\/edge-tpu\/ ([n. d.])."},{"key":"e_1_3_2_70_2","unstructured":"NVIDIA. Nvidia Jetson. Retrieved from https:\/\/www.nvidia.com\/en-sg\/autonomous-machines\/embedded-systems\/ ([n. 
d.])."},{"key":"e_1_3_2_71_2","unstructured":"Intel. Intel Movidius Neural Compute Stick. Retrieved from https:\/\/movidius.github.io\/ncsdk\/ncs.html ([n. d.])."},{"issue":"12","key":"e_1_3_2_72_2","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/3578938","article-title":"Efficient deep learning: A survey on making deep learning models smaller, faster, and better","volume":"55","author":"Menghani Gaurav","year":"2023","unstructured":"Gaurav Menghani. 2023. Efficient deep learning: A survey on making deep learning models smaller, faster, and better. Comput. Surveys 55, 12 (2023), 1\u201337.","journal-title":"Comput. Surveys"},{"key":"e_1_3_2_73_2","article-title":"A survey of model compression and acceleration for deep neural networks","author":"Cheng Yu","year":"2017","unstructured":"Yu Cheng, Duo Wang, Pan Zhou, and Tao Zhang. 2017. A survey of model compression and acceleration for deep neural networks. arXiv preprint arXiv:1710.09282 (2017).","journal-title":"arXiv preprint arXiv:1710.09282"},{"key":"e_1_3_2_74_2","doi-asserted-by":"crossref","first-page":"5113","DOI":"10.1007\/s10462-020-09816-7","article-title":"A comprehensive survey on model compression and acceleration","volume":"53","author":"Choudhary Tejalal","year":"2020","unstructured":"Tejalal Choudhary, Vipul Mishra, Anurag Goswami, and Jagannathan Sarangapani. 2020. A comprehensive survey on model compression and acceleration. Artificial Intelligence Review 53 (2020), 5113\u20135155.","journal-title":"Artificial Intelligence Review"},{"issue":"3","key":"e_1_3_2_75_2","doi-asserted-by":"crossref","first-page":"60","DOI":"10.3390\/computers12030060","article-title":"Model compression for deep neural networks: A survey","volume":"12","author":"Li Zhuo","year":"2023","unstructured":"Zhuo Li, Hengyi Li, and Lin Meng. 2023. Model compression for deep neural networks: A survey. 
Computers 12, 3 (2023), 60.","journal-title":"Computers"},{"issue":"4","key":"e_1_3_2_76_2","doi-asserted-by":"crossref","first-page":"1050","DOI":"10.1007\/s11263-022-01575-y","article-title":"GhostNets on heterogeneous devices via cheap operations","volume":"130","author":"Han Kai","year":"2022","unstructured":"Kai Han, Yunhe Wang, Chang Xu, Jianyuan Guo, Chunjing Xu, Enhua Wu, and Qi Tian. 2022. GhostNets on heterogeneous devices via cheap operations. International Journal of Computer Vision 130, 4 (2022), 1050\u20131069.","journal-title":"International Journal of Computer Vision"},{"key":"e_1_3_2_77_2","first-page":"1097","article-title":"ImageNet classification with deep convolutional neural networks","volume":"25","author":"Krizhevsky Alex","year":"2012","unstructured":"Alex Krizhevsky, Ilya Sutskever, and Geoffrey E. Hinton. 2012. ImageNet classification with deep convolutional neural networks. Advances in Neural Information Processing Systems 25 (2012), 1097\u20131105.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_2_78_2","first-page":"1","volume-title":"Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition","author":"Szegedy Christian","year":"2015","unstructured":"Christian Szegedy, Wei Liu, Yangqing Jia, Pierre Sermanet, Scott Reed, Dragomir Anguelov, Dumitru Erhan, Vincent Vanhoucke, and Andrew Rabinovich. 2015. Going deeper with convolutions. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 1\u20139."},{"key":"e_1_3_2_79_2","first-page":"6105","volume-title":"International Conference on Machine Learning","author":"Tan Mingxing","year":"2019","unstructured":"Mingxing Tan and Quoc Le. 2019. EfficientNet: Rethinking model scaling for convolutional neural networks. In International Conference on Machine Learning. 
PMLR, 6105\u20136114."},{"key":"e_1_3_2_80_2","first-page":"10096","volume-title":"International Conference on Machine Learning","author":"Tan Mingxing","year":"2021","unstructured":"Mingxing Tan and Quoc Le. 2021. EfficientNetV2: Smaller models and faster training. In International Conference on Machine Learning. PMLR, 10096\u201310106."},{"key":"e_1_3_2_81_2","doi-asserted-by":"crossref","first-page":"248","DOI":"10.1109\/CVPR.2009.5206848","volume-title":"2009 IEEE Conference on Computer Vision and Pattern Recognition","author":"Deng Jia","year":"2009","unstructured":"Jia Deng, Wei Dong, Richard Socher, Li-Jia Li, Kai Li, and Li Fei-Fei. 2009. ImageNet: A large-scale hierarchical image database. In 2009 IEEE Conference on Computer Vision and Pattern Recognition. IEEE, 248\u2013255."},{"key":"e_1_3_2_82_2","article-title":"SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and < 0.5 MB model size","author":"Iandola Forrest N.","year":"2016","unstructured":"Forrest N. Iandola, Song Han, Matthew W. Moskewicz, Khalid Ashraf, William J. Dally, and Kurt Keutzer. 2016. SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and < 0.5 MB model size. arXiv preprint arXiv:1602.07360 (2016).","journal-title":"arXiv preprint arXiv:1602.07360"},{"issue":"3","key":"e_1_3_2_83_2","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/3486618","article-title":"Enable deep learning on mobile devices: Methods, systems, and applications","volume":"27","author":"Cai Han","year":"2022","unstructured":"Han Cai, Ji Lin, Yujun Lin, Zhijian Liu, Haotian Tang, Hanrui Wang, Ligeng Zhu, and Song Han. 2022. Enable deep learning on mobile devices: Methods, systems, and applications. 
ACM Transactions on Design Automation of Electronic Systems (TODAES) 27, 3 (2022), 1\u201350.","journal-title":"ACM Transactions on Design Automation of Electronic Systems (TODAES)"},{"key":"e_1_3_2_84_2","article-title":"Multi-scale context aggregation by dilated convolutions","author":"Yu Fisher","year":"2015","unstructured":"Fisher Yu and Vladlen Koltun. 2015. Multi-scale context aggregation by dilated convolutions. arXiv preprint arXiv:1511.07122 (2015).","journal-title":"arXiv preprint arXiv:1511.07122"},{"key":"e_1_3_2_85_2","first-page":"12021","volume-title":"Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition","author":"Chen Jierun","year":"2023","unstructured":"Jierun Chen, Shiu-hong Kao, Hao He, Weipeng Zhuo, Song Wen, Chul-Ho Lee, and S-H Gary Chan. 2023. Run, don\u2019t walk: Chasing higher FLOPS for faster neural networks. In Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition. 12021\u201312031."},{"key":"e_1_3_2_86_2","first-page":"4510","volume-title":"Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition","author":"Sandler Mark","year":"2018","unstructured":"Mark Sandler, Andrew Howard, Menglong Zhu, Andrey Zhmoginov, and Liang-Chieh Chen. 2018. MobileNetV2: Inverted residuals and linear bottlenecks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 4510\u20134520."},{"key":"e_1_3_2_87_2","first-page":"680","volume-title":"Computer Vision\u2013ECCV 2020: 16th European Conference, Glasgow, UK, August 23\u201328, 2020, Proceedings, Part III 16","author":"Zhou Daquan","year":"2020","unstructured":"Daquan Zhou, Qibin Hou, Yunpeng Chen, Jiashi Feng, and Shuicheng Yan. 2020. Rethinking bottleneck structure for efficient mobile network design. In Computer Vision\u2013ECCV 2020: 16th European Conference, Glasgow, UK, August 23\u201328, 2020, Proceedings, Part III 16. 
Springer, 680\u2013697."},{"key":"e_1_3_2_88_2","first-page":"2752","volume-title":"Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition","author":"Huang Gao","year":"2018","unstructured":"Gao Huang, Shichen Liu, Laurens Van der Maaten, and Kilian Q. Weinberger. 2018. CondenseNet: An efficient DenseNet using learned group convolutions. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2752\u20132761."},{"key":"e_1_3_2_89_2","first-page":"3569","volume-title":"Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition","author":"Yang Le","year":"2021","unstructured":"Le Yang, Haojun Jiang, Ruojin Cai, Yulin Wang, Shiji Song, Gao Huang, and Qi Tian. 2021. CondenseNet V2: Sparse feature reactivation for deep networks. In Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition. 3569\u20133578."},{"key":"e_1_3_2_90_2","volume-title":"International Conference on Learning Representations","author":"Han Song","year":"2016","unstructured":"Song Han, Huizi Mao, and William J. Dally. 2016. Deep compression: Compressing deep neural networks with pruning, trained quantization and Huffman coding. In International Conference on Learning Representations."},{"key":"e_1_3_2_91_2","article-title":"Attention is all you need","volume":"30","author":"Vaswani Ashish","year":"2017","unstructured":"Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, \u0141ukasz Kaiser, and Illia Polosukhin. 2017. Attention is all you need. Advances in Neural Information Processing Systems 30 (2017).","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_2_92_2","article-title":"BERT: Pre-training of deep bidirectional transformers for language understanding","author":"Devlin Jacob","year":"2019","unstructured":"Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2019. 
BERT: Pre-training of deep bidirectional transformers for language understanding. NAACL-HLT (2019).","journal-title":"NAACL-HLT"},{"key":"e_1_3_2_93_2","unstructured":"OpenAI. 2020. ChatGPT: A Variant of GPT by OpenAI. Retrieved from https:\/\/openai.com\/ (2020)."},{"key":"e_1_3_2_94_2","doi-asserted-by":"crossref","first-page":"7675","DOI":"10.18653\/v1\/2020.acl-main.686","volume-title":"Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics","author":"Wang Hanrui","year":"2020","unstructured":"Hanrui Wang, Zhanghao Wu, Zhijian Liu, Han Cai, Ligeng Zhu, Chuang Gan, and Song Han. 2020. HAT: Hardware-aware transformers for efficient natural language processing. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics. 7675\u20137688."},{"key":"e_1_3_2_95_2","doi-asserted-by":"crossref","first-page":"4163","DOI":"10.18653\/v1\/2020.findings-emnlp.372","volume-title":"Findings of the Association for Computational Linguistics: EMNLP 2020","author":"Jiao Xiaoqi","year":"2020","unstructured":"Xiaoqi Jiao, Yichun Yin, Lifeng Shang, Xin Jiang, Xiao Chen, Linlin Li, Fang Wang, and Qun Liu. 2020. TinyBERT: Distilling BERT for natural language understanding. In Findings of the Association for Computational Linguistics: EMNLP 2020. 4163\u20134174."},{"key":"e_1_3_2_96_2","first-page":"2158","volume-title":"Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics","author":"Sun Zhiqing","year":"2020","unstructured":"Zhiqing Sun, Hongkun Yu, Xiaodan Song, Renjie Liu, Yiming Yang, and Denny Zhou. 2020. MobileBERT: A compact task-agnostic BERT for resource-limited devices. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics. 
2158\u20132170."},{"key":"e_1_3_2_97_2","article-title":"DistilBERT, a distilled version of BERT: Smaller, faster, cheaper and lighter","author":"Sanh Victor","year":"2019","unstructured":"Victor Sanh, Lysandre Debut, Julien Chaumond, and Thomas Wolf. 2019. DistilBERT, a distilled version of BERT: Smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019).","journal-title":"arXiv preprint arXiv:1910.01108"},{"key":"e_1_3_2_98_2","article-title":"Linformer: Self-attention with linear complexity","author":"Wang Sinong","year":"2020","unstructured":"Sinong Wang, Belinda Z. Li, Madian Khabsa, Han Fang, and Hao Ma. 2020. Linformer: Self-attention with linear complexity. arXiv preprint arXiv:2006.04768 (2020).","journal-title":"arXiv preprint arXiv:2006.04768"},{"key":"e_1_3_2_99_2","article-title":"Reformer: The efficient transformer","author":"Kitaev Nikita","year":"2020","unstructured":"Nikita Kitaev, \u0141ukasz Kaiser, and Anselm Levskaya. 2020. Reformer: The efficient transformer. arXiv preprint arXiv:2001.04451 (2020).","journal-title":"arXiv preprint arXiv:2001.04451"},{"key":"e_1_3_2_100_2","article-title":"No train no gain: Revisiting efficient training algorithms for transformer-based language models","volume":"36","author":"Kaddour Jean","year":"2024","unstructured":"Jean Kaddour, Oscar Key, Piotr Nawrot, Pasquale Minervini, and Matt J. Kusner. 2024. No train no gain: Revisiting efficient training algorithms for transformer-based language models. Advances in Neural Information Processing Systems 36 (2024).","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_2_101_2","article-title":"Learning to grow pretrained models for efficient transformer training","author":"Wang Peihao","year":"2023","unstructured":"Peihao Wang, Rameswar Panda, Lucas Torroba Hennigen, Philip Greengard, Leonid Karlinsky, Rogerio Feris, David Daniel Cox, Zhangyang Wang, and Yoon Kim. 2023. 
Learning to grow pretrained models for efficient transformer training. arXiv preprint arXiv:2303.00980 (2023).","journal-title":"arXiv preprint arXiv:2303.00980"},{"key":"e_1_3_2_102_2","article-title":"Efficient language model training through cross-lingual and progressive transfer learning","author":"Ostendorff Malte","year":"2023","unstructured":"Malte Ostendorff and Georg Rehm. 2023. Efficient language model training through cross-lingual and progressive transfer learning. arXiv preprint arXiv:2301.09626 (2023).","journal-title":"arXiv preprint arXiv:2301.09626"},{"key":"e_1_3_2_103_2","article-title":"Efficiently scaling transformer inference","volume":"5","author":"Pope Reiner","year":"2023","unstructured":"Reiner Pope, Sholto Douglas, Aakanksha Chowdhery, Jacob Devlin, James Bradbury, Jonathan Heek, Kefan Xiao, Shivani Agrawal, and Jeff Dean. 2023. Efficiently scaling transformer inference. Proceedings of Machine Learning and Systems 5 (2023).","journal-title":"Proceedings of Machine Learning and Systems"},{"key":"e_1_3_2_104_2","first-page":"42531","volume-title":"International Conference on Machine Learning","author":"Zhou Yanqi","year":"2023","unstructured":"Yanqi Zhou, Nan Du, Yanping Huang, Daiyi Peng, Chang Lan, Da Huang, Siamak Shakeri, David So, Andrew M. Dai, Yifeng Lu, et\u00a0al. 2023. Brainformers: Trading simplicity for efficiency. In International Conference on Machine Learning. PMLR, 42531\u201342542."},{"key":"e_1_3_2_105_2","article-title":"Towards adaptive prefix tuning for parameter-efficient language model fine-tuning","author":"Zhang Zhen-Ru","year":"2023","unstructured":"Zhen-Ru Zhang, Chuanqi Tan, Haiyang Xu, Chengyu Wang, Jun Huang, and Songfang Huang. 2023. Towards adaptive prefix tuning for parameter-efficient language model fine-tuning. 
arXiv preprint arXiv:2305.15212 (2023).","journal-title":"arXiv preprint arXiv:2305.15212"},{"key":"e_1_3_2_106_2","article-title":"LLaMA-adapter: Efficient fine-tuning of language models with zero-init attention","author":"Zhang Renrui","year":"2023","unstructured":"Renrui Zhang, Jiaming Han, Chris Liu, Peng Gao, Aojun Zhou, Xiangfei Hu, Shilin Yan, Pan Lu, Hongsheng Li, and Yu Qiao. 2023. LLaMA-adapter: Efficient fine-tuning of language models with zero-init attention. arXiv preprint arXiv:2303.16199 (2023).","journal-title":"arXiv preprint arXiv:2303.16199"},{"key":"e_1_3_2_107_2","doi-asserted-by":"crossref","first-page":"213","DOI":"10.1007\/978-3-030-58452-8_13","volume-title":"Computer Vision\u2013ECCV 2020: 16th European Conference, Glasgow, UK, August 23\u201328, 2020, Proceedings, Part I 16","author":"Carion Nicolas","year":"2020","unstructured":"Nicolas Carion, Francisco Massa, Gabriel Synnaeve, Nicolas Usunier, Alexander Kirillov, and Sergey Zagoruyko. 2020. End-to-end object detection with transformers. In Computer Vision\u2013ECCV 2020: 16th European Conference, Glasgow, UK, August 23\u201328, 2020, Proceedings, Part I 16. Springer, 213\u2013229."},{"key":"e_1_3_2_108_2","article-title":"An image is worth 16x16 words: Transformers for image recognition at scale","author":"Dosovitskiy Alexey","year":"2020","unstructured":"Alexey Dosovitskiy, Lucas Beyer, Alexander Kolesnikov, Dirk Weissenborn, Xiaohua Zhai, Thomas Unterthiner, Mostafa Dehghani, Matthias Minderer, Georg Heigold, Sylvain Gelly, et\u00a0al. 2020. An image is worth 16x16 words: Transformers for image recognition at scale. arXiv preprint arXiv:2010.11929 (2020).","journal-title":"arXiv preprint arXiv:2010.11929"},{"key":"e_1_3_2_109_2","first-page":"10012","volume-title":"Proceedings of the IEEE\/CVF International Conference on Computer Vision","author":"Liu Ze","year":"2021","unstructured":"Ze Liu, Yutong Lin, Yue Cao, Han Hu, Yixuan Wei, Zheng Zhang, Stephen Lin, and Baining Guo. 2021. 
Swin Transformer: Hierarchical vision transformer using shifted windows. In Proceedings of the IEEE\/CVF International Conference on Computer Vision. 10012\u201310022."},{"key":"e_1_3_2_110_2","first-page":"12009","volume-title":"Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition","author":"Liu Ze","year":"2022","unstructured":"Ze Liu, Han Hu, Yutong Lin, Zhuliang Yao, Zhenda Xie, Yixuan Wei, Jia Ning, Yue Cao, Zheng Zhang, Li Dong, et\u00a0al. 2022. Swin Transformer V2: Scaling up capacity and resolution. In Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition. 12009\u201312019."},{"key":"e_1_3_2_111_2","first-page":"280","volume-title":"Computer Vision\u2013ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23\u201327, 2022, Proceedings, Part IX","author":"Li Yanghao","year":"2022","unstructured":"Yanghao Li, Hanzi Mao, Ross Girshick, and Kaiming He. 2022. Exploring plain vision transformer backbones for object detection. In Computer Vision\u2013ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23\u201327, 2022, Proceedings, Part IX. Springer, 280\u2013296."},{"key":"e_1_3_2_112_2","first-page":"26183","article-title":"You only look at one sequence: Rethinking transformer in vision through object detection","volume":"34","author":"Fang Yuxin","year":"2021","unstructured":"Yuxin Fang, Bencheng Liao, Xinggang Wang, Jiemin Fang, Jiyang Qi, Rui Wu, Jianwei Niu, and Wenyu Liu. 2021. You only look at one sequence: Rethinking transformer in vision through object detection. 
Advances in Neural Information Processing Systems 34 (2021), 26183\u201326197.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_2_113_2","article-title":"Simple open-vocabulary object detection with vision transformers","author":"Minderer Matthias","year":"2022","unstructured":"Matthias Minderer, Alexey Gritsenko, Austin Stone, Maxim Neumann, Dirk Weissenborn, Alexey Dosovitskiy, Aravindh Mahendran, Anurag Arnab, Mostafa Dehghani, Zhuoran Shen, et\u00a0al. 2022. Simple open-vocabulary object detection with vision transformers. arXiv preprint arXiv:2205.06230 (2022).","journal-title":"arXiv preprint arXiv:2205.06230"},{"key":"e_1_3_2_114_2","first-page":"7262","volume-title":"Proceedings of the IEEE\/CVF International Conference on Computer Vision","author":"Strudel Robin","year":"2021","unstructured":"Robin Strudel, Ricardo Garcia, Ivan Laptev, and Cordelia Schmid. 2021. Segmenter: Transformer for semantic segmentation. In Proceedings of the IEEE\/CVF International Conference on Computer Vision. 7262\u20137272."},{"key":"e_1_3_2_115_2","article-title":"TransUNet: Transformers make strong encoders for medical image segmentation","author":"Chen Jieneng","year":"2021","unstructured":"Jieneng Chen, Yongyi Lu, Qihang Yu, Xiangde Luo, Ehsan Adeli, Yan Wang, Le Lu, Alan L. Yuille, and Yuyin Zhou. 2021. TransUNet: Transformers make strong encoders for medical image segmentation. arXiv preprint arXiv:2102.04306 (2021).","journal-title":"arXiv preprint arXiv:2102.04306"},{"key":"e_1_3_2_116_2","first-page":"12094","volume-title":"Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition","author":"Gu Jiaqi","year":"2022","unstructured":"Jiaqi Gu, Hyoukjun Kwon, Dilin Wang, Wei Ye, Meng Li, Yu-Hsin Chen, Liangzhen Lai, Vikas Chandra, and David Z. Pan. 2022. Multi-scale high-resolution vision transformer for semantic segmentation. In Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition. 
12094\u201312103."},{"key":"e_1_3_2_117_2","article-title":"Segment anything","author":"Kirillov Alexander","year":"2023","unstructured":"Alexander Kirillov, Eric Mintun, Nikhila Ravi, Hanzi Mao, Chloe Rolland, Laura Gustafson, Tete Xiao, Spencer Whitehead, Alexander C. Berg, Wan-Yen Lo, et\u00a0al. 2023. Segment anything. arXiv preprint arXiv:2304.02643 (2023).","journal-title":"arXiv preprint arXiv:2304.02643"},{"key":"e_1_3_2_118_2","first-page":"3202","volume-title":"Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition","author":"Liu Ze","year":"2022","unstructured":"Ze Liu, Jia Ning, Yue Cao, Yixuan Wei, Zheng Zhang, Stephen Lin, and Han Hu. 2022. Video Swin Transformer. In Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition. 3202\u20133211."},{"key":"e_1_3_2_119_2","first-page":"6836","volume-title":"Proceedings of the IEEE\/CVF International Conference on Computer Vision","author":"Arnab Anurag","year":"2021","unstructured":"Anurag Arnab, Mostafa Dehghani, Georg Heigold, Chen Sun, Mario Lu\u010di\u0107, and Cordelia Schmid. 2021. ViViT: A video vision transformer. In Proceedings of the IEEE\/CVF International Conference on Computer Vision. 6836\u20136846."},{"key":"e_1_3_2_120_2","first-page":"3163","volume-title":"Proceedings of the IEEE\/CVF International Conference on Computer Vision","author":"Neimark Daniel","year":"2021","unstructured":"Daniel Neimark, Omri Bar, Maya Zohar, and Dotan Asselmann. 2021. Video transformer network. In Proceedings of the IEEE\/CVF International Conference on Computer Vision. 3163\u20133172."},{"key":"e_1_3_2_121_2","first-page":"12259","volume-title":"Proceedings of the IEEE\/CVF International Conference on Computer Vision","author":"Graham Benjamin","year":"2021","unstructured":"Benjamin Graham, Alaaeldin El-Nouby, Hugo Touvron, Pierre Stock, Armand Joulin, Herv\u00e9 J\u00e9gou, and Matthijs Douze. 2021. 
LeViT: A vision transformer in ConvNet\u2019s clothing for faster inference. In Proceedings of the IEEE\/CVF International Conference on Computer Vision. 12259\u201312269."},{"key":"e_1_3_2_122_2","first-page":"5270","volume-title":"Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition","author":"Chen Yinpeng","year":"2022","unstructured":"Yinpeng Chen, Xiyang Dai, Dongdong Chen, Mengchen Liu, Xiaoyi Dong, Lu Yuan, and Zicheng Liu. 2022. Mobile-former: Bridging MobileNet and Transformer. In Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition. 5270\u20135279."},{"key":"e_1_3_2_123_2","volume-title":"International Conference on Learning Representations","author":"Mehta Sachin","year":"2022","unstructured":"Sachin Mehta and Mohammad Rastegari. 2022. MobileViT: Light-weight, general-purpose, and mobile-friendly vision transformer. In International Conference on Learning Representations."},{"key":"e_1_3_2_124_2","article-title":"Separable self-attention for mobile vision transformers","author":"Mehta Sachin","year":"2022","unstructured":"Sachin Mehta and Mohammad Rastegari. 2022. Separable self-attention for mobile vision transformers. arXiv preprint arXiv:2206.02680 (2022).","journal-title":"arXiv preprint arXiv:2206.02680"},{"key":"e_1_3_2_125_2","article-title":"MobileViTv3: Mobile-friendly vision transformer with simple and effective fusion of local, global and input features","author":"Wadekar Shakti N.","year":"2022","unstructured":"Shakti N. Wadekar and Abhishek Chaurasia. 2022. MobileViTv3: Mobile-friendly vision transformer with simple and effective fusion of local, global and input features. arXiv preprint arXiv:2209.15159 (2022).","journal-title":"arXiv preprint arXiv:2209.15159"},{"key":"e_1_3_2_126_2","article-title":"EfficientViT: Enhanced linear attention for high-resolution low-computation visual recognition","author":"Cai Han","year":"2022","unstructured":"Han Cai, Chuang Gan, and Song Han. 
2022. EfficientViT: Enhanced linear attention for high-resolution low-computation visual recognition. arXiv preprint arXiv:2205.14756 (2022).","journal-title":"arXiv preprint arXiv:2205.14756"},{"key":"e_1_3_2_127_2","first-page":"294","volume-title":"Computer Vision\u2013ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23\u201327, 2022, Proceedings, Part XI","author":"Pan Junting","year":"2022","unstructured":"Junting Pan, Adrian Bulat, Fuwen Tan, Xiatian Zhu, Lukasz Dudziak, Hongsheng Li, Georgios Tzimiropoulos, and Brais Martinez. 2022. EdgeViTs: Competing light-weight CNNs on mobile devices with vision transformers. In Computer Vision\u2013ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23\u201327, 2022, Proceedings, Part XI. Springer, 294\u2013311."},{"key":"e_1_3_2_128_2","doi-asserted-by":"crossref","first-page":"3","DOI":"10.1007\/978-3-031-25082-8_1","volume-title":"Computer Vision\u2013ECCV 2022 Workshops: Tel Aviv, Israel, October 23\u201327, 2022, Proceedings, Part VII","author":"Maaz Muhammad","year":"2023","unstructured":"Muhammad Maaz, Abdelrahman Shaker, Hisham Cholakkal, Salman Khan, Syed Waqas Zamir, Rao Muhammad Anwer, and Fahad Shahbaz Khan. 2023. EdgeNeXt: efficiently amalgamated CNN-transformer architecture for mobile vision applications. In Computer Vision\u2013ECCV 2022 Workshops: Tel Aviv, Israel, October 23\u201327, 2022, Proceedings, Part VII. Springer, 3\u201320."},{"key":"e_1_3_2_129_2","volume-title":"Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition","author":"You Haoran","year":"2023","unstructured":"Haoran You, Yunyang Xiong, Xiaoliang Dai, Bichen Wu, Peizhao Zhang, Haoqi Fan, Peter Vajda, and Yingyan Lin. 2023. Castling-ViT: Compressing self-attention via switching towards linear-angular attention during vision transformer inference. 
In Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition."},{"key":"e_1_3_2_130_2","article-title":"FastViT: A fast hybrid vision transformer using structural reparameterization","author":"Vasu Pavan Kumar Anasosalu","year":"2023","unstructured":"Pavan Kumar Anasosalu Vasu, James Gabriel, Jeff Zhu, Oncel Tuzel, and Anurag Ranjan. 2023. FastViT: A fast hybrid vision transformer using structural reparameterization. arXiv preprint arXiv:2303.14189 (2023).","journal-title":"arXiv preprint arXiv:2303.14189"},{"key":"e_1_3_2_131_2","first-page":"288","volume-title":"2020 IEEE 38th International Conference on Computer Design (ICCD\u201920)","author":"Luo Xiangzhong","year":"2020","unstructured":"Xiangzhong Luo, Di Liu, Hao Kong, and Weichen Liu. 2020. EdgeNAS: Discovering efficient neural architectures for edge systems. In 2020 IEEE 38th International Conference on Computer Design (ICCD\u201920). IEEE, 288\u2013295."},{"key":"e_1_3_2_132_2","first-page":"475","volume-title":"Proceedings of the 59th ACM\/IEEE Design Automation Conference","author":"Luo Xiangzhong","year":"2022","unstructured":"Xiangzhong Luo, Di Liu, Hao Kong, Shuo Huai, Hui Chen, and Weichen Liu. 2022. You only search once: On lightweight differentiable architecture search for resource-constrained embedded platforms. In Proceedings of the 59th ACM\/IEEE Design Automation Conference. 475\u2013480."},{"key":"e_1_3_2_133_2","first-page":"11936","volume-title":"Proceedings of the IEEE\/CVF International Conference on Computer Vision","author":"Heo Byeongho","year":"2021","unstructured":"Byeongho Heo, Sangdoo Yun, Dongyoon Han, Sanghyuk Chun, Junsuk Choe, and Seong Joon Oh. 2021. Rethinking spatial dimensions of vision transformers. In Proceedings of the IEEE\/CVF International Conference on Computer Vision. 
11936\u201311945."},{"key":"e_1_3_2_134_2","first-page":"10347","volume-title":"International Conference on Machine Learning","author":"Touvron Hugo","year":"2021","unstructured":"Hugo Touvron, Matthieu Cord, Matthijs Douze, Francisco Massa, Alexandre Sablayrolles, and Herv\u00e9 J\u00e9gou. 2021. Training data-efficient image transformers & distillation through attention. In International Conference on Machine Learning. PMLR, 10347\u201310357."},{"key":"e_1_3_2_135_2","first-page":"432","volume-title":"Intelligent Systems and Applications: Proceedings of the 2019 Intelligent Systems Conference (IntelliSys) Volume 2","author":"Hu Dichao","year":"2020","unstructured":"Dichao Hu. 2020. An introductory survey on attention mechanisms in NLP problems. In Intelligent Systems and Applications: Proceedings of the 2019 Intelligent Systems Conference (IntelliSys) Volume 2. Springer, 432\u2013448."},{"issue":"3","key":"e_1_3_2_136_2","doi-asserted-by":"crossref","first-page":"331","DOI":"10.1007\/s41095-022-0271-y","article-title":"Attention mechanisms in computer vision: A survey","volume":"8","author":"Guo Meng-Hao","year":"2022","unstructured":"Meng-Hao Guo, Tian-Xing Xu, Jiang-Jiang Liu, Zheng-Ning Liu, Peng-Tao Jiang, Tai-Jiang Mu, Song-Hai Zhang, Ralph R. Martin, Ming-Ming Cheng, and Shi-Min Hu. 2022. Attention mechanisms in computer vision: A survey. Computational Visual Media 8, 3 (2022), 331\u2013368.","journal-title":"Computational Visual Media"},{"key":"e_1_3_2_137_2","doi-asserted-by":"crossref","first-page":"185","DOI":"10.1016\/j.neunet.2020.07.010","article-title":"Towards explainable deep neural networks (xDNN)","volume":"130","author":"Angelov Plamen","year":"2020","unstructured":"Plamen Angelov and Eduardo Soares. 2020. Towards explainable deep neural networks (xDNN). 
Neural Networks 130 (2020), 185\u2013194.","journal-title":"Neural Networks"},{"key":"e_1_3_2_138_2","article-title":"Neural architecture search with reinforcement learning","author":"Zoph Barret","year":"2016","unstructured":"Barret Zoph and Quoc V. Le. 2016. Neural architecture search with reinforcement learning. arXiv preprint arXiv:1611.01578 (2016).","journal-title":"arXiv preprint arXiv:1611.01578"},{"key":"e_1_3_2_139_2","volume-title":"International Conference on Learning Representations","author":"Liu Hanxiao","year":"2019","unstructured":"Hanxiao Liu, Karen Simonyan, and Yiming Yang. 2019. DARTS: Differentiable architecture search. In International Conference on Learning Representations."},{"key":"e_1_3_2_140_2","article-title":"Vision GNN: An image is worth graph of nodes","author":"Han Kai","year":"2022","unstructured":"Kai Han, Yunhe Wang, Jianyuan Guo, Yehui Tang, and Enhua Wu. 2022. Vision GNN: An image is worth graph of nodes. arXiv preprint arXiv:2206.00272 (2022).","journal-title":"arXiv preprint arXiv:2206.00272"},{"key":"e_1_3_2_141_2","article-title":"A survey on multi-modal summarization","author":"Jangra Anubhav","year":"2021","unstructured":"Anubhav Jangra, Sourajit Mukherjee, Adam Jatowt, Sriparna Saha, and Mohammad Hasanuzzaman. 2021. A survey on multi-modal summarization. Comput. Surveys (2021).","journal-title":"Comput. Surveys"},{"key":"e_1_3_2_142_2","article-title":"How to train your ViT? Data, augmentation, and regularization in vision transformers","author":"Steiner Andreas","year":"2021","unstructured":"Andreas Steiner, Alexander Kolesnikov, Xiaohua Zhai, Ross Wightman, Jakob Uszkoreit, and Lucas Beyer. 2021. How to train your ViT? Data, augmentation, and regularization in vision transformers. 
arXiv preprint arXiv:2106.10270 (2021).","journal-title":"arXiv preprint arXiv:2106.10270"},{"key":"e_1_3_2_143_2","volume-title":"International Conference on Learning Representations","author":"Fu Yonggan","year":"2022","unstructured":"Yonggan Fu, Shunyao Zhang, Shang Wu, Cheng Wan, and Yingyan Lin. 2022. Patch-fool: Are vision transformers always robust against adversarial perturbations?. In International Conference on Learning Representations."},{"key":"e_1_3_2_144_2","first-page":"111","volume-title":"Proceedings of the IEEE\/CVF International Conference on Computer Vision","author":"Ye Shaokai","year":"2019","unstructured":"Shaokai Ye, Kaidi Xu, Sijia Liu, Hao Cheng, Jan-Henrik Lambrechts, Huan Zhang, Aojun Zhou, Kaisheng Ma, Yanzhi Wang, and Xue Lin. 2019. Adversarial robustness vs. model compression, or both?. In Proceedings of the IEEE\/CVF International Conference on Computer Vision. 111\u2013120."},{"key":"e_1_3_2_145_2","first-page":"8697","volume-title":"Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition","author":"Zoph Barret","year":"2018","unstructured":"Barret Zoph, Vijay Vasudevan, Jonathon Shlens, and Quoc V. Le. 2018. Learning transferable architectures for scalable image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 8697\u20138710."},{"key":"e_1_3_2_146_2","volume-title":"International Conference on Learning Representations","author":"Zela Arber","year":"2020","unstructured":"Arber Zela, Thomas Elsken, Tonmoy Saikia, Yassine Marrakchi, Thomas Brox, and Frank Hutter. 2020. Understanding and robustifying differentiable architecture search. In International Conference on Learning Representations."},{"key":"e_1_3_2_147_2","volume-title":"International Conference on Learning Representations","author":"Xu Yuhui","year":"2020","unstructured":"Yuhui Xu, Lingxi Xie, Xiaopeng Zhang, Xin Chen, Guo-Jun Qi, Qi Tian, and Hongkai Xiong. 2020. 
PC-DARTS: Partial channel connections for memory-efficient architecture search. In International Conference on Learning Representations."},{"key":"e_1_3_2_148_2","first-page":"1294","volume-title":"Proceedings of the IEEE\/CVF International Conference on Computer Vision","author":"Chen Xin","year":"2019","unstructured":"Xin Chen, Lingxi Xie, Jun Wu, and Qi Tian. 2019. Progressive differentiable architecture search: Bridging the depth gap between search and evaluation. In Proceedings of the IEEE\/CVF International Conference on Computer Vision. 1294\u20131303."},{"key":"e_1_3_2_149_2","article-title":"DARTS+: Improved differentiable architecture search with early stopping","author":"Liang Hanwen","year":"2019","unstructured":"Hanwen Liang, Shifeng Zhang, Jiacheng Sun, Xingqiu He, Weiran Huang, Kechen Zhuang, and Zhenguo Li. 2019. DARTS+: Improved differentiable architecture search with early stopping. arXiv preprint arXiv:1909.06035 (2019).","journal-title":"arXiv preprint arXiv:1909.06035"},{"key":"e_1_3_2_150_2","volume-title":"International Conference on Learning Representations","author":"Chu Xiangxiang","year":"2021","unstructured":"Xiangxiang Chu, Xiaoxing Wang, Bo Zhang, Shun Lu, Xiaolin Wei, and Junchi Yan. 2021. DARTS-: Robustly stepping out of performance collapse without indicators. In International Conference on Learning Representations."},{"key":"e_1_3_2_151_2","first-page":"465","volume-title":"Computer Vision\u2013ECCV 2020: 16th European Conference, Glasgow, UK, August 23\u201328, 2020, Proceedings, Part XV","author":"Chu Xiangxiang","year":"2020","unstructured":"Xiangxiang Chu, Tianbao Zhou, Bo Zhang, and Jixiang Li. 2020. Fair DARTS: Eliminating unfair advantages in differentiable architecture search. In Computer Vision\u2013ECCV 2020: 16th European Conference, Glasgow, UK, August 23\u201328, 2020, Proceedings, Part XV.
Springer, 465\u2013480."},{"key":"e_1_3_2_152_2","first-page":"10864","volume-title":"2022 IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR\u201922)","author":"Ye Peng","year":"2022","unstructured":"Peng Ye, Baopu Li, Yikang Li, Tao Chen, Jiayuan Fan, and Wanli Ouyang. 2022. \\(\\beta\\) -DARTS: Beta-decay regularization for differentiable architecture search. In 2022 IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR\u201922). IEEE, 10864\u201310873."},{"key":"e_1_3_2_153_2","first-page":"418","volume-title":"2021 Design, Automation & Test in Europe Conference & Exhibition (DATE\u201921)","author":"Luo Xiangzhong","year":"2021","unstructured":"Xiangzhong Luo, Di Liu, Shuo Huai, and Weichen Liu. 2021. HSCoNAS: Hardware-software co-design of efficient DNNs via neural architecture search. In 2021 Design, Automation & Test in Europe Conference & Exhibition (DATE\u201921). IEEE, 418\u2013421."},{"key":"e_1_3_2_154_2","first-page":"692","volume-title":"Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition Workshops","author":"Zhang Li Lyna","year":"2020","unstructured":"Li Lyna Zhang, Yuqing Yang, Yuhang Jiang, Wenwu Zhu, and Yunxin Liu. 2020. Fast hardware-aware neural architecture search. In Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition Workshops. 692\u2013693."},{"key":"e_1_3_2_155_2","article-title":"SurgeNAS: A comprehensive surgery on hardware-aware differentiable neural architecture search","author":"Luo Xiangzhong","year":"2022","unstructured":"Xiangzhong Luo, Di Liu, Hao Kong, Shuo Huai, Hui Chen, and Weichen Liu. 2022. SurgeNAS: A comprehensive surgery on hardware-aware differentiable neural architecture search. IEEE Trans. Comput. (2022).","journal-title":"IEEE Trans. 
Comput."},{"issue":"4","key":"e_1_3_2_156_2","doi-asserted-by":"crossref","first-page":"65","DOI":"10.1145\/1498765.1498785","article-title":"Roofline: An insightful visual performance model for multicore architectures","volume":"52","author":"Williams Samuel","year":"2009","unstructured":"Samuel Williams, Andrew Waterman, and David Patterson. 2009. Roofline: An insightful visual performance model for multicore architectures. Commun. ACM 52, 4 (2009), 65\u201376.","journal-title":"Commun. ACM"},{"key":"e_1_3_2_157_2","article-title":"LightNAS: On lightweight and scalable neural architecture search for embedded platforms","author":"Luo Xiangzhong","year":"2022","unstructured":"Xiangzhong Luo, Di Liu, Hao Kong, Shuo Huai, Hui Chen, and Weichen Liu. 2022. LightNAS: On lightweight and scalable neural architecture search for embedded platforms. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (2022).","journal-title":"IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems"},{"key":"e_1_3_2_158_2","first-page":"4780","volume-title":"Proceedings of the AAAI Conference on Artificial Intelligence","author":"Real Esteban","year":"2019","unstructured":"Esteban Real, Alok Aggarwal, Yanping Huang, and Quoc V. Le. 2019. Regularized evolution for image classifier architecture search. In Proceedings of the AAAI Conference on Artificial Intelligence. 4780\u20134789."},{"key":"e_1_3_2_159_2","article-title":"Designing neural network architectures using reinforcement learning","author":"Baker Bowen","year":"2016","unstructured":"Bowen Baker, Otkrist Gupta, Nikhil Naik, and Ramesh Raskar. 2016. Designing neural network architectures using reinforcement learning. 
arXiv preprint arXiv:1611.02167 (2016).","journal-title":"arXiv preprint arXiv:1611.02167"},{"key":"e_1_3_2_160_2","doi-asserted-by":"crossref","first-page":"5","DOI":"10.1007\/978-1-4615-3618-5_2","article-title":"Simple statistical gradient-following algorithms for connectionist reinforcement learning","author":"Williams Ronald J.","year":"1992","unstructured":"Ronald J. Williams. 1992. Simple statistical gradient-following algorithms for connectionist reinforcement learning. Reinforcement Learning (1992), 5\u201332.","journal-title":"Reinforcement Learning"},{"key":"e_1_3_2_161_2","first-page":"4095","volume-title":"International Conference on Machine Learning","author":"Pham Hieu","year":"2018","unstructured":"Hieu Pham, Melody Guan, Barret Zoph, Quoc Le, and Jeff Dean. 2018. Efficient neural architecture search via parameters sharing. In International Conference on Machine Learning. PMLR, 4095\u20134104."},{"key":"e_1_3_2_162_2","first-page":"544","volume-title":"Computer Vision\u2013ECCV 2020: 16th European Conference, Glasgow, UK, August 23\u201328, 2020, Proceedings, Part XVI 16","author":"Guo Zichao","year":"2020","unstructured":"Zichao Guo, Xiangyu Zhang, Haoyuan Mu, Wen Heng, Zechun Liu, Yichen Wei, and Jian Sun. 2020. Single path one-shot neural architecture search with uniform sampling. In Computer Vision\u2013ECCV 2020: 16th European Conference, Glasgow, UK, August 23\u201328, 2020, Proceedings, Part XVI 16. Springer, 544\u2013560."},{"key":"e_1_3_2_163_2","first-page":"1314","volume-title":"Proceedings of the IEEE\/CVF International Conference on Computer Vision","author":"Howard Andrew","year":"2019","unstructured":"Andrew Howard, Mark Sandler, Grace Chu, Liang-Chieh Chen, Bo Chen, Mingxing Tan, Weijun Wang, Yukun Zhu, Ruoming Pang, Vijay Vasudevan, et\u00a0al. 2019. Searching for MobileNetV3. In Proceedings of the IEEE\/CVF International Conference on Computer Vision. 
1314\u20131324."},{"key":"e_1_3_2_164_2","first-page":"14323","volume-title":"Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition","author":"Bender Gabriel","year":"2020","unstructured":"Gabriel Bender, Hanxiao Liu, Bo Chen, Grace Chu, Shuyang Cheng, Pieter-Jan Kindermans, and Quoc V. Le. 2020. Can weight sharing outperform random architecture search? An investigation with TuNAS. In Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition. 14323\u201314332."},{"key":"e_1_3_2_165_2","article-title":"MONAS: Multi-objective neural architecture search using reinforcement learning","author":"Hsu Chi-Hung","year":"2018","unstructured":"Chi-Hung Hsu, Shu-Huan Chang, Jhao-Hong Liang, Hsin-Ping Chou, Chun-Hao Liu, Shih-Chieh Chang, Jia-Yu Pan, Yu-Ting Chen, Wei Wei, and Da-Cheng Juan. 2018. MONAS: Multi-objective neural architecture search using reinforcement learning. arXiv preprint arXiv:1806.10332 (2018).","journal-title":"arXiv preprint arXiv:1806.10332"},{"key":"e_1_3_2_166_2","first-page":"379","volume-title":"ICGA","author":"Miller Geoffrey F.","year":"1989","unstructured":"Geoffrey F. Miller, Peter M. Todd, and Shailesh U. Hegde. 1989. Designing neural networks using genetic algorithms. In ICGA, Vol. 89. 379\u2013384."},{"issue":"1","key":"e_1_3_2_167_2","doi-asserted-by":"crossref","first-page":"54","DOI":"10.1109\/72.265960","article-title":"An evolutionary algorithm that constructs recurrent neural networks","volume":"5","author":"Angeline Peter J.","year":"1994","unstructured":"Peter J. Angeline, Gregory M. Saunders, and Jordan B. Pollack. 1994. An evolutionary algorithm that constructs recurrent neural networks. 
IEEE Transactions on Neural Networks 5, 1 (1994), 54\u201365.","journal-title":"IEEE Transactions on Neural Networks"},{"key":"e_1_3_2_168_2","doi-asserted-by":"crossref","first-page":"47","DOI":"10.1007\/s12065-007-0002-4","article-title":"Neuroevolution: From architectures to learning","volume":"1","author":"Floreano Dario","year":"2008","unstructured":"Dario Floreano, Peter D\u00fcrr, and Claudio Mattiussi. 2008. Neuroevolution: From architectures to learning. Evolutionary Intelligence 1 (2008), 47\u201362.","journal-title":"Evolutionary Intelligence"},{"issue":"2","key":"e_1_3_2_169_2","doi-asserted-by":"crossref","first-page":"99","DOI":"10.1162\/106365602320169811","article-title":"Evolving neural networks through augmenting topologies","volume":"10","author":"Stanley Kenneth O.","year":"2002","unstructured":"Kenneth O. Stanley and Risto Miikkulainen. 2002. Evolving neural networks through augmenting topologies. Evolutionary Computation 10, 2 (2002), 99\u2013127.","journal-title":"Evolutionary Computation"},{"key":"e_1_3_2_170_2","first-page":"550","volume-title":"International Conference on Machine Learning","author":"Bender Gabriel","year":"2018","unstructured":"Gabriel Bender, Pieter-Jan Kindermans, Barret Zoph, Vijay Vasudevan, and Quoc Le. 2018. Understanding and simplifying one-shot architecture search. In International Conference on Machine Learning. PMLR, 550\u2013559."},{"issue":"3","key":"e_1_3_2_171_2","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/2480741.2480752","article-title":"Exploration and exploitation in evolutionary algorithms: A survey","volume":"45","author":"\u010crepin\u0161ek Matej","year":"2013","unstructured":"Matej \u010crepin\u0161ek, Shih-Hsi Liu, and Marjan Mernik. 2013. Exploration and exploitation in evolutionary algorithms: A survey. 
ACM Computing Surveys (CSUR) 45, 3 (2013), 1\u201333.","journal-title":"ACM Computing Surveys (CSUR)"},{"issue":"10","key":"e_1_3_2_172_2","doi-asserted-by":"crossref","first-page":"1108","DOI":"10.1016\/j.infsof.2011.03.008","article-title":"Evolutionary mutation testing","volume":"53","author":"Dom\u00ednguez-Jim\u00e9nez Juan Jos\u00e9","year":"2011","unstructured":"Juan Jos\u00e9 Dom\u00ednguez-Jim\u00e9nez, Antonia Estero-Botaro, Antonio Garc\u00eda-Dom\u00ednguez, and Inmaculada Medina-Bulo. 2011. Evolutionary mutation testing. Information and Software Technology 53, 10 (2011), 1108\u20131123.","journal-title":"Information and Software Technology"},{"key":"e_1_3_2_173_2","doi-asserted-by":"crossref","first-page":"367","DOI":"10.7551\/mitpress\/2887.003.0035","volume-title":"Evolutionary Programming","author":"Spears William M.","year":"1995","unstructured":"William M. Spears, et\u00a0al. 1995. Adapting crossover in evolutionary algorithms. In Evolutionary Programming. 367\u2013384."},{"key":"e_1_3_2_174_2","volume-title":"6th International Conference on Learning Representations 2018","author":"Brock Andrew","year":"2018","unstructured":"Andrew Brock, Theodore Lim, James Millar Ritchie, and Nicholas J. Weston. 2018. SMASH: One-shot model architecture search through hypernetworks. In 6th International Conference on Learning Representations 2018."},{"key":"e_1_3_2_175_2","first-page":"12239","volume-title":"Proceedings of the IEEE\/CVF International Conference on Computer Vision","author":"Chu Xiangxiang","year":"2021","unstructured":"Xiangxiang Chu, Bo Zhang, and Ruijun Xu. 2021. FairNAS: Rethinking evaluation fairness of weight sharing neural architecture search. In Proceedings of the IEEE\/CVF International Conference on Computer Vision.
12239\u201312248."},{"key":"e_1_3_2_176_2","first-page":"702","volume-title":"Computer Vision\u2013ECCV 2020: 16th European Conference, Glasgow, UK, August 23\u201328, 2020, Proceedings, Part VII 16","author":"Yu Jiahui","year":"2020","unstructured":"Jiahui Yu, Pengchong Jin, Hanxiao Liu, Gabriel Bender, Pieter-Jan Kindermans, Mingxing Tan, Thomas Huang, Xiaodan Song, Ruoming Pang, and Quoc Le. 2020. BigNAS: Scaling up neural architecture search with big single-stage models. In Computer Vision\u2013ECCV 2020: 16th European Conference, Glasgow, UK, August 23\u201328, 2020, Proceedings, Part VII 16. Springer, 702\u2013717."},{"issue":"3","key":"e_1_3_2_177_2","first-page":"1","article-title":"One proxy device is enough for hardware-aware neural architecture search","volume":"5","author":"Lu Bingqian","year":"2021","unstructured":"Bingqian Lu, Jianyi Yang, Weiwen Jiang, Yiyu Shi, and Shaolei Ren. 2021. One proxy device is enough for hardware-aware neural architecture search. Proceedings of the ACM on Measurement and Analysis of Computing Systems 5, 3 (2021), 1\u201334.","journal-title":"Proceedings of the ACM on Measurement and Analysis of Computing Systems"},{"key":"e_1_3_2_178_2","first-page":"1999","volume-title":"Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition","author":"You Shan","year":"2020","unstructured":"Shan You, Tao Huang, Mingmin Yang, Fei Wang, Chen Qian, and Changshui Zhang. 2020. GreedyNAS: Towards fast one-shot NAS with greedy supernet. In Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition. 1999\u20132008."},{"issue":"6","key":"e_1_3_2_179_2","first-page":"1799","article-title":"Designing efficient DNNs via hardware-aware neural architecture search and beyond","volume":"41","author":"Luo Xiangzhong","year":"2021","unstructured":"Xiangzhong Luo, Di Liu, Shuo Huai, Hao Kong, Hui Chen, and Weichen Liu. 2021.
Designing efficient DNNs via hardware-aware neural architecture search and beyond. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems 41, 6 (2021), 1799\u20131812.","journal-title":"IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems"},{"key":"e_1_3_2_180_2","volume-title":"International Conference on Learning Representations","author":"Elsken Thomas","year":"2019","unstructured":"Thomas Elsken, Jan Hendrik Metzen, and Frank Hutter. 2019. Efficient multi-objective neural architecture search via Lamarckian evolution. In International Conference on Learning Representations."},{"key":"e_1_3_2_181_2","first-page":"1620","volume-title":"Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition","author":"Li Guohao","year":"2020","unstructured":"Guohao Li, Guocheng Qian, Itzel C. Delgadillo, Matthias Muller, Ali Thabet, and Bernard Ghanem. 2020. SGAS: Sequential greedy architecture search. In Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition. 1620\u20131630."},{"key":"e_1_3_2_182_2","first-page":"6667","volume-title":"Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition","author":"Yang Yibo","year":"2021","unstructured":"Yibo Yang, Shan You, Hongyang Li, Fei Wang, Chen Qian, and Zhouchen Lin. 2021. Towards improving the consistency, efficiency, and flexibility of differentiable neural architecture search. In Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition. 6667\u20136676."},{"key":"e_1_3_2_183_2","volume-title":"International Conference on Learning Representations","author":"Wang Ruochen","year":"2021","unstructured":"Ruochen Wang, Minhao Cheng, Xiangning Chen, Xiaocheng Tang, and Cho-Jui Hsieh. 2021. Rethinking architecture selection in differentiable NAS. 
In International Conference on Learning Representations."},{"key":"e_1_3_2_184_2","volume-title":"International Conference on Learning Representations","author":"Chen Xiangning","year":"2021","unstructured":"Xiangning Chen, Ruochen Wang, Minhao Cheng, Xiaocheng Tang, and Cho-Jui Hsieh. 2021. DrNAS: Dirichlet neural architecture search. In International Conference on Learning Representations."},{"key":"e_1_3_2_185_2","article-title":"GOLD-NAS: Gradual, one-level, differentiable","author":"Bi Kaifeng","year":"2020","unstructured":"Kaifeng Bi, Lingxi Xie, Xin Chen, Longhui Wei, and Qi Tian. 2020. GOLD-NAS: Gradual, one-level, differentiable. arXiv preprint arXiv:2007.03331 (2020).","journal-title":"arXiv preprint arXiv:2007.03331"},{"key":"e_1_3_2_186_2","first-page":"373","volume-title":"Proceedings of the IEEE\/CVF International Conference on Computer Vision","author":"Hou Pengfei","year":"2021","unstructured":"Pengfei Hou, Ying Jin, and Yukang Chen. 2021. Single-DARTS: Towards stable architecture search. In Proceedings of the IEEE\/CVF International Conference on Computer Vision. 373\u2013382."},{"key":"e_1_3_2_187_2","first-page":"1761","volume-title":"Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition","author":"Dong Xuanyi","year":"2019","unstructured":"Xuanyi Dong and Yi Yang. 2019. Searching for a robust neural architecture in four GPU hours. In Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition. 1761\u20131770."},{"key":"e_1_3_2_188_2","volume-title":"International Conference on Learning Representations","author":"Xie Sirui","year":"2019","unstructured":"Sirui Xie, Hehui Zheng, Chunxiao Liu, and Liang Lin. 2019. SNAS: Stochastic neural architecture search. 
In International Conference on Learning Representations."},{"key":"e_1_3_2_189_2","volume-title":"International Conference on Learning Representations","author":"Mei Jieru","year":"2020","unstructured":"Jieru Mei, Yingwei Li, Xiaochen Lian, Xiaojie Jin, Linjie Yang, Alan Yuille, and Jianchao Yang. 2020. AtomNAS: Fine-grained end-to-end neural architecture search. In International Conference on Learning Representations."},{"key":"e_1_3_2_190_2","article-title":"Automated deep learning: Neural architecture search is not the end","author":"Dong Xuanyi","year":"2021","unstructured":"Xuanyi Dong, David Jacob Kedziora, Katarzyna Musial, and Bogdan Gabrys. 2021. Automated deep learning: Neural architecture search is not the end. arXiv preprint arXiv:2112.09245 (2021).","journal-title":"arXiv preprint arXiv:2112.09245"},{"key":"e_1_3_2_191_2","volume-title":"International Conference on Learning Representations","author":"Jang Eric","year":"2017","unstructured":"Eric Jang, Shixiang Gu, and Ben Poole. 2017. Categorical reparameterization with Gumbel-Softmax. In International Conference on Learning Representations."},{"key":"e_1_3_2_192_2","article-title":"Latency-aware differentiable neural architecture search","author":"Xu Yuhui","year":"2020","unstructured":"Yuhui Xu, Lingxi Xie, Xiaopeng Zhang, Xin Chen, Bowen Shi, Qi Tian, and Hongkai Xiong. 2020. Latency-aware differentiable neural architecture search. arXiv preprint arXiv:2001.06392 (2020).","journal-title":"arXiv preprint arXiv:2001.06392"},{"key":"e_1_3_2_193_2","article-title":"LC-NAS: Latency constrained neural architecture search for point cloud networks","author":"Li Guohao","year":"2020","unstructured":"Guohao Li, Mengmeng Xu, Silvio Giancola, Ali Thabet, and Bernard Ghanem. 2020. LC-NAS: Latency constrained neural architecture search for point cloud networks. 
arXiv preprint arXiv:2008.10309 (2020).","journal-title":"arXiv preprint arXiv:2008.10309"},{"key":"e_1_3_2_194_2","doi-asserted-by":"crossref","first-page":"1115","DOI":"10.23919\/DATE54114.2022.9774615","volume-title":"2022 Design, Automation & Test in Europe Conference & Exhibition (DATE\u201922)","author":"Loni Mohammad","year":"2022","unstructured":"Mohammad Loni, Hamid Mousavi, Mohammad Riazati, Masoud Daneshtalab, and Mikael Sj\u00f6din. 2022. TAS: Ternarized neural architecture search for resource-constrained edge devices. In 2022 Design, Automation & Test in Europe Conference & Exhibition (DATE\u201922). IEEE, 1115\u20131118."},{"key":"e_1_3_2_195_2","first-page":"1344","volume-title":"2021 Design, Automation & Test in Europe Conference & Exhibition (DATE\u201921)","author":"Kim Sunghoon","year":"2021","unstructured":"Sunghoon Kim, Hyunjeong Kwon, Eunji Kwon, Youngchang Choi, Tae-Hyun Oh, and Seokhyeong Kang. 2021. MDARTS: Multi-objective differentiable neural architecture search. In 2021 Design, Automation & Test in Europe Conference & Exhibition (DATE\u201921). IEEE, 1344\u20131349."},{"key":"e_1_3_2_196_2","first-page":"123","volume-title":"Computer Vision\u2013ECCV 2020: 16th European Conference, Glasgow, UK, August 23\u201328, 2020, Proceedings, Part XV 16","author":"Hu Yibo","year":"2020","unstructured":"Yibo Hu, Xiang Wu, and Ran He. 2020. TF-NAS: Rethinking three search freedoms of latency-constrained differentiable neural architecture search. In Computer Vision\u2013ECCV 2020: 16th European Conference, Glasgow, UK, August 23\u201328, 2020, Proceedings, Part XV 16. Springer, 123\u2013139."},{"key":"e_1_3_2_197_2","first-page":"12965","volume-title":"Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition","author":"Wan Alvin","year":"2020","unstructured":"Alvin Wan, Xiaoliang Dai, Peizhao Zhang, Zijian He, Yuandong Tian, Saining Xie, Bichen Wu, Matthew Yu, Tao Xu, Kan Chen, et\u00a0al. 2020. 
FBNetV2: Differentiable neural architecture search for spatial and channel dimensions. In Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition. 12965\u201312974."},{"key":"e_1_3_2_198_2","doi-asserted-by":"crossref","first-page":"481","DOI":"10.1007\/978-3-030-46147-8_29","volume-title":"Machine Learning and Knowledge Discovery in Databases: European Conference, ECML PKDD 2019, W\u00fcrzburg, Germany, September 16\u201320, 2019, Proceedings, Part II","author":"Stamoulis Dimitrios","year":"2020","unstructured":"Dimitrios Stamoulis, Ruizhou Ding, Di Wang, Dimitrios Lymberopoulos, Bodhi Priyantha, Jie Liu, and Diana Marculescu. 2020. Single-path NAS: Designing hardware-efficient ConvNets in less than 4 hours. In Machine Learning and Knowledge Discovery in Databases: European Conference, ECML PKDD 2019, W\u00fcrzburg, Germany, September 16\u201320, 2019, Proceedings, Part II. Springer, 481\u2013497."},{"issue":"11","key":"e_1_3_2_199_2","first-page":"4826","article-title":"SNAS: Fast hardware-aware neural architecture search methodology","volume":"41","author":"Lee Jaeseong","year":"2021","unstructured":"Jaeseong Lee, Jungsub Rhim, Duseok Kang, and Soonhoi Ha. 2021. SNAS: Fast hardware-aware neural architecture search methodology. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems 41, 11 (2021), 4826\u20134836.","journal-title":"IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems"},{"key":"e_1_3_2_200_2","first-page":"10628","volume-title":"Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition","author":"Fang Jiemin","year":"2020","unstructured":"Jiemin Fang, Yuzhu Sun, Qian Zhang, Yuan Li, Wenyu Liu, and Xinggang Wang. 2020. Densely connected search space for more flexible neural architecture search. In Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition. 
10628\u201310637."},{"key":"e_1_3_2_201_2","first-page":"7979","volume-title":"International Conference on Machine Learning","author":"Nayman Niv","year":"2021","unstructured":"Niv Nayman, Yonathan Aflalo, Asaf Noy, and Lihi Zelnik. 2021. Hardcore-NAS: Hard constrained differentiable neural architecture search. In International Conference on Machine Learning. PMLR, 7979\u20137990."},{"key":"e_1_3_2_202_2","first-page":"53","volume-title":"International Conference on Machine Learning","author":"Lacoste-Julien Simon","year":"2013","unstructured":"Simon Lacoste-Julien, Martin Jaggi, Mark Schmidt, and Patrick Pletscher. 2013. Block-coordinate Frank-Wolfe optimization for structural SVMs. In International Conference on Machine Learning. PMLR, 53\u201361."},{"key":"e_1_3_2_203_2","first-page":"1","volume-title":"Proceedings of the 61st ACM\/IEEE Design Automation Conference","author":"Luo Xiangzhong","year":"2024","unstructured":"Xiangzhong Luo, Di Liu, Hao Kong, Shuo Huai, and Weichen Liu. 2024. Double-Win NAS: Towards deep-to-shallow transformable neural architecture search for intelligent embedded systems. In Proceedings of the 61st ACM\/IEEE Design Automation Conference. 1\u20136."},{"key":"e_1_3_2_204_2","article-title":"EH-DNAS: End-to-end hardware-aware differentiable neural architecture search","author":"Jiang Qian","year":"2021","unstructured":"Qian Jiang, Xiaofan Zhang, Deming Chen, Minh N. Do, and Raymond A. Yeh. 2021. EH-DNAS: End-to-end hardware-aware differentiable neural architecture search. arXiv preprint arXiv:2111.12299 (2021).","journal-title":"arXiv preprint arXiv:2111.12299"},{"key":"e_1_3_2_205_2","doi-asserted-by":"crossref","first-page":"4704","DOI":"10.1109\/ICPR48806.2021.9412130","volume-title":"2020 25th International Conference on Pattern Recognition (ICPR\u201921)","author":"L\u00f3pez Javier Garc\u00eda","year":"2021","unstructured":"Javier Garc\u00eda L\u00f3pez, Antonio Agudo, and Francesc Moreno-Noguer. 2021. 
E-DNAS: Differentiable neural architecture search for embedded systems. In 2020 25th International Conference on Pattern Recognition (ICPR\u201921). IEEE, 4704\u20134711."},{"key":"e_1_3_2_206_2","article-title":"Evaluating the search phase of neural architecture search","author":"Yu Kaicheng","year":"2019","unstructured":"Kaicheng Yu, Christian Sciuto, Martin Jaggi, Claudiu Musat, and Mathieu Salzmann. 2019. Evaluating the search phase of neural architecture search. arXiv preprint arXiv:1902.08142 (2019).","journal-title":"arXiv preprint arXiv:1902.08142"},{"key":"e_1_3_2_207_2","first-page":"12707","volume-title":"International Conference on Machine Learning","author":"Zhao Yiyang","year":"2021","unstructured":"Yiyang Zhao, Linnan Wang, Yuandong Tian, Rodrigo Fonseca, and Tian Guo. 2021. Few-shot neural architecture search. In International Conference on Machine Learning. PMLR, 12707\u201312718."},{"key":"e_1_3_2_208_2","article-title":"Generalizing few-shot NAS with gradient matching","author":"Hu Shoukang","year":"2022","unstructured":"Shoukang Hu, Ruochen Wang, Lanqing Hong, Zhenguo Li, Cho-Jui Hsieh, and Jiashi Feng. 2022. Generalizing few-shot NAS with gradient matching. arXiv preprint arXiv:2203.15207 (2022).","journal-title":"arXiv preprint arXiv:2203.15207"},{"key":"e_1_3_2_209_2","first-page":"28644","article-title":"Few-shot task-agnostic neural architecture search for distilling large language models","volume":"35","author":"Xu Dongkuan D. K.","year":"2022","unstructured":"Dongkuan D. K. Xu, Subhabrata Mukherjee, Xiaodong Liu, Debadeepta Dey, Wenhui Wang, Xiang Zhang, Ahmed Awadallah, and Jianfeng Gao. 2022. Few-shot task-agnostic neural architecture search for distilling large language models. 
Advances in Neural Information Processing Systems 35 (2022), 28644\u201328656.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_2_210_2","unstructured":"Timot\u00e9e Ly-Manson, Mathieu L\u00e9onardon, and Abdeldjalil Aissa El Bey. 2023. Understanding few-shot neural architecture search with zero-cost proxies. https:\/\/gretsi.fr\/data\/colloque\/pdf\/2023_lymanson1237.pdf (2023)."},{"key":"e_1_3_2_211_2","first-page":"9880","volume-title":"International Conference on Machine Learning","author":"Su Xiu","year":"2021","unstructured":"Xiu Su, Shan You, Mingkai Zheng, Fei Wang, Chen Qian, Changshui Zhang, and Chang Xu. 2021. K-shot NAS: Learnable weight-sharing for NAS with k-shot supernets. In International Conference on Machine Learning. PMLR, 9880\u20139890."},{"key":"e_1_3_2_212_2","first-page":"578","volume-title":"European Conference on Computer Vision","author":"Zhou Zixuan","year":"2022","unstructured":"Zixuan Zhou, Xuefei Ning, Yi Cai, Jiashu Han, Yiping Deng, Yuhan Dong, Huazhong Yang, and Yu Wang. 2022. CLOSE: Curriculum learning on the sharing extent towards better one-shot NAS. In European Conference on Computer Vision. Springer, 578\u2013594."},{"key":"e_1_3_2_213_2","volume-title":"International Conference on Automated Machine Learning","author":"Laube Kevin Alexander","year":"2022","unstructured":"Kevin Alexander Laube, Maximus Mutschler, and Andreas Zell. 2022. What to expect of hardware metric predictors in NAS. In International Conference on Automated Machine Learning. PMLR, 13\u20131."},{"key":"e_1_3_2_214_2","first-page":"10480","article-title":"BRP-NAS: Prediction-based NAS using GCN","volume":"33","author":"Dudziak Lukasz","year":"2020","unstructured":"Lukasz Dudziak, Thomas Chau, Mohamed Abdelfattah, Royson Lee, Hyeji Kim, and Nicholas Lane. 2020. BRP-NAS: Prediction-based NAS using GCN.
Advances in Neural Information Processing Systems 33 (2020), 10480\u201310490.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_2_215_2","volume-title":"International Conference on Learning Representations","author":"Li Chaojian","year":"2021","unstructured":"Chaojian Li, Zhongzhi Yu, Yonggan Fu, Yongan Zhang, Yang Zhao, Haoran You, Qixuan Yu, Yue Wang, Cong Hao, and Yingyan Lin. 2021. HW-NAS-Bench: Hardware-aware neural architecture search benchmark. In International Conference on Learning Representations."},{"key":"e_1_3_2_216_2","first-page":"27016","article-title":"Hardware-adaptive efficient latency prediction for NAS via meta-learning","volume":"34","author":"Lee Hayeon","year":"2021","unstructured":"Hayeon Lee, Sewoong Lee, Song Chong, and Sung Ju Hwang. 2021. Hardware-adaptive efficient latency prediction for NAS via meta-learning. Advances in Neural Information Processing Systems 34 (2021), 27016\u201327028.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_2_217_2","first-page":"3660","volume-title":"Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition","author":"Nair Saeejith","year":"2022","unstructured":"Saeejith Nair, Saad Abbasi, Alexander Wong, and Mohammad Javad Shafiee. 2022. Maple-edge: A runtime latency predictor for edge devices. In Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition. 3660\u20133668."},{"key":"e_1_3_2_218_2","article-title":"EvoLP: Self-evolving latency predictor for model compression in real-time edge systems","author":"Huai Shuo","year":"2023","unstructured":"Shuo Huai, Hao Kong, Shiqing Li, Xiangzhong Luo, Ravi Subramaniam, Christian Makaya, Qian Lin, and Weichen Liu. 2023. EvoLP: Self-evolving latency predictor for model compression in real-time edge systems. 
IEEE Embedded Systems Letters (2023).","journal-title":"IEEE Embedded Systems Letters"},{"key":"e_1_3_2_219_2","doi-asserted-by":"crossref","first-page":"660","DOI":"10.1007\/978-3-030-58526-6_39","volume-title":"Computer Vision\u2013ECCV 2020: 16th European Conference, Glasgow, UK, August 23\u201328, 2020, Proceedings, Part XXIX","author":"Wen Wei","year":"2020","unstructured":"Wei Wen, Hanxiao Liu, Yiran Chen, Hai Li, Gabriel Bender, and Pieter-Jan Kindermans. 2020. Neural predictor for neural architecture search. In Computer Vision\u2013ECCV 2020: 16th European Conference, Glasgow, UK, August 23\u201328, 2020, Proceedings, Part XXIX. Springer, 660\u2013676."},{"key":"e_1_3_2_220_2","article-title":"Accuracy prediction with non-neural model for neural architecture search","author":"Luo Renqian","year":"2020","unstructured":"Renqian Luo, Xu Tan, Rui Wang, Tao Qin, Enhong Chen, and Tie-Yan Liu. 2020. Accuracy prediction with non-neural model for neural architecture search. arXiv preprint arXiv:2007.04785 (2020).","journal-title":"arXiv preprint arXiv:2007.04785"},{"key":"e_1_3_2_221_2","first-page":"28454","article-title":"How powerful are performance predictors in neural architecture search?","volume":"34","author":"White Colin","year":"2021","unstructured":"Colin White, Arber Zela, Robin Ru, Yang Liu, and Frank Hutter. 2021. How powerful are performance predictors in neural architecture search? Advances in Neural Information Processing Systems 34 (2021), 28454\u201328469.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_2_222_2","first-page":"12229","volume-title":"Proceedings of the IEEE\/CVF International Conference on Computer Vision","author":"Moons Bert","year":"2021","unstructured":"Bert Moons, Parham Noorzad, Andrii Skliar, Giovanni Mariani, Dushyant Mehta, Chris Lott, and Tijmen Blankevoort. 2021. Distilling optimal neural networks: Rapid search in diverse spaces. 
In Proceedings of the IEEE\/CVF International Conference on Computer Vision. 12229\u201312238."},{"key":"e_1_3_2_223_2","first-page":"189","volume-title":"European Conference on Computer Vision","author":"Ning Xuefei","year":"2020","unstructured":"Xuefei Ning, Yin Zheng, Tianchen Zhao, Yu Wang, and Huazhong Yang. 2020. A generic graph-based neural architecture encoding scheme for predictor-based NAS. In European Conference on Computer Vision. Springer, 189\u2013204."},{"key":"e_1_3_2_224_2","first-page":"7105","volume-title":"International Conference on Machine Learning","author":"Ying Chris","year":"2019","unstructured":"Chris Ying, Aaron Klein, Eric Christiansen, Esteban Real, Kevin Murphy, and Frank Hutter. 2019. NAS-Bench-101: Towards reproducible neural architecture search. In International Conference on Machine Learning. PMLR, 7105\u20137114."},{"key":"e_1_3_2_225_2","volume-title":"International Conference on Learning Representations","author":"Dong Xuanyi","year":"2020","unstructured":"Xuanyi Dong and Yi Yang. 2020. NAS-Bench-201: Extending the scope of reproducible neural architecture search. In International Conference on Learning Representations."},{"key":"e_1_3_2_226_2","doi-asserted-by":"crossref","first-page":"45736","DOI":"10.1109\/ACCESS.2022.3169897","article-title":"NAS-Bench-NLP: Neural architecture search benchmark for natural language processing","volume":"10","author":"Klyuchnikov Nikita","year":"2022","unstructured":"Nikita Klyuchnikov, Ilya Trofimov, Ekaterina Artemova, Mikhail Salnikov, Maxim Fedorov, Alexander Filippov, and Evgeny Burnaev. 2022. NAS-Bench-NLP: Neural architecture search benchmark for natural language processing. 
IEEE Access 10 (2022), 45736\u201345747.","journal-title":"IEEE Access"},{"key":"e_1_3_2_227_2","article-title":"A generic graph-based neural architecture encoding scheme with multifaceted information","author":"Ning Xuefei","year":"2022","unstructured":"Xuefei Ning, Yin Zheng, Zixuan Zhou, Tianchen Zhao, Huazhong Yang, and Yu Wang. 2022. A generic graph-based neural architecture encoding scheme with multifaceted information. IEEE Transactions on Pattern Analysis and Machine Intelligence (2022).","journal-title":"IEEE Transactions on Pattern Analysis and Machine Intelligence"},{"key":"e_1_3_2_228_2","first-page":"32325","article-title":"TA-GATES: An encoding scheme for neural network architectures","volume":"35","author":"Ning Xuefei","year":"2022","unstructured":"Xuefei Ning, Zixuan Zhou, Junbo Zhao, Tianchen Zhao, Yiping Deng, Changcheng Tang, Shuang Liang, Huazhong Yang, and Yu Wang. 2022. TA-GATES: An encoding scheme for neural network architectures. Advances in Neural Information Processing Systems 35 (2022), 32325\u201332339.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_2_229_2","first-page":"10514","volume-title":"International Conference on Machine Learning","author":"Xiong Huan","year":"2020","unstructured":"Huan Xiong, Lei Huang, Mengyang Yu, Li Liu, Fan Zhu, and Ling Shao. 2020. On the number of linear regions of convolutional neural networks. In International Conference on Machine Learning. PMLR, 10514\u201310523."},{"key":"e_1_3_2_230_2","first-page":"10462","volume-title":"International Conference on Machine Learning","author":"Xiao Lechao","year":"2020","unstructured":"Lechao Xiao, Jeffrey Pennington, and Samuel Schoenholz. 2020. Disentangling trainability and generalization in deep neural networks. In International Conference on Machine Learning. 
PMLR, 10462\u201310472."},{"key":"e_1_3_2_231_2","volume-title":"International Conference on Learning Representations","author":"Chen Wuyang","year":"2021","unstructured":"Wuyang Chen, Xinyu Gong, and Zhangyang Wang. 2021. Neural architecture search on ImageNet in four GPU hours: A theoretically inspired perspective. In International Conference on Learning Representations."},{"key":"e_1_3_2_232_2","volume-title":"24th International Joint Conference on Artificial Intelligence","author":"Domhan Tobias","year":"2015","unstructured":"Tobias Domhan, Jost Tobias Springenberg, and Frank Hutter. 2015. Speeding up automatic hyperparameter optimization of deep neural networks by extrapolation of learning curves. In 24th International Joint Conference on Artificial Intelligence."},{"key":"e_1_3_2_233_2","first-page":"4079","article-title":"Speedy performance estimation for neural architecture search","volume":"34","author":"Ru Robin","year":"2021","unstructured":"Robin Ru, Clare Lyle, Lisa Schut, Miroslav Fil, Mark van der Wilk, and Yarin Gal. 2021. Speedy performance estimation for neural architecture search. Advances in Neural Information Processing Systems 34 (2021), 4079\u20134092.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_2_234_2","first-page":"22534","article-title":"NAS-Bench-x11 and the power of learning curves","volume":"34","author":"Yan Shen","year":"2021","unstructured":"Shen Yan, Colin White, Yash Savani, and Frank Hutter. 2021. NAS-Bench-x11 and the power of learning curves. Advances in Neural Information Processing Systems 34 (2021), 22534\u201322549.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_2_235_2","doi-asserted-by":"crossref","first-page":"715","DOI":"10.1109\/IPDPSW55747.2022.00123","volume-title":"2022 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW\u201922)","author":"Zhao Dan","year":"2022","unstructured":"Dan Zhao, Nathan C. 
Frey, Vijay Gadepally, and Siddharth Samsi. 2022. Loss curve approximations for fast neural architecture ranking training elasticity estimation. In 2022 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW\u201922). IEEE, 715\u2013723."},{"key":"e_1_3_2_236_2","volume-title":"International Conference on Learning Representations","author":"Klein Aaron","year":"2017","unstructured":"Aaron Klein, Stefan Falkner, Jost Tobias Springenberg, and Frank Hutter. 2017. Learning curve prediction with Bayesian neural networks. In International Conference on Learning Representations."},{"key":"e_1_3_2_237_2","article-title":"Accelerating neural architecture search using performance prediction","author":"Baker Bowen","year":"2017","unstructured":"Bowen Baker, Otkrist Gupta, Ramesh Raskar, and Nikhil Naik. 2017. Accelerating neural architecture search using performance prediction. arXiv preprint arXiv:1705.10823 (2017).","journal-title":"arXiv preprint arXiv:1705.10823"},{"key":"e_1_3_2_238_2","first-page":"1","volume-title":"2022 International Conference on Hardware\/Software Codesign and System Synthesis (CODES+ ISSS\u201922)","author":"Luo Xiangzhong","year":"2022","unstructured":"Xiangzhong Luo, Di Liu, Hao Kong, Shuo Huai, Hui Chen, and Weichen Liu. 2022. Work-in-progress: What to expect of early training statistics? An investigation on hardware-aware neural architecture search. In 2022 International Conference on Hardware\/Software Codesign and System Synthesis (CODES+ ISSS\u201922). IEEE, 1\u20132."},{"key":"e_1_3_2_239_2","article-title":"Zero-cost proxies for lightweight NAS","author":"Abdelfattah Mohamed S.","year":"2021","unstructured":"Mohamed S. Abdelfattah, Abhinav Mehrotra, \u0141ukasz Dudziak, and Nicholas D. Lane. 2021. Zero-cost proxies for lightweight NAS. 
arXiv preprint arXiv:2101.08134 (2021).","journal-title":"arXiv preprint arXiv:2101.08134"},{"key":"e_1_3_2_240_2","article-title":"NAS-Bench-Suite-Zero: Accelerating research on zero cost proxies","author":"Krishnakumar Arjun","year":"2022","unstructured":"Arjun Krishnakumar, Colin White, Arber Zela, Renbo Tu, Mahmoud Safari, and Frank Hutter. 2022. NAS-Bench-Suite-Zero: Accelerating research on zero cost proxies. arXiv preprint arXiv:2210.03230 (2022).","journal-title":"arXiv preprint arXiv:2210.03230"},{"key":"e_1_3_2_241_2","doi-asserted-by":"crossref","first-page":"552","DOI":"10.1007\/978-3-030-86383-8_44","volume-title":"Artificial Neural Networks and Machine Learning\u2013ICANN 2021: 30th International Conference on Artificial Neural Networks, Bratislava, Slovakia, September 14\u201317, 2021, Proceedings, Part V","author":"Lopes Vasco","year":"2021","unstructured":"Vasco Lopes, Saeid Alirezazadeh, and Lu\u00eds A. Alexandre. 2021. EPE-NAS: Efficient performance estimation without training for neural architecture search. In Artificial Neural Networks and Machine Learning\u2013ICANN 2021: 30th International Conference on Artificial Neural Networks, Bratislava, Slovakia, September 14\u201317, 2021, Proceedings, Part V. Springer, 552\u2013563."},{"key":"e_1_3_2_242_2","article-title":"BlockSwap: Fisher-guided block substitution for network compression on a budget","author":"Turner Jack","year":"2019","unstructured":"Jack Turner, Elliot J. Crowley, Michael O\u2019Boyle, Amos Storkey, and Gavin Gray. 2019. BlockSwap: Fisher-guided block substitution for network compression on a budget. arXiv preprint arXiv:1906.04113 (2019).","journal-title":"arXiv preprint arXiv:1906.04113"},{"key":"e_1_3_2_243_2","article-title":"Picking winning tickets before training by preserving gradient flow","author":"Wang Chaoqi","year":"2020","unstructured":"Chaoqi Wang, Guodong Zhang, and Roger Grosse. 2020. Picking winning tickets before training by preserving gradient flow. 
arXiv preprint arXiv:2002.07376 (2020).","journal-title":"arXiv preprint arXiv:2002.07376"},{"key":"e_1_3_2_244_2","first-page":"7588","volume-title":"International Conference on Machine Learning","author":"Mellor Joe","year":"2021","unstructured":"Joe Mellor, Jack Turner, Amos Storkey, and Elliot J. Crowley. 2021. Neural architecture search without training. In International Conference on Machine Learning. PMLR, 7588\u20137598."},{"key":"e_1_3_2_245_2","article-title":"Snip: Single-shot network pruning based on connection sensitivity","author":"Lee Namhoon","year":"2018","unstructured":"Namhoon Lee, Thalaiyasingam Ajanthan, and Philip H. S. Torr. 2018. Snip: Single-shot network pruning based on connection sensitivity. arXiv preprint arXiv:1810.02340 (2018).","journal-title":"arXiv preprint arXiv:1810.02340"},{"key":"e_1_3_2_246_2","first-page":"6377","article-title":"Pruning neural networks without any data by iteratively conserving synaptic flow","volume":"33","author":"Tanaka Hidenori","year":"2020","unstructured":"Hidenori Tanaka, Daniel Kunin, Daniel L. Yamins, and Surya Ganguli. 2020. Pruning neural networks without any data by iteratively conserving synaptic flow. Advances in Neural Information Processing Systems 33 (2020), 6377\u20136389.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_2_247_2","first-page":"347","volume-title":"Proceedings of the IEEE\/CVF International Conference on Computer Vision","author":"Lin Ming","year":"2021","unstructured":"Ming Lin, Pichao Wang, Zhenhong Sun, Hesen Chen, Xiuyu Sun, Qi Qian, Hao Li, and Rong Jin. 2021. ZEN-NAS: A zero-shot NAS for high-performance image recognition. In Proceedings of the IEEE\/CVF International Conference on Computer Vision. 
347\u2013356."},{"key":"e_1_3_2_248_2","first-page":"30459","article-title":"EZNAS: Evolving zero-cost proxies for neural architecture scoring","volume":"35","author":"Akhauri Yash","year":"2022","unstructured":"Yash Akhauri, Juan Munoz, Nilesh Jain, and Ravishankar Iyer. 2022. EZNAS: Evolving zero-cost proxies for neural architecture scoring. Advances in Neural Information Processing Systems 35 (2022), 30459\u201330470.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_2_249_2","first-page":"12270","volume-title":"Proceedings of the IEEE\/CVF International Conference on Computer Vision","author":"Chen Minghao","year":"2021","unstructured":"Minghao Chen, Houwen Peng, Jianlong Fu, and Haibin Ling. 2021. AutoFormer: Searching transformers for visual recognition. In Proceedings of the IEEE\/CVF International Conference on Computer Vision. 12270\u201312280."},{"key":"e_1_3_2_250_2","first-page":"10663","volume-title":"Proceedings of the AAAI Conference on Artificial Intelligence","author":"Gao Jiahui","year":"2022","unstructured":"Jiahui Gao, Hang Xu, Han Shi, Xiaozhe Ren, L. H. Philip, Xiaodan Liang, Xin Jiang, and Zhenguo Li. 2022. AutoBERT-Zero: Evolving BERT backbone from scratch. In Proceedings of the AAAI Conference on Artificial Intelligence. 10663\u201310671."},{"key":"e_1_3_2_251_2","article-title":"Primer: Searching for efficient transformers for language modeling","author":"So David R.","year":"2021","unstructured":"David R. So, Wojciech Ma\u0144ke, Hanxiao Liu, Zihang Dai, Noam Shazeer, and Quoc V. Le. 2021. Primer: Searching for efficient transformers for language modeling. 
arXiv preprint arXiv:2109.08668 (2021).","journal-title":"arXiv preprint arXiv:2109.08668"},{"key":"e_1_3_2_252_2","first-page":"5146","volume-title":"Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)","author":"Yin Yichun","year":"2021","unstructured":"Yichun Yin, Cheng Chen, Lifeng Shang, Xin Jiang, Xiao Chen, and Qun Liu. 2021. AutoTinyBERT: Automatic hyper-parameter optimization for efficient pre-trained language models. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers). 5146\u20135157."},{"key":"e_1_3_2_253_2","first-page":"1933","volume-title":"Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining","author":"Xu Jin","year":"2021","unstructured":"Jin Xu, Xu Tan, Renqian Luo, Kaitao Song, Jian Li, Tao Qin, and Tie-Yan Liu. 2021. NAS-BERT: Task-agnostic and adaptive-size BERT compression with neural architecture search. In Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining. 1933\u20131943."},{"key":"e_1_3_2_254_2","first-page":"5699","volume-title":"ICASSP 2021-2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP\u201921)","author":"Luo Renqian","year":"2021","unstructured":"Renqian Luo, Xu Tan, Rui Wang, Tao Qin, Jinzhu Li, Sheng Zhao, Enhong Chen, and Tie-Yan Liu. 2021. Lightspeech: Lightweight and fast text to speech with neural architecture search. In ICASSP 2021-2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP\u201921). IEEE, 5699\u20135703."},{"key":"e_1_3_2_255_2","first-page":"1788","volume-title":"INTERSPEECH","author":"Kim Jihwan","year":"2020","unstructured":"Jihwan Kim, Jisung Wang, Sangki Kim, and Yeha Lee. 2020. 
Evolved speech-transformer: Applying neural architecture search to end-to-end automatic speech recognition. In INTERSPEECH. 1788\u20131792."},{"key":"e_1_3_2_256_2","article-title":"\\(\\alpha\\) NAS: Neural architecture search using property guided synthesis","author":"Jin Charles","year":"2022","unstructured":"Charles Jin, Phitchaya Mangpo Phothilimthana, and Sudip Roy. 2022. \\(\\alpha\\) NAS: Neural architecture search using property guided synthesis. arXiv preprint arXiv:2205.03960 (2022).","journal-title":"arXiv preprint arXiv:2205.03960"},{"key":"e_1_3_2_257_2","first-page":"12","volume-title":"Proceedings of the IEEE\/CVF International Conference on Computer Vision","author":"Chen Boyu","year":"2021","unstructured":"Boyu Chen, Peixia Li, Chuming Li, Baopu Li, Lei Bai, Chen Lin, Ming Sun, Junjie Yan, and Wanli Ouyang. 2021. GLiT: Neural architecture search for global and local image transformer. In Proceedings of the IEEE\/CVF International Conference on Computer Vision. 12\u201321."},{"key":"e_1_3_2_258_2","volume-title":"International Conference on Learning Representations","author":"Gong Chengyue","year":"2021","unstructured":"Chengyue Gong, Dilin Wang, Meng Li, Xinlei Chen, Zhicheng Yan, Yuandong Tian, Vikas Chandra, et\u00a0al. 2021. NASViT: Neural architecture search for efficient vision transformers with gradient conflict aware supernet training. In International Conference on Learning Representations."},{"key":"e_1_3_2_259_2","article-title":"NAT: Neural architecture transformer for accurate and compact architectures","volume":"32","author":"Guo Yong","year":"2019","unstructured":"Yong Guo, Yin Zheng, Mingkui Tan, Qi Chen, Jian Chen, Peilin Zhao, and Junzhou Huang. 2019. NAT: Neural architecture transformer for accurate and compact architectures. 
Advances in Neural Information Processing Systems 32 (2019).","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_2_260_2","first-page":"2982","volume-title":"Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition","author":"Ding Mingyu","year":"2021","unstructured":"Mingyu Ding, Xiaochen Lian, Linjie Yang, Peng Wang, Xiaojie Jin, Zhiwu Lu, and Ping Luo. 2021. HR-NAS: Searching efficient high-resolution neural architectures with lightweight transformers. In Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition. 2982\u20132992."},{"key":"e_1_3_2_261_2","first-page":"139","volume-title":"Computer Vision\u2013ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23\u201327, 2022, Proceedings, Part XXI","author":"Su Xiu","year":"2022","unstructured":"Xiu Su, Shan You, Jiyang Xie, Mingkai Zheng, Fei Wang, Chen Qian, Changshui Zhang, Xiaogang Wang, and Chang Xu. 2022. ViTAS: Vision transformer architecture search. In Computer Vision\u2013ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23\u201327, 2022, Proceedings, Part XXI. Springer, 139\u2013157."},{"issue":"7","key":"e_1_3_2_262_2","first-page":"3634","article-title":"NATS-bench: Benchmarking NAS algorithms for architecture topology and size","volume":"44","author":"Dong Xuanyi","year":"2021","unstructured":"Xuanyi Dong, Lu Liu, Katarzyna Musial, and Bogdan Gabrys. 2021. NATS-bench: Benchmarking NAS algorithms for architecture topology and size. IEEE Transactions on Pattern Analysis and Machine Intelligence 44, 7 (2021), 3634\u20133646.","journal-title":"IEEE Transactions on Pattern Analysis and Machine Intelligence"},{"key":"e_1_3_2_263_2","article-title":"NAS-Bench-301 and the case for surrogate benchmarks for neural architecture search","author":"Siems Julien","year":"2020","unstructured":"Julien Siems, Lucas Zimmer, Arber Zela, Jovita Lukasik, Margret Keuper, and Frank Hutter. 2020. 
NAS-Bench-301 and the case for surrogate benchmarks for neural architecture search. arXiv preprint arXiv:2008.09777 (2020).","journal-title":"arXiv preprint arXiv:2008.09777"},{"key":"e_1_3_2_264_2","first-page":"12380","article-title":"NAS-Bench-360: Benchmarking neural architecture search on diverse tasks","volume":"35","author":"Tu Renbo","year":"2022","unstructured":"Renbo Tu, Nicholas Roberts, Misha Khodak, Junhong Shen, Frederic Sala, and Ameet Talwalkar. 2022. NAS-Bench-360: Benchmarking neural architecture search on diverse tasks. Advances in Neural Information Processing Systems 35 (2022), 12380\u201312394.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_2_265_2","volume-title":"International Conference on Learning Representations","author":"Zela Arber","year":"2020","unstructured":"Arber Zela, Julien Siems, and Frank Hutter. 2020. NAS-Bench-1shot1: Benchmarking and dissecting one-shot neural architecture search. In International Conference on Learning Representations."},{"key":"e_1_3_2_266_2","volume-title":"International Conference on Learning Representations","author":"Mehrotra Abhinav","year":"2021","unstructured":"Abhinav Mehrotra, Alberto Gil C. P. Ramos, Sourav Bhattacharya, \u0141ukasz Dudziak, Ravichander Vipperla, Thomas Chau, Mohamed S. Abdelfattah, Samin Ishtiaq, and Nicholas Donald Lane. 2021. NAS-Bench-ASR: Reproducible neural architecture search for speech recognition. In International Conference on Learning Representations."},{"key":"e_1_3_2_267_2","volume-title":"36th Conference on Neural Information Processing Systems Datasets and Benchmarks Track","author":"Qin Yijian","year":"2022","unstructured":"Yijian Qin, Ziwei Zhang, Xin Wang, Zeyang Zhang, and Wenwu Zhu. 2022. NAS-Bench-Graph: Benchmarking Graph Neural Architecture Search. 
In 36th Conference on Neural Information Processing Systems Datasets and Benchmarks Track."},{"key":"e_1_3_2_268_2","volume-title":"International Conference on Learning Representations","author":"Mehta Yash","year":"2022","unstructured":"Yash Mehta, Colin White, Arber Zela, Arjun Krishnakumar, Guri Zabergja, Shakiba Moradian, Mahmoud Safari, Kaicheng Yu, and Frank Hutter. 2022. NAS-Bench-Suite: NAS evaluation is (now) surprisingly easy. In International Conference on Learning Representations."},{"key":"e_1_3_2_269_2","first-page":"3825","volume-title":"Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition","author":"Xiong Yunyang","year":"2021","unstructured":"Yunyang Xiong, Hanxiao Liu, Suyog Gupta, Berkin Akin, Gabriel Bender, Yongzhe Wang, Pieter-Jan Kindermans, Mingxing Tan, Vikas Singh, and Bo Chen. 2021. MobileDets: Searching for object detection architectures for mobile accelerators. In Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition. 3825\u20133834."},{"key":"e_1_3_2_270_2","first-page":"11943","volume-title":"Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition","author":"Wang Ning","year":"2020","unstructured":"Ning Wang, Yang Gao, Hao Chen, Peng Wang, Zhi Tian, Chunhua Shen, and Yanning Zhang. 2020. NAS-FCOS: Fast neural architecture search for object detection. In Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition. 11943\u201311951."},{"key":"e_1_3_2_271_2","first-page":"7036","volume-title":"Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition","author":"Ghiasi Golnaz","year":"2019","unstructured":"Golnaz Ghiasi, Tsung-Yi Lin, and Quoc V. Le. 2019. NAS-FPN: Learning scalable feature pyramid architecture for object detection. In Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition. 
7036\u20137045."},{"key":"e_1_3_2_272_2","first-page":"82","volume-title":"Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition","author":"Liu Chenxi","year":"2019","unstructured":"Chenxi Liu, Liang-Chieh Chen, Florian Schroff, Hartwig Adam, Wei Hua, Alan L. Yuille, and Li Fei-Fei. 2019. Auto-DeepLab: Hierarchical neural architecture search for semantic image segmentation. In Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition. 82\u201392."},{"key":"e_1_3_2_273_2","first-page":"1","volume-title":"Proceedings of the IEEE\/CVF International Conference on Computer Vision Workshops","author":"Shaw Albert","year":"2019","unstructured":"Albert Shaw, Daniel Hunter, Forrest Iandola, and Sammy Sidhu. 2019. SqueezeNAS: Fast neural architecture search for faster semantic segmentation. In Proceedings of the IEEE\/CVF International Conference on Computer Vision Workshops. 1\u201311."},{"key":"e_1_3_2_274_2","first-page":"13956","volume-title":"Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition","author":"Zhang Xiong","year":"2021","unstructured":"Xiong Zhang, Hongmin Xu, Hong Mo, Jianchao Tan, Cheng Yang, Lei Wang, and Wenqi Ren. 2021. DCNAS: Densely connected neural architecture search for semantic image segmentation. In Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition. 13956\u201313967."},{"key":"e_1_3_2_275_2","first-page":"158","volume-title":"Computer Vision\u2013ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23\u201327, 2022, Proceedings, Part XXI","author":"Liu Chenxi","year":"2022","unstructured":"Chenxi Liu, Zhaoqi Leng, Pei Sun, Shuyang Cheng, Charles R. Qi, Yin Zhou, Mingxing Tan, and Dragomir Anguelov. 2022. LidarNAS: Unifying and Searching Neural Architectures for 3D Point Clouds. In Computer Vision\u2013ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23\u201327, 2022, Proceedings, Part XXI. 
Springer, 158\u2013175."},{"issue":"11","key":"e_1_3_2_276_2","first-page":"8552","article-title":"PVNAS: 3D neural architecture search with point-voxel convolution","volume":"44","author":"Liu Zhijian","year":"2021","unstructured":"Zhijian Liu, Haotian Tang, Shengyu Zhao, Kevin Shao, and Song Han. 2021. PVNAS: 3D neural architecture search with point-voxel convolution. IEEE Transactions on Pattern Analysis and Machine Intelligence 44, 11 (2021), 8552\u20138568.","journal-title":"IEEE Transactions on Pattern Analysis and Machine Intelligence"},{"key":"e_1_3_2_277_2","doi-asserted-by":"crossref","first-page":"685","DOI":"10.1007\/978-3-030-58604-1_41","volume-title":"Computer Vision\u2013ECCV 2020: 16th European Conference, Glasgow, UK, August 23\u201328, 2020, Proceedings, Part XXVIII","author":"Tang Haotian","year":"2020","unstructured":"Haotian Tang, Zhijian Liu, Shengyu Zhao, Yujun Lin, Ji Lin, Hanrui Wang, and Song Han. 2020. Searching efficient 3D architectures with sparse point-voxel convolution. In Computer Vision\u2013ECCV 2020: 16th European Conference, Glasgow, UK, August 23\u201328, 2020, Proceedings, Part XXVIII. Springer, 685\u2013702."},{"key":"e_1_3_2_278_2","first-page":"2480","volume-title":"Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition","author":"Liu Shaoli","year":"2021","unstructured":"Shaoli Liu, Chengjian Zheng, Kaidi Lu, Si Gao, Ning Wang, Bofei Wang, Diankai Zhang, Xiaofeng Zhang, and Tianyu Xu. 2021. EVSRNet: Efficient video super-resolution with neural architecture search. In Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition. 
2480\u20132485."},{"key":"e_1_3_2_279_2","first-page":"92","volume-title":"Computer Vision\u2013ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23\u201327, 2022, Proceedings, Part XIX","author":"Wu Yushu","year":"2022","unstructured":"Yushu Wu, Yifan Gong, Pu Zhao, Yanyu Li, Zheng Zhan, Wei Niu, Hao Tang, Minghai Qin, Bin Ren, and Yanzhi Wang. 2022. Compiler-aware neural architecture search for on-mobile real-time super-resolution. In Computer Vision\u2013ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23\u201327, 2022, Proceedings, Part XIX. Springer, 92\u2013111."},{"key":"e_1_3_2_280_2","volume-title":"International Conference on Learning Representations","author":"Yang Antoine","year":"2020","unstructured":"Antoine Yang, Pedro M. Esperan\u00e7a, and Fabio M. Carlucci. 2020. NAS evaluation is frustratingly hard. In International Conference on Learning Representations."},{"key":"e_1_3_2_281_2","first-page":"113","volume-title":"Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition","author":"Cubuk Ekin D.","year":"2019","unstructured":"Ekin D. Cubuk, Barret Zoph, Dandelion Mane, Vijay Vasudevan, and Quoc V. Le. 2019. Autoaugment: Learning augmentation strategies from data. In Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition. 113\u2013123."},{"key":"e_1_3_2_282_2","first-page":"367","volume-title":"Uncertainty in Artificial Intelligence","author":"Li Liam","year":"2020","unstructured":"Liam Li and Ameet Talwalkar. 2020. Random search and reproducibility for neural architecture search. In Uncertainty in Artificial Intelligence. PMLR, 367\u2013377."},{"key":"e_1_3_2_283_2","first-page":"1284","volume-title":"Proceedings of the IEEE\/CVF International Conference on Computer Vision","author":"Xie Saining","year":"2019","unstructured":"Saining Xie, Alexander Kirillov, Ross Girshick, and Kaiming He. 2019. Exploring randomly wired neural networks for image recognition. 
In Proceedings of the IEEE\/CVF International Conference on Computer Vision. 1284\u20131293."},{"key":"e_1_3_2_284_2","first-page":"10428","volume-title":"Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition","author":"Radosavovic Ilija","year":"2020","unstructured":"Ilija Radosavovic, Raj Prateek Kosaraju, Ross Girshick, Kaiming He, and Piotr Doll\u00e1r. 2020. Designing network design spaces. In Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition. 10428\u201310436."},{"key":"e_1_3_2_285_2","first-page":"11405","volume-title":"Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition","author":"Guo Jianyuan","year":"2020","unstructured":"Jianyuan Guo, Kai Han, Yunhe Wang, Chao Zhang, Zhaohui Yang, Han Wu, Xinghao Chen, and Chang Xu. 2020. Hit-detector: Hierarchical trinity architecture search for object detection. In Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition. 11405\u201311414."},{"key":"e_1_3_2_286_2","first-page":"1","volume-title":"Computer Vision\u2013ECCV 2020: 16th European Conference, Glasgow, UK, August 23\u201328, 2020, Proceedings, Part XXV 16","author":"Hataya Ryuichiro","year":"2020","unstructured":"Ryuichiro Hataya, Jan Zdenek, Kazuki Yoshizoe, and Hideki Nakayama. 2020. Faster autoaugment: Learning augmentation strategies using backpropagation. In Computer Vision\u2013ECCV 2020: 16th European Conference, Glasgow, UK, August 23\u201328, 2020, Proceedings, Part XXV 16. Springer, 1\u201316."},{"key":"e_1_3_2_287_2","article-title":"Searching for activation functions","author":"Ramachandran Prajit","year":"2017","unstructured":"Prajit Ramachandran, Barret Zoph, and Quoc V. Le. 2017. Searching for activation functions. 
arXiv preprint arXiv:1710.05941 (2017).","journal-title":"arXiv preprint arXiv:1710.05941"},{"key":"e_1_3_2_288_2","first-page":"12095","volume-title":"Proceedings of the IEEE\/CVF International Conference on Computer Vision","author":"Zhou Yucong","year":"2021","unstructured":"Yucong Zhou, Zezhou Zhu, and Zhao Zhong. 2021. Learning specialized activation functions with the Piecewise Linear Unit. In Proceedings of the IEEE\/CVF International Conference on Computer Vision. 12095\u201312104."},{"key":"e_1_3_2_289_2","first-page":"16276","volume-title":"Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition","author":"Dai Xiaoliang","year":"2021","unstructured":"Xiaoliang Dai, Alvin Wan, Peizhao Zhang, Bichen Wu, Zijian He, Zhen Wei, Kan Chen, Yuandong Tian, Matthew Yu, Peter Vajda, et\u00a0al. 2021. FBNetV3: Joint architecture-recipe search using predictor pretraining. In Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition. 16276\u201316285."},{"key":"e_1_3_2_290_2","article-title":"AutoHAS: Efficient hyperparameter and architecture search","author":"Dong Xuanyi","year":"2020","unstructured":"Xuanyi Dong, Mingxing Tan, Adams Wei Yu, Daiyi Peng, Bogdan Gabrys, and Quoc V. Le. 2020. AutoHAS: Efficient hyperparameter and architecture search. arXiv preprint arXiv:2006.03656 (2020).","journal-title":"arXiv preprint arXiv:2006.03656"},{"key":"e_1_3_2_291_2","article-title":"FBNetV5: Neural architecture search for multiple tasks in one run","author":"Wu Bichen","year":"2021","unstructured":"Bichen Wu, Chaojian Li, Hang Zhang, Xiaoliang Dai, Peizhao Zhang, Matthew Yu, Jialiang Wang, Yingyan Lin, and Peter Vajda. 2021. FBNetV5: Neural architecture search for multiple tasks in one run. 
arXiv preprint arXiv:2111.10007 (2021).","journal-title":"arXiv preprint arXiv:2111.10007"},{"key":"e_1_3_2_292_2","first-page":"740","volume-title":"Computer Vision\u2013ECCV 2014: 13th European Conference, Zurich, Switzerland, September 6\u201312, 2014, Proceedings, Part V 13","author":"Lin Tsung-Yi","year":"2014","unstructured":"Tsung-Yi Lin, Michael Maire, Serge Belongie, James Hays, Pietro Perona, Deva Ramanan, Piotr Doll\u00e1r, and C. Lawrence Zitnick. 2014. Microsoft COCO: Common objects in context. In Computer Vision\u2013ECCV 2014: 13th European Conference, Zurich, Switzerland, September 6\u201312, 2014, Proceedings, Part V 13. Springer, 740\u2013755."},{"key":"e_1_3_2_293_2","first-page":"633","volume-title":"Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition","author":"Zhou Bolei","year":"2017","unstructured":"Bolei Zhou, Hang Zhao, Xavier Puig, Sanja Fidler, Adela Barriuso, and Antonio Torralba. 2017. Scene parsing through ADE20K dataset. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 633\u2013641."},{"key":"e_1_3_2_294_2","article-title":"Slimmable neural networks","author":"Yu Jiahui","year":"2018","unstructured":"Jiahui Yu, Linjie Yang, Ning Xu, Jianchao Yang, and Thomas Huang. 2018. Slimmable neural networks. arXiv preprint arXiv:1812.08928 (2018).","journal-title":"arXiv preprint arXiv:1812.08928"},{"key":"e_1_3_2_295_2","first-page":"1803","volume-title":"Proceedings of the IEEE\/CVF International Conference on Computer Vision","author":"Yu Jiahui","year":"2019","unstructured":"Jiahui Yu and Thomas S. Huang. 2019. Universally slimmable networks and improved training techniques. In Proceedings of the IEEE\/CVF International Conference on Computer Vision. 
1803\u20131811."},{"key":"e_1_3_2_296_2","first-page":"8607","volume-title":"Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition","author":"Li Changlin","year":"2021","unstructured":"Changlin Li, Guangrun Wang, Bing Wang, Xiaodan Liang, Zhihui Li, and Xiaojun Chang. 2021. Dynamic slimmable network. In Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition. 8607\u20138617."},{"key":"e_1_3_2_297_2","first-page":"12281","volume-title":"Proceedings of the IEEE\/CVF International Conference on Computer Vision","author":"Li Changlin","year":"2021","unstructured":"Changlin Li, Tao Tang, Guangrun Wang, Jiefeng Peng, Bing Wang, Xiaodan Liang, and Xiaojun Chang. 2021. BossNAS: Exploring hybrid CNN-transformers with block-wisely self-supervised neural architecture search. In Proceedings of the IEEE\/CVF International Conference on Computer Vision. 12281\u201312291."},{"key":"e_1_3_2_298_2","article-title":"HyT-NAS: Hybrid transformers neural architecture search for edge devices","author":"Mecharbat Lotfi Abdelkrim","year":"2023","unstructured":"Lotfi Abdelkrim Mecharbat, Hadjer Benmeziane, Hamza Ouranoughi, and Smail Niar. 2023. HyT-NAS: Hybrid transformers neural architecture search for edge devices. arXiv preprint arXiv:2303.04440 (2023).","journal-title":"arXiv preprint arXiv:2303.04440"},{"key":"e_1_3_2_299_2","first-page":"1126","volume-title":"International Conference on Machine Learning","author":"Finn Chelsea","year":"2017","unstructured":"Chelsea Finn, Pieter Abbeel, and Sergey Levine. 2017. Model-agnostic meta-learning for fast adaptation of deep networks. In International Conference on Machine Learning. PMLR, 1126\u20131135."},{"key":"e_1_3_2_300_2","article-title":"Meta architecture search","volume":"32","author":"Shaw Albert","year":"2019","unstructured":"Albert Shaw, Wei Wei, Weiyang Liu, Le Song, and Bo Dai. 2019. Meta architecture search. 
Advances in Neural Information Processing Systems 32 (2019).","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_2_301_2","first-page":"6186","volume-title":"Proceedings of the AAAI Conference on Artificial Intelligence","author":"Wang Jiaxing","year":"2020","unstructured":"Jiaxing Wang, Jiaxiang Wu, Haoli Bai, and Jian Cheng. 2020. M-NAS: Meta neural architecture search. In Proceedings of the AAAI Conference on Artificial Intelligence. 6186\u20136193."},{"key":"e_1_3_2_302_2","article-title":"Rapid neural architecture search by learning to generate graphs from datasets","author":"Lee Hayeon","year":"2021","unstructured":"Hayeon Lee, Eunyoung Hyung, and Sung Ju Hwang. 2021. Rapid neural architecture search by learning to generate graphs from datasets. arXiv preprint arXiv:2107.00860 (2021).","journal-title":"arXiv preprint arXiv:2107.00860"},{"issue":"4","key":"e_1_3_2_303_2","doi-asserted-by":"crossref","first-page":"485","DOI":"10.1109\/JPROC.2020.2976475","article-title":"Model compression and hardware acceleration for neural networks: A comprehensive survey","volume":"108","author":"Deng Lei","year":"2020","unstructured":"Lei Deng, Guoqi Li, Song Han, Luping Shi, and Yuan Xie. 2020. Model compression and hardware acceleration for neural networks: A comprehensive survey. Proc. IEEE 108, 4 (2020), 485\u2013532.","journal-title":"Proc. IEEE"},{"key":"e_1_3_2_304_2","first-page":"2078","volume-title":"Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition","author":"Wang Tianzhe","year":"2020","unstructured":"Tianzhe Wang, Kuan Wang, Han Cai, Ji Lin, Zhijian Liu, Hanrui Wang, Yujun Lin, and Song Han. 2020. APQ: Joint search for network architecture, pruning and quantization policy. In Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition. 
2078\u20132087."},{"key":"e_1_3_2_305_2","volume-title":"International Conference on Learning Representations","author":"Du Simon S.","year":"2019","unstructured":"Simon S. Du, Xiyu Zhai, Barnabas Poczos, and Aarti Singh. 2019. Gradient descent provably optimizes over-parameterized neural networks. In International Conference on Learning Representations."},{"key":"e_1_3_2_306_2","volume-title":"International Conference on Learning Representations","author":"Liu Zhuang","year":"2019","unstructured":"Zhuang Liu, Mingjie Sun, Tinghui Zhou, Gao Huang, and Trevor Darrell. 2019. Rethinking the value of network pruning. In International Conference on Learning Representations."},{"issue":"3","key":"e_1_3_2_307_2","doi-asserted-by":"crossref","first-page":"243","DOI":"10.1145\/3007787.3001163","article-title":"EIE: Efficient inference engine on compressed deep neural network","volume":"44","author":"Han Song","year":"2016","unstructured":"Song Han, Xingyu Liu, Huizi Mao, Jing Pu, Ardavan Pedram, Mark A. Horowitz, and William J. Dally. 2016. EIE: Efficient inference engine on compressed deep neural network. ACM SIGARCH Computer Architecture News 44, 3 (2016), 243\u2013254.","journal-title":"ACM SIGARCH Computer Architecture News"},{"key":"e_1_3_2_308_2","article-title":"Optimal brain damage","volume":"2","author":"LeCun Yann","year":"1989","unstructured":"Yann LeCun, John Denker, and Sara Solla. 1989. Optimal brain damage. Advances in Neural Information Processing Systems 2 (1989).","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_2_309_2","doi-asserted-by":"crossref","first-page":"293","DOI":"10.1109\/ICNN.1993.298572","volume-title":"IEEE International Conference on Neural Networks","author":"Hassibi Babak","year":"1993","unstructured":"Babak Hassibi, David G. Stork, and Gregory J. Wolff. 1993. Optimal brain surgeon and general network pruning. In IEEE International Conference on Neural Networks. 
IEEE, 293\u2013299."},{"key":"e_1_3_2_310_2","article-title":"Data-free parameter pruning for deep neural networks","author":"Srinivas Suraj","year":"2015","unstructured":"Suraj Srinivas and R. Venkatesh Babu. 2015. Data-free parameter pruning for deep neural networks. British Machine Vision Conference (2015).","journal-title":"British Machine Vision Conference"},{"key":"e_1_3_2_311_2","first-page":"2498","volume-title":"International Conference on Machine Learning","author":"Molchanov Dmitry","year":"2017","unstructured":"Dmitry Molchanov, Arsenii Ashukha, and Dmitry Vetrov. 2017. Variational dropout sparsifies deep neural networks. In International Conference on Machine Learning. PMLR, 2498\u20132507."},{"key":"e_1_3_2_312_2","volume-title":"International Conference on Learning Representations","author":"Louizos Christos","year":"2018","unstructured":"Christos Louizos, Max Welling, and Diederik P. Kingma. 2018. Learning sparse neural networks through \\(L\\_0\\) regularization. In International Conference on Learning Representations."},{"key":"e_1_3_2_313_2","first-page":"5239","volume-title":"Proceedings of the IEEE\/CVF International Conference on Computer Vision","author":"Guo Yi","year":"2021","unstructured":"Yi Guo, Huan Yuan, Jianchao Tan, Zhangyang Wang, Sen Yang, and Ji Liu. 2021. GDP: Stabilized neural network pruning via gates with differentiable polarization. In Proceedings of the IEEE\/CVF International Conference on Computer Vision. 5239\u20135250."},{"key":"e_1_3_2_314_2","article-title":"The state of sparsity in deep neural networks","author":"Gale Trevor","year":"2019","unstructured":"Trevor Gale, Erich Elsen, and Sara Hooker. 2019. The state of sparsity in deep neural networks. 
arXiv preprint arXiv:1902.09574 (2019).","journal-title":"arXiv preprint arXiv:1902.09574"},{"key":"e_1_3_2_315_2","volume-title":"International Conference on Learning Representations","author":"Renda Alex","year":"2020","unstructured":"Alex Renda, Jonathan Frankle, and Michael Carbin. 2020. Comparing rewinding and fine-tuning in neural network pruning. In International Conference on Learning Representations."},{"key":"e_1_3_2_316_2","article-title":"Second order derivatives for network pruning: Optimal brain surgeon","volume":"5","author":"Hassibi Babak","year":"1992","unstructured":"Babak Hassibi and David Stork. 1992. Second order derivatives for network pruning: Optimal brain surgeon. Advances in Neural Information Processing Systems 5 (1992).","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_2_317_2","volume-title":"International Conference on Learning Representations","author":"Molchanov Pavlo","year":"2017","unstructured":"Pavlo Molchanov, Stephen Tyree, Tero Karras, Timo Aila, and Jan Kautz. 2017. Pruning convolutional neural networks for resource efficient inference. In International Conference on Learning Representations."},{"issue":"6","key":"e_1_3_2_318_2","doi-asserted-by":"crossref","first-page":"1386","DOI":"10.1109\/72.963775","article-title":"A new pruning heuristic based on variance analysis of sensitivity information","volume":"12","author":"Engelbrecht Andries P.","year":"2001","unstructured":"Andries P. Engelbrecht. 2001. A new pruning heuristic based on variance analysis of sensitivity information. 
IEEE Transactions on Neural Networks 12, 6 (2001), 1386\u20131399.","journal-title":"IEEE Transactions on Neural Networks"},{"issue":"1","key":"e_1_3_2_319_2","doi-asserted-by":"crossref","first-page":"127","DOI":"10.1109\/JSSC.2016.2616357","article-title":"Eyeriss: An energy-efficient reconfigurable accelerator for deep convolutional neural networks","volume":"52","author":"Chen Yu-Hsin","year":"2016","unstructured":"Yu-Hsin Chen, Tushar Krishna, Joel S. Emer, and Vivienne Sze. 2016. Eyeriss: An energy-efficient reconfigurable accelerator for deep convolutional neural networks. IEEE Journal of Solid-State Circuits 52, 1 (2016), 127\u2013138.","journal-title":"IEEE Journal of Solid-State Circuits"},{"issue":"2","key":"e_1_3_2_320_2","doi-asserted-by":"crossref","first-page":"292","DOI":"10.1109\/JETCAS.2019.2910232","article-title":"Eyeriss v2: A flexible accelerator for emerging deep neural networks on mobile devices","volume":"9","author":"Chen Yu-Hsin","year":"2019","unstructured":"Yu-Hsin Chen, Tien-Ju Yang, Joel Emer, and Vivienne Sze. 2019. Eyeriss v2: A flexible accelerator for emerging deep neural networks on mobile devices. IEEE Journal on Emerging and Selected Topics in Circuits and Systems 9, 2 (2019), 292\u2013308.","journal-title":"IEEE Journal on Emerging and Selected Topics in Circuits and Systems"},{"key":"e_1_3_2_321_2","first-page":"75","volume-title":"Proceedings of the 2017 ACM\/SIGDA International Symposium on Field-Programmable Gate Arrays","author":"Han Song","year":"2017","unstructured":"Song Han, Junlong Kang, Huizi Mao, Yiming Hu, Xin Li, Yubin Li, Dongliang Xie, Hong Luo, Song Yao, Yu Wang, et\u00a0al. 2017. ESE: Efficient speech recognition engine with sparse LSTM on FPGA. In Proceedings of the 2017 ACM\/SIGDA International Symposium on Field-Programmable Gate Arrays. 
75\u201384."},{"key":"e_1_3_2_322_2","first-page":"1","volume-title":"2016 49th Annual IEEE\/ACM International Symposium on Microarchitecture (MICRO\u201916)","author":"Zhang Shijin","year":"2016","unstructured":"Shijin Zhang, Zidong Du, Lei Zhang, Huiying Lan, Shaoli Liu, Ling Li, Qi Guo, Tianshi Chen, and Yunji Chen. 2016. Cambricon-X: An accelerator for sparse neural networks. In 2016 49th Annual IEEE\/ACM International Symposium on Microarchitecture (MICRO\u201916). IEEE, 1\u201312."},{"issue":"2","key":"e_1_3_2_323_2","doi-asserted-by":"crossref","first-page":"27","DOI":"10.1145\/3140659.3080254","article-title":"SCNN: An accelerator for compressed-sparse convolutional neural networks","volume":"45","author":"Parashar Angshuman","year":"2017","unstructured":"Angshuman Parashar, Minsoo Rhu, Anurag Mukkara, Antonio Puglielli, Rangharajan Venkatesan, Brucek Khailany, Joel Emer, Stephen W. Keckler, and William J. Dally. 2017. SCNN: An accelerator for compressed-sparse convolutional neural networks. ACM SIGARCH Computer Architecture News 45, 2 (2017), 27\u201340.","journal-title":"ACM SIGARCH Computer Architecture News"},{"key":"e_1_3_2_324_2","first-page":"1110","volume-title":"2021 ACM\/IEEE 48th Annual International Symposium on Computer Architecture (ISCA\u201921)","author":"Deng Chunhua","year":"2021","unstructured":"Chunhua Deng, Yang Sui, Siyu Liao, Xuehai Qian, and Bo Yuan. 2021. GoSPA: An energy-efficient high-performance globally optimized sparse convolutional neural network accelerator. In 2021 ACM\/IEEE 48th Annual International Symposium on Computer Architecture (ISCA\u201921). 
IEEE, 1110\u20131123."},{"issue":"2","key":"e_1_3_2_325_2","doi-asserted-by":"crossref","first-page":"636","DOI":"10.1109\/JSSC.2020.3043870","article-title":"SNAP: An efficient sparse neural acceleration processor for unstructured sparse deep neural network inference","volume":"56","author":"Zhang Jie-Fang","year":"2020","unstructured":"Jie-Fang Zhang, Ching-En Lee, Chester Liu, Yakun Sophia Shao, Stephen W. Keckler, and Zhengya Zhang. 2020. SNAP: An efficient sparse neural acceleration processor for unstructured sparse deep neural network inference. IEEE Journal of Solid-State Circuits 56, 2 (2020), 636\u2013647.","journal-title":"IEEE Journal of Solid-State Circuits"},{"key":"e_1_3_2_326_2","doi-asserted-by":"crossref","first-page":"876","DOI":"10.1109\/HPCA53966.2022.00069","volume-title":"2022 IEEE International Symposium on High-Performance Computer Architecture (HPCA\u201922)","author":"Gudaparthi Sumanth","year":"2022","unstructured":"Sumanth Gudaparthi, Sarabjeet Singh, Surya Narayanan, Rajeev Balasubramonian, and Visvesh Sathe. 2022. CANDLES: Channel-aware novel dataflow-microarchitecture co-design for low energy sparse neural network acceleration. In 2022 IEEE International Symposium on High-Performance Computer Architecture (HPCA\u201922). IEEE, 876\u2013891."},{"key":"e_1_3_2_327_2","first-page":"5117","volume-title":"Proceedings of the AAAI Conference on Artificial Intelligence","author":"Ma Xiaolong","year":"2020","unstructured":"Xiaolong Ma, Fu-Ming Guo, Wei Niu, Xue Lin, Jian Tang, Kaisheng Ma, Bin Ren, and Yanzhi Wang. 2020. PCONV: The missing but desirable sparsity in DNN weight pruning for real-time execution on mobile devices. In Proceedings of the AAAI Conference on Artificial Intelligence. 
5117\u20135124."},{"key":"e_1_3_2_328_2","first-page":"907","volume-title":"Proceedings of the Twenty-Fifth International Conference on Architectural Support for Programming Languages and Operating Systems","author":"Niu Wei","year":"2020","unstructured":"Wei Niu, Xiaolong Ma, Sheng Lin, Shihao Wang, Xuehai Qian, Xue Lin, Yanzhi Wang, and Bin Ren. 2020. PatDNN: Achieving real-time DNN execution on mobile devices with pattern-based weight pruning. In Proceedings of the Twenty-Fifth International Conference on Architectural Support for Programming Languages and Operating Systems. 907\u2013922."},{"key":"e_1_3_2_329_2","volume-title":"International Conference on Learning Representations","author":"Frankle Jonathan","year":"2019","unstructured":"Jonathan Frankle and Michael Carbin. 2019. The lottery ticket hypothesis: Finding sparse, trainable neural networks. In International Conference on Learning Representations."},{"key":"e_1_3_2_330_2","first-page":"138","volume-title":"Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops","author":"Srinivas Suraj","year":"2017","unstructured":"Suraj Srinivas, Akshayvarun Subramanya, and R. Venkatesh Babu. 2017. Training sparse neural networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops. 138\u2013145."},{"key":"e_1_3_2_331_2","article-title":"The difficulty of training sparse neural networks","author":"Evci Utku","year":"2019","unstructured":"Utku Evci, Fabian Pedregosa, Aidan Gomez, and Erich Elsen. 2019. The difficulty of training sparse neural networks. arXiv preprint arXiv:1906.10732 (2019).","journal-title":"arXiv preprint arXiv:1906.10732"},{"key":"e_1_3_2_332_2","first-page":"9833","volume-title":"International Conference on Machine Learning","author":"Jaiswal Ajay Kumar","year":"2022","unstructured":"Ajay Kumar Jaiswal, Haoyu Ma, Tianlong Chen, Ying Ding, and Zhangyang Wang. 2022. Training your sparse neural network better with any mask.
In International Conference on Machine Learning. PMLR, 9833\u20139844."},{"key":"e_1_3_2_333_2","first-page":"24193","article-title":"Training neural networks with fixed sparse masks","volume":"34","author":"Sung Yi-Lin","year":"2021","unstructured":"Yi-Lin Sung, Varun Nair, and Colin A. Raffel. 2021. Training neural networks with fixed sparse masks. Advances in Neural Information Processing Systems 34 (2021), 24193\u201324205.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_2_334_2","first-page":"6682","volume-title":"International Conference on Machine Learning","author":"Malach Eran","year":"2020","unstructured":"Eran Malach, Gilad Yehudai, Shai Shalev-Schwartz, and Ohad Shamir. 2020. Proving the lottery ticket hypothesis: Pruning is all you need. In International Conference on Machine Learning. PMLR, 6682\u20136691."},{"key":"e_1_3_2_335_2","first-page":"30196","article-title":"Validating the lottery ticket hypothesis with inertial manifold theory","volume":"34","author":"Zhang Zeru","year":"2021","unstructured":"Zeru Zhang, Jiayin Jin, Zijie Zhang, Yang Zhou, Xin Zhao, Jiaxiang Ren, Ji Liu, Lingfei Wu, Ruoming Jin, and Dejing Dou. 2021. Validating the lottery ticket hypothesis with inertial manifold theory. Advances in Neural Information Processing Systems 34 (2021), 30196\u201330210.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_2_336_2","article-title":"Stabilizing the lottery ticket hypothesis","author":"Frankle Jonathan","year":"2019","unstructured":"Jonathan Frankle, Gintare Karolina Dziugaite, Daniel M. Roy, and Michael Carbin. 2019. Stabilizing the lottery ticket hypothesis. 
arXiv preprint arXiv:1903.01611 (2019).","journal-title":"arXiv preprint arXiv:1903.01611"},{"key":"e_1_3_2_337_2","first-page":"1695","volume-title":"International Conference on Machine Learning","author":"Chen Tianlong","year":"2021","unstructured":"Tianlong Chen, Yongduo Sui, Xuxi Chen, Aston Zhang, and Zhangyang Wang. 2021. A unified lottery ticket hypothesis for graph neural networks. In International Conference on Machine Learning. PMLR, 1695\u20131706."},{"key":"e_1_3_2_338_2","first-page":"102","volume-title":"Computer Vision\u2013ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23\u201327, 2022, Proceedings, Part XII","author":"Kim Youngeun","year":"2022","unstructured":"Youngeun Kim, Yuhang Li, Hyoungseob Park, Yeshwanth Venkatesha, Ruokai Yin, and Priyadarshini Panda. 2022. Exploring lottery ticket hypothesis in spiking neural networks. In Computer Vision\u2013ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23\u201327, 2022, Proceedings, Part XII. Springer, 102\u2013120."},{"key":"e_1_3_2_339_2","doi-asserted-by":"crossref","first-page":"128","DOI":"10.1109\/ISVLSI54635.2022.00035","volume-title":"2022 IEEE Computer Society Annual Symposium on VLSI (ISVLSI\u201922)","author":"Banerjee Sanmitra","year":"2022","unstructured":"Sanmitra Banerjee, Mahdi Nikdast, Sudeep Pasricha, and Krishnendu Chakrabarty. 2022. Pruning coherent integrated photonic neural networks using the lottery ticket hypothesis. In 2022 IEEE Computer Society Annual Symposium on VLSI (ISVLSI\u201922). IEEE, 128\u2013133."},{"key":"e_1_3_2_340_2","first-page":"941","article-title":"Learning best combination for efficient N:M sparsity","volume":"35","author":"Zhang Yuxin","year":"2022","unstructured":"Yuxin Zhang, Mingbao Lin, Zhihang Lin, Yiting Luo, Ke Li, Fei Chao, Yongjian Wu, and Rongrong Ji. 2022. Learning best combination for efficient N:M sparsity. 
Advances in Neural Information Processing Systems 35 (2022), 941\u2013953.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_2_341_2","article-title":"Accelerating sparse deep neural networks","author":"Mishra Asit","year":"2021","unstructured":"Asit Mishra, Jorge Albericio Latorre, Jeff Pool, Darko Stosic, Dusan Stosic, Ganesh Venkatesh, Chong Yu, and Paulius Micikevicius. 2021. Accelerating sparse deep neural networks. arXiv preprint arXiv:2104.08378 (2021).","journal-title":"arXiv preprint arXiv:2104.08378"},{"key":"e_1_3_2_342_2","first-page":"578","volume-title":"13th USENIX Symposium on Operating Systems Design and Implementation (OSDI\u201918)","author":"Chen Tianqi","year":"2018","unstructured":"Tianqi Chen, Thierry Moreau, Ziheng Jiang, Lianmin Zheng, Eddie Yan, Haichen Shen, Meghan Cowan, Leyuan Wang, Yuwei Hu, Luis Ceze, et\u00a0al. 2018. TVM: An automated end-to-end optimizing compiler for deep learning. In 13th USENIX Symposium on Operating Systems Design and Implementation (OSDI\u201918). 578\u2013594."},{"key":"e_1_3_2_343_2","first-page":"1818","article-title":"NxMTransformer: Semi-structured sparsification for natural language understanding via ADMM","volume":"34","author":"Holmes Connor","year":"2021","unstructured":"Connor Holmes, Minjia Zhang, Yuxiong He, and Bo Wu. 2021. NxMTransformer: Semi-structured sparsification for natural language understanding via ADMM. Advances in Neural Information Processing Systems 34 (2021), 1818\u20131830.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_2_344_2","article-title":"Learning N:M fine-grained structured sparse neural networks from scratch","author":"Zhou Aojun","year":"2021","unstructured":"Aojun Zhou, Yukun Ma, Junnan Zhu, Jianbo Liu, Zhijie Zhang, Kun Yuan, Wenxiu Sun, and Hongsheng Li. 2021. Learning N:M fine-grained structured sparse neural networks from scratch.
arXiv preprint arXiv:2102.04010 (2021).","journal-title":"arXiv preprint arXiv:2102.04010"},{"key":"e_1_3_2_345_2","first-page":"13316","article-title":"Channel permutations for N:M sparsity","volume":"34","author":"Pool Jeff","year":"2021","unstructured":"Jeff Pool and Chong Yu. 2021. Channel permutations for N:M sparsity. Advances in Neural Information Processing Systems 34 (2021), 13316\u201313327.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_2_346_2","article-title":"Progressive gradient flow for Robust N:M sparsity training in transformers","author":"Bambhaniya Abhimanyu Rajeshkumar","year":"2024","unstructured":"Abhimanyu Rajeshkumar Bambhaniya, Amir Yazdanbakhsh, Suvinay Subramanian, Sheng-Chun Kao, Shivani Agrawal, Utku Evci, and Tushar Krishna. 2024. Progressive gradient flow for Robust N:M sparsity training in transformers. arXiv preprint arXiv:2402.04744 (2024).","journal-title":"arXiv preprint arXiv:2402.04744"},{"key":"e_1_3_2_347_2","article-title":"E-Sparse: Boosting the large language model inference through entropy-based N:M sparsity","author":"Li Yun","year":"2023","unstructured":"Yun Li, Lin Niu, Xipeng Zhang, Kai Liu, Jianchen Zhu, and Zhanhui Kang. 2023. E-Sparse: Boosting the large language model inference through entropy-based N:M sparsity. arXiv preprint arXiv:2310.15929 (2023).","journal-title":"arXiv preprint arXiv:2310.15929"},{"key":"e_1_3_2_348_2","article-title":"The emergence of essential sparsity in large pre-trained models: The weights that matter","volume":"36","author":"Jaiswal Ajay","year":"2024","unstructured":"Ajay Jaiswal, Shiwei Liu, Tianlong Chen, Zhangyang Wang, et\u00a0al. 2024. The emergence of essential sparsity in large pre-trained models: The weights that matter. 
Advances in Neural Information Processing Systems 36 (2024).","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_2_349_2","first-page":"41488","volume-title":"International Conference on Machine Learning","author":"Zhang Yuxin","year":"2023","unstructured":"Yuxin Zhang, Yiting Luo, Mingbao Lin, Yunshan Zhong, Jingjing Xie, Fei Chao, and Rongrong Ji. 2023. Bi-directional masks for efficient N:M sparse training. In International Conference on Machine Learning. PMLR, 41488\u201341497."},{"key":"e_1_3_2_350_2","article-title":"Dynamic sparse training with structured sparsity","author":"Lasby Mike","year":"2023","unstructured":"Mike Lasby, Anna Golubeva, Utku Evci, Mihai Nica, and Yani Ioannou. 2023. Dynamic sparse training with structured sparsity. arXiv preprint arXiv:2305.02299 (2023).","journal-title":"arXiv preprint arXiv:2305.02299"},{"issue":"11","key":"e_1_3_2_351_2","doi-asserted-by":"crossref","first-page":"1573","DOI":"10.1109\/TVLSI.2022.3197282","article-title":"An algorithm\u2013hardware co-optimized framework for accelerating N:M sparse transformers","volume":"30","author":"Fang Chao","year":"2022","unstructured":"Chao Fang, Aojun Zhou, and Zhongfeng Wang. 2022. An algorithm\u2013hardware co-optimized framework for accelerating N:M sparse transformers. IEEE Transactions on Very Large Scale Integration (VLSI) Systems 30, 11 (2022), 1573\u20131586.","journal-title":"IEEE Transactions on Very Large Scale Integration (VLSI) Systems"},{"key":"e_1_3_2_352_2","doi-asserted-by":"crossref","first-page":"2670","DOI":"10.1109\/ISCAS48785.2022.9937659","volume-title":"2022 IEEE International Symposium on Circuits and Systems (ISCAS\u201922)","author":"Fang Chao","year":"2022","unstructured":"Chao Fang, Shouliang Guo, Wei Wu, Jun Lin, Zhongfeng Wang, Ming Kai Hsu, and Lingzhi Liu. 2022. An efficient hardware accelerator for sparse transformer neural networks. In 2022 IEEE International Symposium on Circuits and Systems (ISCAS\u201922). 
IEEE, 2670\u20132674."},{"key":"e_1_3_2_353_2","first-page":"280","volume-title":"2022 IEEE 40th International Conference on Computer Design (ICCD\u201922)","author":"Luo Yixuan","year":"2022","unstructured":"Yixuan Luo, Payman Behnam, Kiran Thorat, Zhuo Liu, Hongwu Peng, Shaoyi Huang, Shu Zhou, Omer Khan, Alexey Tumanov, Caiwen Ding, et\u00a0al. 2022. CoDG-ReRAM: An algorithm-hardware co-design to accelerate semi-structured GNNs on ReRAM. In 2022 IEEE 40th International Conference on Computer Design (ICCD\u201922). IEEE, 280\u2013289."},{"key":"e_1_3_2_354_2","first-page":"20863","article-title":"RED: Looking for redundancies for data-free structured compression of deep neural networks","volume":"34","author":"Yvinec Edouard","year":"2021","unstructured":"Edouard Yvinec, Arnaud Dapogny, Matthieu Cord, and Kevin Bailly. 2021. RED: Looking for redundancies for data-free structured compression of deep neural networks. Advances in Neural Information Processing Systems 34 (2021), 20863\u201320873.","journal-title":"Advances in Neural Information Processing Systems"},{"issue":"3","key":"e_1_3_2_355_2","doi-asserted-by":"crossref","first-page":"3664","DOI":"10.1109\/TPAMI.2022.3179616","article-title":"RED++: Data-free pruning of deep neural networks via input splitting and output merging","volume":"45","author":"Yvinec Edouard","year":"2022","unstructured":"Edouard Yvinec, Arnaud Dapogny, Matthieu Cord, and Kevin Bailly. 2022. RED++: Data-free pruning of deep neural networks via input splitting and output merging. IEEE Transactions on Pattern Analysis and Machine Intelligence 45, 3 (2022), 3664\u20133676.","journal-title":"IEEE Transactions on Pattern Analysis and Machine Intelligence"},{"key":"e_1_3_2_356_2","volume-title":"International Joint Conference on Artificial Intelligence","author":"Wang Wenxiao","year":"2019","unstructured":"Wenxiao Wang, Cong Fu, Jishun Guo, Deng Cai, and Xiaofei He. 2019. 
COP: Customized deep model compression via regularized correlation-based filter-level pruning. In International Joint Conference on Artificial Intelligence."},{"key":"e_1_3_2_357_2","first-page":"14913","volume-title":"Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition","author":"Wang Zi","year":"2021","unstructured":"Zi Wang, Chengcheng Li, and Xiangyang Wang. 2021. Convolutional neural network pruning with structural redundancy reduction. In Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition. 14913\u201314922."},{"key":"e_1_3_2_358_2","first-page":"1389","volume-title":"Proceedings of the IEEE International Conference on Computer Vision","author":"He Yihui","year":"2017","unstructured":"Yihui He, Xiangyu Zhang, and Jian Sun. 2017. Channel pruning for accelerating very deep neural networks. In Proceedings of the IEEE International Conference on Computer Vision. 1389\u20131397."},{"key":"e_1_3_2_359_2","first-page":"1529","volume-title":"Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition","author":"Lin Mingbao","year":"2020","unstructured":"Mingbao Lin, Rongrong Ji, Yan Wang, Yichen Zhang, Baochang Zhang, Yonghong Tian, and Ling Shao. 2020. HRank: Filter pruning using high-rank feature map. In Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition. 1529\u20131538."},{"key":"e_1_3_2_360_2","first-page":"24604","article-title":"CHIP: CHannel Independence-based Pruning for compact neural networks","volume":"34","author":"Sui Yang","year":"2021","unstructured":"Yang Sui, Miao Yin, Yi Xie, Huy Phan, Saman Aliari Zonouz, and Bo Yuan. 2021. CHIP: CHannel Independence-based Pruning for compact neural networks. 
Advances in Neural Information Processing Systems 34 (2021), 24604\u201324616.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_2_361_2","first-page":"9356","volume-title":"International Conference on Machine Learning","author":"Tan Chong Min John","year":"2020","unstructured":"Chong Min John Tan and Mehul Motani. 2020. DropNet: Reducing neural network complexity via iterative pruning. In International Conference on Machine Learning. PMLR, 9356\u20139366."},{"key":"e_1_3_2_362_2","first-page":"5058","volume-title":"Proceedings of the IEEE International Conference on Computer Vision","author":"Luo Jian-Hao","year":"2017","unstructured":"Jian-Hao Luo, Jianxin Wu, and Weiyao Lin. 2017. ThiNet: A filter level pruning method for deep neural network compression. In Proceedings of the IEEE International Conference on Computer Vision. 5058\u20135066."},{"key":"e_1_3_2_363_2","first-page":"1607","volume-title":"International Conference on Machine Learning","author":"Ding Xiaohan","year":"2019","unstructured":"Xiaohan Ding, Guiguang Ding, Yuchen Guo, Jungong Han, and Chenggang Yan. 2019. Approximated oracle filter pruning for destructive CNN width optimization. In International Conference on Machine Learning. PMLR, 1607\u20131616."},{"key":"e_1_3_2_364_2","article-title":"Runtime neural pruning","volume":"30","author":"Lin Ji","year":"2017","unstructured":"Ji Lin, Yongming Rao, Jiwen Lu, and Jie Zhou. 2017. Runtime neural pruning. Advances in Neural Information Processing Systems 30 (2017).","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_2_365_2","first-page":"2736","volume-title":"Proceedings of the IEEE International Conference on Computer Vision","author":"Liu Zhuang","year":"2017","unstructured":"Zhuang Liu, Jianguo Li, Zhiqiang Shen, Gao Huang, Shoumeng Yan, and Changshui Zhang. 2017. Learning efficient convolutional networks through network slimming. 
In Proceedings of the IEEE International Conference on Computer Vision. 2736\u20132744."},{"key":"e_1_3_2_366_2","article-title":"Gate Decorator: Global filter pruning method for accelerating deep convolutional neural networks","volume":"32","author":"You Zhonghui","year":"2019","unstructured":"Zhonghui You, Kun Yan, Jinmian Ye, Meng Ma, and Ping Wang. 2019. Gate Decorator: Global filter pruning method for accelerating deep convolutional neural networks. Advances in Neural Information Processing Systems 32 (2019).","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_2_367_2","first-page":"9865","article-title":"Neuron-level structured pruning using polarization regularizer","volume":"33","author":"Zhuang Tao","year":"2020","unstructured":"Tao Zhuang, Zhixuan Zhang, Yuheng Huang, Xiaoyi Zeng, Kai Shuang, and Xiang Li. 2020. Neuron-level structured pruning using polarization regularizer. Advances in Neural Information Processing Systems 33 (2020), 9865\u20139877.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_2_368_2","volume-title":"International Conference on Learning Representations","author":"Ye Jianbo","year":"2018","unstructured":"Jianbo Ye, Xin Lu, Zhe Lin, and James Z. Wang. 2018. Rethinking the smaller-norm-less-informative assumption in channel pruning of convolution layers. In International Conference on Learning Representations."},{"key":"e_1_3_2_369_2","first-page":"5122","volume-title":"International Conference on Machine Learning","author":"Kang Minsoo","year":"2020","unstructured":"Minsoo Kang and Bohyung Han. 2020. Operation-aware soft channel pruning using differentiable masks. In International Conference on Machine Learning. 
PMLR, 5122\u20135131."},{"key":"e_1_3_2_370_2","first-page":"784","volume-title":"Proceedings of the European Conference on Computer Vision (ECCV\u201918)","author":"He Yihui","year":"2018","unstructured":"Yihui He, Ji Lin, Zhijian Liu, Hanrui Wang, Li-Jia Li, and Song Han. 2018. AMC: AutoML for model compression and acceleration on mobile devices. In Proceedings of the European Conference on Computer Vision (ECCV\u201918). 784\u2013800."},{"key":"e_1_3_2_371_2","first-page":"6362","volume-title":"Proceedings of the IEEE\/CVF International Conference on Computer Vision","author":"Yu Sixing","year":"2021","unstructured":"Sixing Yu, Arya Mazaheri, and Ali Jannesari. 2021. Auto graph encoder-decoder for neural network pruning. In Proceedings of the IEEE\/CVF International Conference on Computer Vision. 6362\u20136372."},{"key":"e_1_3_2_372_2","first-page":"12349","volume-title":"Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition","author":"Alwani Manoj","year":"2022","unstructured":"Manoj Alwani, Yang Wang, and Vashisht Madhavan. 2022. DECORE: Deep compression with reinforcement learning. In Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition. 12349\u201312359."},{"key":"e_1_3_2_373_2","first-page":"3296","volume-title":"Proceedings of the IEEE\/CVF International Conference on Computer Vision","author":"Liu Zechun","year":"2019","unstructured":"Zechun Liu, Haoyuan Mu, Xiangyu Zhang, Zichao Guo, Xin Yang, Kwang-Ting Cheng, and Jian Sun. 2019. MetaPruning: Meta learning for automatic neural network channel pruning. In Proceedings of the IEEE\/CVF International Conference on Computer Vision.
3296\u20133305."},{"key":"e_1_3_2_374_2","first-page":"673","volume-title":"Proceedings of the 29th International Conference on International Joint Conferences on Artificial Intelligence","author":"Lin Mingbao","year":"2021","unstructured":"Mingbao Lin, Rongrong Ji, Yuxin Zhang, Baochang Zhang, Yongjian Wu, and Yonghong Tian. 2021. Channel pruning via automatic structure search. In Proceedings of the 29th International Conference on International Joint Conferences on Artificial Intelligence. 673\u2013679."},{"key":"e_1_3_2_375_2","article-title":"Sub-network multi-objective evolutionary algorithm for filter pruning","author":"Li Xuhua","year":"2022","unstructured":"Xuhua Li, Weize Sun, Lei Huang, and Shaowu Chen. 2022. Sub-network multi-objective evolutionary algorithm for filter pruning. arXiv preprint arXiv:2211.01957 (2022).","journal-title":"arXiv preprint arXiv:2211.01957"},{"key":"e_1_3_2_376_2","first-page":"608","volume-title":"Computer Vision\u2013ECCV 2020: 16th European Conference, Glasgow, UK, August 23\u201328, 2020, Proceedings, Part VIII 16","author":"Li Yawei","year":"2020","unstructured":"Yawei Li, Shuhang Gu, Kai Zhang, Luc Van Gool, and Radu Timofte. 2020. DHP: Differentiable meta pruning via hypernetworks. In Computer Vision\u2013ECCV 2020: 16th European Conference, Glasgow, UK, August 23\u201328, 2020, Proceedings, Part VIII 16. Springer, 608\u2013624."},{"key":"e_1_3_2_377_2","first-page":"1539","volume-title":"Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition","author":"Guo Shaopeng","year":"2020","unstructured":"Shaopeng Guo, Yujie Wang, Quanquan Li, and Junjie Yan. 2020. DMCP: Differentiable Markov channel pruning for neural networks. In Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition. 
1539\u20131547."},{"key":"e_1_3_2_378_2","doi-asserted-by":"crossref","first-page":"592","DOI":"10.1007\/978-3-030-58580-8_35","volume-title":"Computer Vision\u2013ECCV 2020: 16th European Conference, Glasgow, UK, August 23\u201328, 2020, Proceedings, Part III","author":"Ning Xuefei","year":"2020","unstructured":"Xuefei Ning, Tianchen Zhao, Wenshuo Li, Peng Lei, Yu Wang, and Huazhong Yang. 2020. DSA: More efficient budgeted pruning via differentiable sparsity allocation. In Computer Vision\u2013ECCV 2020: 16th European Conference, Glasgow, UK, August 23\u201328, 2020, Proceedings, Part III. Springer, 592\u2013607."},{"issue":"12","key":"e_1_3_2_379_2","doi-asserted-by":"crossref","first-page":"3048","DOI":"10.1109\/TPAMI.2018.2874634","article-title":"Shallowing deep networks: Layer-wise pruning based on feature representations","volume":"41","author":"Chen Shi","year":"2018","unstructured":"Shi Chen and Qi Zhao. 2018. Shallowing deep networks: Layer-wise pruning based on feature representations. IEEE Transactions on Pattern Analysis and Machine Intelligence 41, 12 (2018), 3048\u20133056.","journal-title":"IEEE Transactions on Pattern Analysis and Machine Intelligence"},{"key":"e_1_3_2_380_2","volume-title":"Proceedings of the Asian Conference on Computer Vision","author":"Elkerdawy Sara","year":"2020","unstructured":"Sara Elkerdawy, Mostafa Elhoushi, Abhineet Singh, Hong Zhang, and Nilanjan Ray. 2020. To filter prune, or to layer prune, that is the question. In Proceedings of the Asian Conference on Computer Vision."},{"key":"e_1_3_2_381_2","article-title":"SR-init: An interpretable layer pruning method","author":"Tang Hui","year":"2023","unstructured":"Hui Tang, Yao Lu, and Qi Xuan. 2023. SR-init: An interpretable layer pruning method. 
arXiv preprint arXiv:2303.07677 (2023).","journal-title":"arXiv preprint arXiv:2303.07677"},{"key":"e_1_3_2_382_2","doi-asserted-by":"crossref","first-page":"1172","DOI":"10.1109\/LSP.2022.3171128","article-title":"Layer pruning for obtaining shallower ResNets","volume":"29","author":"Zhang Ke","year":"2022","unstructured":"Ke Zhang and Guangzhe Liu. 2022. Layer pruning for obtaining shallower ResNets. IEEE Signal Processing Letters 29 (2022), 1172\u20131176.","journal-title":"IEEE Signal Processing Letters"},{"key":"e_1_3_2_383_2","article-title":"When layers play the lottery, all tickets win at initialization","author":"Jordao Artur","year":"2023","unstructured":"Artur Jordao, George Correa de Araujo, Helena de Almeida Maia, and Helio Pedrini. 2023. When layers play the lottery, all tickets win at initialization. arXiv preprint arXiv:2301.10835 (2023).","journal-title":"arXiv preprint arXiv:2301.10835"},{"key":"e_1_3_2_384_2","article-title":"Structured pruning for deep convolutional neural networks: A survey","author":"He Yang","year":"2023","unstructured":"Yang He and Lingao Xiao. 2023. Structured pruning for deep convolutional neural networks: A survey. arXiv preprint arXiv:2303.00566 (2023).","journal-title":"arXiv preprint arXiv:2303.00566"},{"key":"e_1_3_2_385_2","article-title":"Discrimination-aware channel pruning for deep neural networks","volume":"31","author":"Zhuang Zhuangwei","year":"2018","unstructured":"Zhuangwei Zhuang, Mingkui Tan, Bohan Zhuang, Jing Liu, Yong Guo, Qingyao Wu, Junzhou Huang, and Jinhui Zhu. 2018. Discrimination-aware channel pruning for deep neural networks. Advances in Neural Information Processing Systems 31 (2018).","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_2_386_2","first-page":"448","volume-title":"International Conference on Machine Learning","author":"Ioffe Sergey","year":"2015","unstructured":"Sergey Ioffe and Christian Szegedy. 2015. 
Batch normalization: Accelerating deep network training by reducing internal covariate shift. In International Conference on Machine Learning. PMLR, 448\u2013456."},{"key":"e_1_3_2_387_2","article-title":"Continuous control with deep reinforcement learning","author":"Lillicrap Timothy P.","year":"2015","unstructured":"Timothy P. Lillicrap, Jonathan J. Hunt, Alexander Pritzel, Nicolas Heess, Tom Erez, Yuval Tassa, David Silver, and Daan Wierstra. 2015. Continuous control with deep reinforcement learning. arXiv preprint arXiv:1509.02971 (2015).","journal-title":"arXiv preprint arXiv:1509.02971"},{"key":"e_1_3_2_388_2","first-page":"250","volume-title":"2024 29th Asia and South Pacific Design Automation Conference (ASP-DAC\u201924)","author":"Luo Xiangzhong","year":"2024","unstructured":"Xiangzhong Luo, Di Liu, Hao Kong, Shuo Huai, Hui Chen, Shiqing Li, Guochu Xiong, and Weichen Liu. 2024. Pearls hide behind linearity: Simplifying deep convolutional networks for embedded hardware systems via linearity grafting. In 2024 29th Asia and South Pacific Design Automation Conference (ASP-DAC\u201924). IEEE, 250\u2013255."},{"key":"e_1_3_2_389_2","article-title":"Domino-Pro-Max: Towards efficient network simplification and reparameterization for embedded hardware systems","author":"Luo Xiangzhong","year":"2024","unstructured":"Xiangzhong Luo, Di Liu, Hao Kong, Shuo Huai, Guochu Xiong, and Weichen Liu. 2024. Domino-Pro-Max: Towards efficient network simplification and reparameterization for embedded hardware systems. 
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (2024).","journal-title":"IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems"},{"key":"e_1_3_2_390_2","first-page":"1","volume-title":"Proceedings of the 41st IEEE\/ACM International Conference on Computer-Aided Design","author":"Kong Hao","year":"2022","unstructured":"Hao Kong, Di Liu, Shuo Huai, Xiangzhong Luo, Weichen Liu, Ravi Subramaniam, Christian Makaya, and Qian Lin. 2022. Smart scissor: Coupling spatial redundancy reduction and CNN compression for embedded hardware. In Proceedings of the 41st IEEE\/ACM International Conference on Computer-Aided Design. 1\u20139."},{"key":"e_1_3_2_391_2","first-page":"1","volume-title":"2023 60th ACM\/IEEE Design Automation Conference (DAC\u201923)","author":"Kong Hao","year":"2023","unstructured":"Hao Kong, Di Liu, Xiangzhong Luo, Shuo Huai, Ravi Subramaniam, Christian Makaya, Qian Lin, and Weichen Liu. 2023. Towards efficient convolutional neural network for embedded hardware via multi-dimensional pruning. In 2023 60th ACM\/IEEE Design Automation Conference (DAC\u201923). IEEE, 1\u20136."},{"key":"e_1_3_2_392_2","first-page":"1","volume-title":"2023 Design, Automation & Test in Europe Conference & Exhibition (DATE\u201923)","author":"Kong Hao","year":"2023","unstructured":"Hao Kong, Xiangzhong Luo, Shuo Huai, Di Liu, Ravi Subramaniam, Christian Makaya, Qian Lin, and Weichen Liu. 2023. EMNAPE: Efficient multi-dimensional neural architecture pruning for EdgeAI. In 2023 Design, Automation & Test in Europe Conference & Exhibition (DATE\u201923).
IEEE, 1\u20132."},{"issue":"12","key":"e_1_3_2_393_2","doi-asserted-by":"crossref","first-page":"4657","DOI":"10.1109\/TCAD.2023.3276938","article-title":"EdgeCompress: Coupling multidimensional model compression and dynamic inference for EdgeAI","volume":"42","author":"Kong Hao","year":"2023","unstructured":"Hao Kong, Di Liu, Shuo Huai, Xiangzhong Luo, Ravi Subramaniam, Christian Makaya, Qian Lin, and Weichen Liu. 2023. EdgeCompress: Coupling multidimensional model compression and dynamic inference for EdgeAI. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems 42, 12 (2023), 4657\u20134670.","journal-title":"IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems"},{"key":"e_1_3_2_394_2","first-page":"708","volume-title":"2022 27th Asia and South Pacific Design Automation Conference (ASP-DAC\u201922)","author":"Kong Hao","year":"2022","unstructured":"Hao Kong, Di Liu, Xiangzhong Luo, Weichen Liu, and Ravi Subramaniam. 2022. HACScale: Hardware-aware compound scaling for resource-efficient DNNs. In 2022 27th Asia and South Pacific Design Automation Conference (ASP-DAC\u201922). IEEE, 708\u2013713."},{"key":"e_1_3_2_395_2","article-title":"XNOR-Net++: Improved binary neural networks","author":"Bulat Adrian","year":"2019","unstructured":"Adrian Bulat and Georgios Tzimiropoulos. 2019. XNOR-Net++: Improved binary neural networks. arXiv preprint arXiv:1909.13863 (2019).","journal-title":"arXiv preprint arXiv:1909.13863"},{"key":"e_1_3_2_396_2","first-page":"722","volume-title":"Proceedings of the European Conference on Computer Vision (ECCV\u201918)","author":"Liu Zechun","year":"2018","unstructured":"Zechun Liu, Baoyuan Wu, Wenhan Luo, Xin Yang, Wei Liu, and Kwang-Ting Cheng. 2018. Bi-Real Net: Enhancing the performance of 1-bit CNNs with improved representational capability and advanced training algorithm. In Proceedings of the European Conference on Computer Vision (ECCV\u201918). 
722\u2013737."},{"key":"e_1_3_2_397_2","first-page":"2250","volume-title":"Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition","author":"Qin Haotong","year":"2020","unstructured":"Haotong Qin, Ruihao Gong, Xianglong Liu, Mingzhu Shen, Ziran Wei, Fengwei Yu, and Jingkuan Song. 2020. Forward and backward information retention for accurate binary neural networks. In Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition. 2250\u20132259."},{"key":"e_1_3_2_398_2","first-page":"7474","article-title":"Rotated binary neural network","volume":"33","author":"Lin Mingbao","year":"2020","unstructured":"Mingbao Lin, Rongrong Ji, Zihan Xu, Baochang Zhang, Yan Wang, Yongjian Wu, Feiyue Huang, and Chia-Wen Lin. 2020. Rotated binary neural network. Advances in Neural Information Processing Systems 33 (2020), 7474\u20137485.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_2_399_2","article-title":"SiMaN: Sign-to-magnitude network binarization","author":"Lin Mingbao","year":"2022","unstructured":"Mingbao Lin, Rongrong Ji, Zihan Xu, Baochang Zhang, Fei Chao, Chia-Wen Lin, and Ling Shao. 2022. SiMaN: Sign-to-magnitude network binarization. IEEE Transactions on Pattern Analysis and Machine Intelligence (2022).","journal-title":"IEEE Transactions on Pattern Analysis and Machine Intelligence"},{"key":"e_1_3_2_400_2","first-page":"6425","volume-title":"Proceedings of the IEEE\/CVF Winter Conference on Applications of Computer Vision","author":"Falkena Sieger","year":"2023","unstructured":"Sieger Falkena, Hadi Jamali-Rad, and Jan van Gemert. 2023. LAB: Learnable activation binarizer for binary neural networks. In Proceedings of the IEEE\/CVF Winter Conference on Applications of Computer Vision. 
6425\u20136434."},{"key":"e_1_3_2_401_2","first-page":"379","volume-title":"Computer Vision\u2013ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23\u201327, 2022, Proceedings, Part XI","author":"Tu Zhijun","year":"2022","unstructured":"Zhijun Tu, Xinghao Chen, Pengju Ren, and Yunhe Wang. 2022. AdaBin: Improving binary neural networks with adaptive binary sets. In Computer Vision\u2013ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23\u201327, 2022, Proceedings, Part XI. Springer, 379\u2013395."},{"key":"e_1_3_2_402_2","article-title":"Ternary weight networks","author":"Li Fengfu","year":"2016","unstructured":"Fengfu Li, Bin Liu, Xiaoxing Wang, Bo Zhang, and Junchi Yan. 2016. Ternary weight networks. arXiv preprint arXiv:1605.04711 (2016).","journal-title":"arXiv preprint arXiv:1605.04711"},{"key":"e_1_3_2_403_2","volume-title":"International Conference on Learning Representations","author":"Zhu Chenzhuo","year":"2017","unstructured":"Chenzhuo Zhu, Song Han, Huizi Mao, and William J. Dally. 2017. Trained ternary quantization. In International Conference on Learning Representations."},{"key":"e_1_3_2_404_2","doi-asserted-by":"crossref","first-page":"2547","DOI":"10.1109\/IJCNN.2017.7966166","volume-title":"2017 International Joint Conference on Neural Networks (IJCNN\u201917)","author":"Alemdar Hande","year":"2017","unstructured":"Hande Alemdar, Vincent Leroy, Adrien Prost-Boucle, and Fr\u00e9d\u00e9ric P\u00e9trot. 2017. Ternary neural networks for resource-efficient AI applications. In 2017 International Joint Conference on Neural Networks (IJCNN\u201917). IEEE, 2547\u20132554."},{"key":"e_1_3_2_405_2","article-title":"Ternary neural networks with fine-grained quantization","author":"Mellempudi Naveen","year":"2017","unstructured":"Naveen Mellempudi, Abhisek Kundu, Dheevatsa Mudigere, Dipankar Das, Bharat Kaul, and Pradeep Dubey. 2017. Ternary neural networks with fine-grained quantization. 
arXiv preprint arXiv:1705.01462 (2017).","journal-title":"arXiv preprint arXiv:1705.01462"},{"key":"e_1_3_2_406_2","first-page":"8538","volume-title":"Proceedings of the AAAI Conference on Artificial Intelligence","author":"Li Yue","year":"2021","unstructured":"Yue Li, Wenrui Ding, Chunlei Liu, Baochang Zhang, and Guodong Guo. 2021. TRQ: Ternary neural networks with residual quantization. In Proceedings of the AAAI Conference on Artificial Intelligence. 8538\u20138546."},{"key":"e_1_3_2_407_2","first-page":"4780","volume-title":"Proceedings of the AAAI Conference on Artificial Intelligence","author":"Li Yuhang","year":"2020","unstructured":"Yuhang Li, Xin Dong, Sai Qian Zhang, Haoli Bai, Yuanpeng Chen, and Wei Wang. 2020. RTN: Reparameterized ternary network. In Proceedings of the AAAI Conference on Artificial Intelligence. 4780\u20134787."},{"key":"e_1_3_2_408_2","first-page":"5219","volume-title":"Proceedings of the IEEE\/CVF International Conference on Computer Vision","author":"Chen Peng","year":"2021","unstructured":"Peng Chen, Bohan Zhuang, and Chunhua Shen. 2021. FATNN: Fast and accurate ternary neural networks. In Proceedings of the IEEE\/CVF International Conference on Computer Vision. 5219\u20135228."},{"key":"e_1_3_2_409_2","article-title":"Soft threshold ternary networks","author":"Xu Weixiang","year":"2022","unstructured":"Weixiang Xu, Xiangyu He, Tianli Zhao, Qinghao Hu, Peisong Wang, and Jian Cheng. 2022. Soft threshold ternary networks. arXiv preprint arXiv:2204.01234 (2022).","journal-title":"arXiv preprint arXiv:2204.01234"},{"key":"e_1_3_2_410_2","volume-title":"Deep Learning and Unsupervised Feature Learning Workshop, NIPS 2011","author":"Vanhoucke Vincent","year":"2011","unstructured":"Vincent Vanhoucke, Andrew Senior, and Mark Z. Mao. 2011. Improving the speed of neural networks on CPUs. 
In Deep Learning and Unsupervised Feature Learning Workshop, NIPS 2011."},{"key":"e_1_3_2_411_2","volume-title":"GPU Technology Conference","author":"Vanholder Han","year":"2016","unstructured":"Han Vanholder. 2016. Efficient inference with TensorRT. In GPU Technology Conference."},{"key":"e_1_3_2_412_2","doi-asserted-by":"crossref","first-page":"164245","DOI":"10.1109\/ACCESS.2021.3133100","article-title":"Performance evaluation of INT8 quantized inference on mobile GPUs","volume":"9","author":"Kim Sumin","year":"2021","unstructured":"Sumin Kim, Gunju Park, and Youngmin Yi. 2021. Performance evaluation of INT8 quantized inference on mobile GPUs. IEEE Access 9 (2021), 164245\u2013164255.","journal-title":"IEEE Access"},{"key":"e_1_3_2_413_2","article-title":"SpaceEvo: Hardware-friendly search space design for efficient INT8 inference","author":"Zhang Li Lyna","year":"2023","unstructured":"Li Lyna Zhang, Xudong Wang, Jiahang Xu, Quanlu Zhang, Yujing Wang, Yuqing Yang, Ningxin Zheng, Ting Cao, and Mao Yang. 2023. SpaceEvo: Hardware-friendly search space design for efficient INT8 inference. arXiv preprint arXiv:2303.08308 (2023).","journal-title":"arXiv preprint arXiv:2303.08308"},{"key":"e_1_3_2_414_2","article-title":"Efficient 8-bit quantization of transformer neural machine language translation model","author":"Bhandare Aishwarya","year":"2019","unstructured":"Aishwarya Bhandare, Vamsi Sripathi, Deepthi Karkada, Vivek Menon, Sun Choi, Kushal Datta, and Vikram Saletore. 2019. Efficient 8-bit quantization of transformer neural machine language translation model. arXiv preprint arXiv:1906.00532 (2019).","journal-title":"arXiv preprint arXiv:1906.00532"},{"key":"e_1_3_2_415_2","first-page":"2704","volume-title":"Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition","author":"Jacob Benoit","year":"2018","unstructured":"Benoit Jacob, Skirmantas Kligys, Bo Chen, Menglong Zhu, Matthew Tang, Andrew Howard, Hartwig Adam, and Dmitry Kalenichenko. 
2018. Quantization and training of neural networks for efficient integer-arithmetic-only inference. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2704\u20132713."},{"key":"e_1_3_2_416_2","first-page":"315","volume-title":"Proceedings of the European Conference on Computer Vision (ECCV\u201918)","author":"Wan Diwen","year":"2018","unstructured":"Diwen Wan, Fumin Shen, Li Liu, Fan Zhu, Jie Qin, Ling Shao, and Heng Tao Shen. 2018. TBN: Convolutional neural network with ternary inputs and binary weights. In Proceedings of the European Conference on Computer Vision (ECCV\u201918). 315\u2013332."},{"key":"e_1_3_2_417_2","first-page":"4300","volume-title":"Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition","author":"Faraone Julian","year":"2018","unstructured":"Julian Faraone, Nicholas Fraser, Michaela Blott, and Philip H. W. Leong. 2018. SYQ: Learning symmetric quantization for efficient deep neural networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 4300\u20134309."},{"key":"e_1_3_2_418_2","article-title":"PACT: Parameterized clipping activation for quantized neural networks","author":"Choi Jungwook","year":"2018","unstructured":"Jungwook Choi, Zhuo Wang, Swagath Venkataramani, Pierce I.-Jen Chuang, Vijayalakshmi Srinivasan, and Kailash Gopalakrishnan. 2018. PACT: Parameterized clipping activation for quantized neural networks. arXiv preprint arXiv:1805.06085 (2018).","journal-title":"arXiv preprint arXiv:1805.06085"},{"key":"e_1_3_2_419_2","article-title":"Mixed precision training","author":"Micikevicius Paulius","year":"2017","unstructured":"Paulius Micikevicius, Sharan Narang, Jonah Alben, Gregory Diamos, Erich Elsen, David Garcia, Boris Ginsburg, Michael Houston, Oleksii Kuchaiev, Ganesh Venkatesh, et\u00a0al. 2017. Mixed precision training. 
arXiv preprint arXiv:1710.03740 (2017).","journal-title":"arXiv preprint arXiv:1710.03740"},{"key":"e_1_3_2_420_2","volume-title":"International Conference on Learning Representations","author":"Das Dipankar","year":"2018","unstructured":"Dipankar Das, Naveen Mellempudi, Dheevatsa Mudigere, Dhiraj Kalamkar, Sasikanth Avancha, Kunal Banerjee, Srinivas Sridharan, Karthik Vaidyanathan, Bharat Kaul, Evangelos Georganas, et\u00a0al. 2018. Mixed precision training of convolutional neural networks using integer operations. In International Conference on Learning Representations."},{"key":"e_1_3_2_421_2","article-title":"Highly scalable deep learning training system with mixed-precision: Training ImageNet in four minutes","author":"Jia Xianyan","year":"2018","unstructured":"Xianyan Jia, Shutao Song, Wei He, Yangzihao Wang, Haidong Rong, Feihu Zhou, Liqiang Xie, Zhenyu Guo, Yuanzhou Yang, Liwei Yu, et\u00a0al. 2018. Highly scalable deep learning training system with mixed-precision: Training ImageNet in four minutes. arXiv preprint arXiv:1807.11205 (2018).","journal-title":"arXiv preprint arXiv:1807.11205"},{"key":"e_1_3_2_422_2","doi-asserted-by":"crossref","first-page":"41","DOI":"10.18653\/v1\/W18-2507","volume-title":"Proceedings of Workshop for NLP Open Source Software (NLP-OSS\u201918)","author":"Kuchaiev Oleksii","year":"2018","unstructured":"Oleksii Kuchaiev, Boris Ginsburg, Igor Gitman, Vitaly Lavrukhin, Carl Case, and Paulius Micikevicius. 2018. OpenSeq2Seq: Extensible toolkit for distributed and mixed precision training of sequence-to-sequence models. In Proceedings of Workshop for NLP Open Source Software (NLP-OSS\u201918). 41\u201346."},{"key":"e_1_3_2_423_2","first-page":"1969","volume-title":"Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition","author":"Zhu Feng","year":"2020","unstructured":"Feng Zhu, Ruihao Gong, Fengwei Yu, Xianglong Liu, Yanfei Wang, Zhelong Li, Xiuqi Yang, and Junjie Yan. 2020. 
Towards unified INT8 training for convolutional neural network. In Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition. 1969\u20131979."},{"key":"e_1_3_2_424_2","first-page":"3483","volume-title":"Proceedings of the AAAI Conference on Artificial Intelligence","author":"Zhao Kang","year":"2021","unstructured":"Kang Zhao, Sida Huang, Pan Pan, Yinghan Li, Yingya Zhang, Zhenyu Gu, and Yinghui Xu. 2021. Distribution adaptive INT8 quantization for training CNNs. In Proceedings of the AAAI Conference on Artificial Intelligence. 3483\u20133491."},{"key":"e_1_3_2_425_2","volume-title":"International Conference on Learning Representations","author":"Tailor Shyam A.","year":"2021","unstructured":"Shyam A. Tailor, Javier Fernandez-Marques, and Nicholas D. Lane. 2021. Degree-Quant: Quantization-aware training for graph neural networks. In International Conference on Learning Representations."},{"key":"e_1_3_2_426_2","first-page":"16318","volume-title":"International Conference on Machine Learning","author":"Nagel Markus","year":"2022","unstructured":"Markus Nagel, Marios Fournarakis, Yelysei Bondarenko, and Tijmen Blankevoort. 2022. Overcoming oscillations in quantization-aware training. In International Conference on Machine Learning. PMLR, 16318\u201316330."},{"key":"e_1_3_2_427_2","first-page":"19123","volume-title":"International Conference on Machine Learning","author":"Sakr Charbel","year":"2022","unstructured":"Charbel Sakr, Steve Dai, Rangha Venkatesan, Brian Zimmer, William Dally, and Brucek Khailany. 2022. Optimal clipping and magnitude-aware differentiation for improved quantization-aware training. In International Conference on Machine Learning. 
PMLR, 19123\u201319138."},{"key":"e_1_3_2_428_2","doi-asserted-by":"crossref","first-page":"208","DOI":"10.1007\/978-3-031-19775-8_13","volume-title":"Computer Vision\u2013ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23\u201327, 2022, Proceedings, Part XII","author":"Youn Jiseok","year":"2022","unstructured":"Jiseok Youn, Jaehun Song, Hyung-Sin Kim, and Saewoong Bahk. 2022. Bitwidth-adaptive quantization-aware neural network training: A meta-learning approach. In Computer Vision\u2013ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23\u201327, 2022, Proceedings, Part XII. Springer, 208\u2013224."},{"key":"e_1_3_2_429_2","article-title":"Mixed precision quantization of ConvNets via differentiable neural architecture search","author":"Wu Bichen","year":"2018","unstructured":"Bichen Wu, Yanghan Wang, Peizhao Zhang, Yuandong Tian, Peter Vajda, and Kurt Keutzer. 2018. Mixed precision quantization of ConvNets via differentiable neural architecture search. arXiv preprint arXiv:1812.00090 (2018).","journal-title":"arXiv preprint arXiv:1812.00090"},{"key":"e_1_3_2_430_2","first-page":"8612","volume-title":"Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition","author":"Wang Kuan","year":"2019","unstructured":"Kuan Wang, Zhijian Liu, Yujun Lin, Ji Lin, and Song Han. 2019. HAQ: Hardware-aware automated quantization with mixed precision. In Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition. 8612\u20138620."},{"key":"e_1_3_2_431_2","first-page":"293","volume-title":"Proceedings of the IEEE\/CVF International Conference on Computer Vision","author":"Dong Zhen","year":"2019","unstructured":"Zhen Dong, Zhewei Yao, Amir Gholami, Michael W. Mahoney, and Kurt Keutzer. 2019. HAWQ: Hessian aware quantization of neural networks with mixed-precision. In Proceedings of the IEEE\/CVF International Conference on Computer Vision. 
293\u2013302."},{"key":"e_1_3_2_432_2","first-page":"1","volume-title":"Computer Vision\u2013ECCV 2020: 16th European Conference, Glasgow, UK, August 23\u201328, 2020, Proceedings, Part IX 16","author":"Yu Haibao","year":"2020","unstructured":"Haibao Yu, Qi Han, Jianbo Li, Jianping Shi, Guangliang Cheng, and Bin Fan. 2020. Search what you want: Barrier penalty NAS for mixed precision quantization. In Computer Vision\u2013ECCV 2020: 16th European Conference, Glasgow, UK, August 23\u201328, 2020, Proceedings, Part IX 16. Springer, 1\u201316."},{"key":"e_1_3_2_433_2","first-page":"5350","volume-title":"Proceedings of the IEEE\/CVF International Conference on Computer Vision","author":"Chen Weihan","year":"2021","unstructured":"Weihan Chen, Peisong Wang, and Jian Cheng. 2021. Towards mixed-precision quantization of neural networks via constrained optimization. In Proceedings of the IEEE\/CVF International Conference on Computer Vision. 5350\u20135359."},{"key":"e_1_3_2_434_2","first-page":"2349","volume-title":"Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition","author":"Cai Zhaowei","year":"2020","unstructured":"Zhaowei Cai and Nuno Vasconcelos. 2020. Rethinking differentiable search for mixed-precision neural networks. In Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition. 2349\u20132358."},{"key":"e_1_3_2_435_2","first-page":"5291","volume-title":"Proceedings of the IEEE\/CVF International Conference on Computer Vision","author":"Wang Ziwei","year":"2021","unstructured":"Ziwei Wang, Han Xiao, Jiwen Lu, and Jie Zhou. 2021. Generalizable mixed-precision quantization via attribution rank preservation. In Proceedings of the IEEE\/CVF International Conference on Computer Vision.
5291\u20135300."},{"key":"e_1_3_2_436_2","doi-asserted-by":"crossref","first-page":"448","DOI":"10.1007\/978-3-030-58574-7_27","volume-title":"Computer Vision\u2013ECCV 2020: 16th European Conference, Glasgow, UK, August 23\u201328, 2020, Proceedings, Part XXVI 16","author":"Habi Hai Victor","year":"2020","unstructured":"Hai Victor Habi, Roy H. Jennings, and Arnon Netzer. 2020. HMQ: Hardware friendly mixed precision quantization block for CNNs. In Computer Vision\u2013ECCV 2020: 16th European Conference, Glasgow, UK, August 23\u201328, 2020, Proceedings, Part XXVI 16. Springer, 448\u2013463."},{"key":"e_1_3_2_437_2","first-page":"4091","article-title":"Searching for low-bit weights in quantized neural networks","volume":"33","author":"Yang Zhaohui","year":"2020","unstructured":"Zhaohui Yang, Yunhe Wang, Kai Han, Chunjing Xu, Chao Xu, Dacheng Tao, and Chang Xu. 2020. Searching for low-bit weights in quantized neural networks. Advances in Neural Information Processing Systems 33 (2020), 4091\u20134102.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_2_438_2","doi-asserted-by":"crossref","first-page":"236","DOI":"10.1109\/ISVLSI.2016.111","volume-title":"2016 IEEE Computer Society Annual Symposium on VLSI (ISVLSI\u201916)","author":"Andri Renzo","year":"2016","unstructured":"Renzo Andri, Lukas Cavigelli, Davide Rossi, and Luca Benini. 2016. YodaNN: An ultra-low power convolutional neural network accelerator based on binary weights. In 2016 IEEE Computer Society Annual Symposium on VLSI (ISVLSI\u201916). IEEE, 236\u2013241."},{"key":"e_1_3_2_439_2","first-page":"51","volume-title":"2018 28th International Conference on Field Programmable Logic and Applications (FPL\u201918)","author":"Guo Peng","year":"2018","unstructured":"Peng Guo, Hong Ma, Ruizhi Chen, Pin Li, Shaolin Xie, and Donglin Wang. 2018. FBNA: A fully binarized neural network accelerator. 
In 2018 28th International Conference on Field Programmable Logic and Applications (FPL\u201918). IEEE, 51\u2013513."},{"issue":"11","key":"e_1_3_2_440_2","doi-asserted-by":"crossref","first-page":"2940","DOI":"10.1109\/TCAD.2018.2857019","article-title":"XNOR neural engine: A hardware accelerator IP for 21.6-fJ\/op binary neural network inference","volume":"37","author":"Conti Francesco","year":"2018","unstructured":"Francesco Conti, Pasquale Davide Schiavone, and Luca Benini. 2018. XNOR neural engine: A hardware accelerator IP for 21.6-fJ\/op binary neural network inference. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems 37, 11 (2018), 2940\u20132951.","journal-title":"IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems"},{"issue":"7","key":"e_1_3_2_441_2","doi-asserted-by":"crossref","first-page":"1567","DOI":"10.1109\/TVLSI.2020.2993045","article-title":"TiM-DNN: Ternary in-memory accelerator for deep neural networks","volume":"28","author":"Jain Shubham","year":"2020","unstructured":"Shubham Jain, Sumeet Kumar Gupta, and Anand Raghunathan. 2020. TiM-DNN: Ternary in-memory accelerator for deep neural networks. IEEE Transactions on Very Large Scale Integration (VLSI) Systems 28, 7 (2020), 1567\u20131577.","journal-title":"IEEE Transactions on Very Large Scale Integration (VLSI) Systems"},{"issue":"4","key":"e_1_3_2_442_2","doi-asserted-by":"crossref","first-page":"1020","DOI":"10.1109\/TCAD.2021.3075420","article-title":"CUTIE: Beyond PetaOp\/s\/W ternary DNN inference acceleration with better-than-binary energy efficiency","volume":"41","author":"Scherer Moritz","year":"2021","unstructured":"Moritz Scherer, Georg Rutishauser, Lukas Cavigelli, and Luca Benini. 2021. CUTIE: Beyond PetaOp\/s\/W ternary DNN inference acceleration with better-than-binary energy efficiency. 
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems 41, 4 (2021), 1020\u20131033.","journal-title":"IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems"},{"key":"e_1_3_2_443_2","article-title":"FAT: An in-memory accelerator with fast addition for ternary weight neural networks","author":"Zhu Shien","year":"2022","unstructured":"Shien Zhu, Luan H. K. Duong, Hui Chen, Di Liu, and Weichen Liu. 2022. FAT: An in-memory accelerator with fast addition for ternary weight neural networks. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (2022).","journal-title":"IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems"},{"issue":"7","key":"e_1_3_2_444_2","first-page":"2925","article-title":"Exploiting retraining-based mixed-precision quantization for low-cost DNN accelerator design","volume":"32","author":"Kim Nahsung","year":"2020","unstructured":"Nahsung Kim, Dongyeob Shin, Wonseok Choi, Geonho Kim, and Jongsun Park. 2020. Exploiting retraining-based mixed-precision quantization for low-cost DNN accelerator design. IEEE Transactions on Neural Networks and Learning Systems 32, 7 (2020), 2925\u20132938.","journal-title":"IEEE Transactions on Neural Networks and Learning Systems"},{"key":"e_1_3_2_445_2","first-page":"134","volume-title":"Proceedings of the 2022 ACM\/SIGDA International Symposium on Field-Programmable Gate Arrays","author":"Sun Mengshu","year":"2022","unstructured":"Mengshu Sun, Zhengang Li, Alec Lu, Yanyu Li, Sung-En Chang, Xiaolong Ma, Xue Lin, and Zhenman Fang. 2022. FILM-QNN: Efficient FPGA acceleration of deep neural networks with intra-layer, mixed-precision quantization. In Proceedings of the 2022 ACM\/SIGDA International Symposium on Field-Programmable Gate Arrays. 
134\u2013145."},{"issue":"11","key":"e_1_3_2_446_2","doi-asserted-by":"crossref","first-page":"232","DOI":"10.1109\/LSSC.2019.2937440","article-title":"An energy-efficient sparse deep-neural-network learning accelerator with fine-grained mixed precision of FP8\u2013FP16","volume":"2","author":"Lee Jinsu","year":"2019","unstructured":"Jinsu Lee, Juhyoung Lee, Donghyeon Han, Jinmook Lee, Gwangtae Park, and Hoi-Jun Yoo. 2019. An energy-efficient sparse deep-neural-network learning accelerator with fine-grained mixed precision of FP8\u2013FP16. IEEE Solid-State Circuits Letters 2, 11 (2019), 232\u2013235.","journal-title":"IEEE Solid-State Circuits Letters"},{"key":"e_1_3_2_447_2","first-page":"372","volume-title":"Proceedings of the 26th Asia and South Pacific Design Automation Conference","author":"Huang Sitao","year":"2021","unstructured":"Sitao Huang, Aayush Ankit, Plinio Silveira, Rodrigo Antunes, Sai Rahul Chalamalasetti, Izzat El Hajj, Dong Eun Kim, Glaucimar Aguiar, Pedro Bruel, Sergey Serebryakov, et\u00a0al. 2021. Mixed precision quantization for ReRAM-based DNN inference accelerators. In Proceedings of the 26th Asia and South Pacific Design Automation Conference. 372\u2013377."},{"issue":"3","key":"e_1_3_2_448_2","doi-asserted-by":"crossref","first-page":"405","DOI":"10.1016\/0893-6080(91)90077-I","article-title":"Weight quantization in Boltzmann machines","volume":"4","author":"Balzer Wolfgang","year":"1991","unstructured":"Wolfgang Balzer, Masanobu Takahashi, Jun Ohta, and Kazuo Kyuma. 1991. Weight quantization in Boltzmann machines. Neural Networks 4, 3 (1991), 405\u2013409.","journal-title":"Neural Networks"},{"key":"e_1_3_2_449_2","doi-asserted-by":"crossref","first-page":"164","DOI":"10.1117\/12.20700","volume-title":"Optical Interconnections and Networks","author":"Fiesler Emile","year":"1990","unstructured":"Emile Fiesler, Amar Choudry, and H. John Caulfield. 1990. Weight discretization paradigm for optical neural networks. 
In Optical Interconnections and Networks, Vol. 1281. SPIE, 164\u2013173."},{"issue":"6","key":"e_1_3_2_450_2","doi-asserted-by":"crossref","first-page":"1446","DOI":"10.1109\/72.471364","article-title":"The effects of quantization on multilayer neural networks","volume":"6","author":"Dundar Gunhan","year":"1995","unstructured":"Gunhan Dundar and Kenneth Rose. 1995. The effects of quantization on multilayer neural networks. IEEE Transactions on Neural Networks 6, 6 (1995), 1446\u20131451.","journal-title":"IEEE Transactions on Neural Networks"},{"key":"e_1_3_2_451_2","doi-asserted-by":"crossref","first-page":"234","DOI":"10.1145\/3566097.3567856","volume-title":"Proceedings of the 28th Asia and South Pacific Design Automation Conference","author":"Huai Shuo","year":"2023","unstructured":"Shuo Huai, Di Liu, Xiangzhong Luo, Hui Chen, Weichen Liu, and Ravi Subramaniam. 2023. Crossbar-aligned & integer-only neural network compression for efficient in-memory acceleration. In Proceedings of the 28th Asia and South Pacific Design Automation Conference. 234\u2013239."},{"issue":"5","key":"e_1_3_2_452_2","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/3609115","article-title":"CRIMP: Compact & Reliable DNN Inference on In-Memory Processing via Crossbar-Aligned Compression and Non-ideality Adaptation","volume":"22","author":"Huai Shuo","year":"2023","unstructured":"Shuo Huai, Hao Kong, Xiangzhong Luo, Shiqing Li, Ravi Subramaniam, Christian Makaya, Qian Lin, and Weichen Liu. 2023. CRIMP: Compact & Reliable DNN Inference on In-Memory Processing via Crossbar-Aligned Compression and Non-ideality Adaptation. ACM Transactions on Embedded Computing Systems 22, 5s (2023), 1\u201325.","journal-title":"ACM Transactions on Embedded Computing Systems"},{"key":"e_1_3_2_453_2","article-title":"Contrastive representation distillation","author":"Tian Yonglong","year":"2019","unstructured":"Yonglong Tian, Dilip Krishnan, and Phillip Isola. 2019. 
Contrastive representation distillation. arXiv preprint arXiv:1910.10699 (2019).","journal-title":"arXiv preprint arXiv:1910.10699"},{"key":"e_1_3_2_454_2","doi-asserted-by":"crossref","first-page":"3247","DOI":"10.1109\/ICASSP40776.2020.9054157","volume-title":"ICASSP 2020-2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP\u201920)","author":"Hegde Srinidhi","year":"2020","unstructured":"Srinidhi Hegde, Ranjitha Prasad, Ramya Hebbalaguppe, and Vishwajeet Kumar. 2020. Variational student: Learning compact and sparser networks in knowledge distillation framework. In ICASSP 2020-2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP\u201920). IEEE, 3247\u20133251."},{"key":"e_1_3_2_455_2","doi-asserted-by":"crossref","first-page":"25","DOI":"10.1016\/j.neucom.2021.04.102","article-title":"Preparing lessons: Improve knowledge distillation with better supervision","volume":"454","author":"Wen Tiancheng","year":"2021","unstructured":"Tiancheng Wen, Shenqi Lai, and Xueming Qian. 2021. Preparing lessons: Improve knowledge distillation with better supervision. Neurocomputing 454 (2021), 25\u201333.","journal-title":"Neurocomputing"},{"key":"e_1_3_2_456_2","first-page":"4794","volume-title":"Proceedings of the IEEE\/CVF International Conference on Computer Vision","author":"Cho Jang Hyun","year":"2019","unstructured":"Jang Hyun Cho and Bharath Hariharan. 2019. On the efficacy of knowledge distillation. In Proceedings of the IEEE\/CVF International Conference on Computer Vision. 4794\u20134802."},{"key":"e_1_3_2_457_2","first-page":"5191","volume-title":"Proceedings of the AAAI Conference on Artificial Intelligence","author":"Mirzadeh Seyed Iman","year":"2020","unstructured":"Seyed Iman Mirzadeh, Mehrdad Farajtabar, Ang Li, Nir Levine, Akihiro Matsukawa, and Hassan Ghasemzadeh. 2020. Improved knowledge distillation via teacher assistant. In Proceedings of the AAAI Conference on Artificial Intelligence. 
5191\u20135198."},{"key":"e_1_3_2_458_2","first-page":"10925","volume-title":"Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition","author":"Beyer Lucas","year":"2022","unstructured":"Lucas Beyer, Xiaohua Zhai, Am\u00e9lie Royer, Larisa Markeeva, Rohan Anil, and Alexander Kolesnikov. 2022. Knowledge distillation: A good teacher is patient and consistent. In Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition. 10925\u201310934."},{"key":"e_1_3_2_459_2","first-page":"1910","volume-title":"Proceedings of the IEEE International Conference on Computer Vision","author":"Li Yuncheng","year":"2017","unstructured":"Yuncheng Li, Jianchao Yang, Yale Song, Liangliang Cao, Jiebo Luo, and Li-Jia Li. 2017. Learning from noisy labels with distillation. In Proceedings of the IEEE International Conference on Computer Vision. 1910\u20131918."},{"key":"e_1_3_2_460_2","first-page":"10687","volume-title":"Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition","author":"Xie Qizhe","year":"2020","unstructured":"Qizhe Xie, Minh-Thang Luong, Eduard Hovy, and Quoc V. Le. 2020. Self-training with noisy student improves ImageNet classification. In Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition. 10687\u201310698."},{"key":"e_1_3_2_461_2","first-page":"12075","volume-title":"Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition","author":"Hong Guanzhe","year":"2021","unstructured":"Guanzhe Hong, Zhiyuan Mao, Xiaojun Lin, and Stanley H. Chan. 2021. Student-teacher learning from clean inputs to noisy inputs. In Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition. 
12075\u201312084."},{"key":"e_1_3_2_462_2","first-page":"4133","volume-title":"Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition","author":"Yim Junho","year":"2017","unstructured":"Junho Yim, Donggyu Joo, Jihoon Bae, and Junmo Kim. 2017. A gift from knowledge distillation: Fast optimization, network minimization and transfer learning. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 4133\u20134141."},{"key":"e_1_3_2_463_2","article-title":"Paraphrasing complex network: Network compression via factor transfer","volume":"31","author":"Kim Jangho","year":"2018","unstructured":"Jangho Kim, SeongUk Park, and Nojun Kwak. 2018. Paraphrasing complex network: Network compression via factor transfer. Advances in Neural Information Processing Systems 31 (2018).","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_2_464_2","first-page":"9163","volume-title":"Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition","author":"Ahn Sungsoo","year":"2019","unstructured":"Sungsoo Ahn, Shell Xu Hu, Andreas Damianou, Neil D. Lawrence, and Zhenwen Dai. 2019. Variational information distillation for knowledge transfer. In Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition. 9163\u20139171."},{"key":"e_1_3_2_465_2","first-page":"1365","volume-title":"Proceedings of the IEEE\/CVF International Conference on Computer Vision","author":"Tung Frederick","year":"2019","unstructured":"Frederick Tung and Greg Mori. 2019. Similarity-preserving knowledge distillation. In Proceedings of the IEEE\/CVF International Conference on Computer Vision. 1365\u20131374."},{"key":"e_1_3_2_466_2","article-title":"BERT-of-Theseus: Compressing BERT by progressive module replacing","author":"Xu Canwen","year":"2020","unstructured":"Canwen Xu, Wangchunshu Zhou, Tao Ge, Furu Wei, and Ming Zhou. 2020. BERT-of-Theseus: Compressing BERT by progressive module replacing. 
arXiv preprint arXiv:2002.02925 (2020).","journal-title":"arXiv preprint arXiv:2002.02925"},{"key":"e_1_3_2_467_2","article-title":"Channel distillation: Channel-wise attention for knowledge distillation","author":"Zhou Zaida","year":"2020","unstructured":"Zaida Zhou, Chaoran Zhuge, Xinwei Guan, and Wen Liu. 2020. Channel distillation: Channel-wise attention for knowledge distillation. arXiv preprint arXiv:2006.01683 (2020).","journal-title":"arXiv preprint arXiv:2006.01683"},{"key":"e_1_3_2_468_2","article-title":"Mean teachers are better role models: Weight-averaged consistency targets improve semi-supervised deep learning results","volume":"30","author":"Tarvainen Antti","year":"2017","unstructured":"Antti Tarvainen and Harri Valpola. 2017. Mean teachers are better role models: Weight-averaged consistency targets improve semi-supervised deep learning results. Advances in Neural Information Processing Systems 30 (2017).","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_2_469_2","first-page":"1285","volume-title":"Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining","author":"You Shan","year":"2017","unstructured":"Shan You, Chang Xu, Chao Xu, and Dacheng Tao. 2017. Learning from multiple teacher networks. In Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. 1285\u20131294."},{"key":"e_1_3_2_470_2","article-title":"Deep model compression: Distilling knowledge from noisy teachers","author":"Sau Bharat Bhusan","year":"2016","unstructured":"Bharat Bhusan Sau and Vineeth N. Balasubramanian. 2016. Deep model compression: Distilling knowledge from noisy teachers. arXiv preprint arXiv:1610.09650 (2016).","journal-title":"arXiv preprint arXiv:1610.09650"},{"key":"e_1_3_2_471_2","article-title":"Collaborative learning for deep neural networks","volume":"31","author":"Song Guocong","year":"2018","unstructured":"Guocong Song and Wei Chai. 2018. 
Collaborative learning for deep neural networks. Advances in Neural Information Processing Systems 31 (2018).","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_2_472_2","doi-asserted-by":"crossref","first-page":"690","DOI":"10.1145\/3336191.3371792","volume-title":"Proceedings of the 13th International Conference on Web Search and Data Mining","author":"Yang Ze","year":"2020","unstructured":"Ze Yang, Linjun Shou, Ming Gong, Wutao Lin, and Daxin Jiang. 2020. Model compression with two-stage multi-teacher knowledge distillation for web question answering system. In Proceedings of the 13th International Conference on Web Search and Data Mining. 690\u2013698."},{"key":"e_1_3_2_473_2","article-title":"Knowledge distillation by on-the-fly native ensemble","volume":"31","author":"Zhu Xiatian","year":"2018","unstructured":"Xiatian Zhu, Shaogang Gong, et\u00a0al. 2018. Knowledge distillation by on-the-fly native ensemble. Advances in Neural Information Processing Systems 31 (2018).","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_2_474_2","first-page":"3697","volume-title":"Interspeech","author":"Fukuda Takashi","year":"2017","unstructured":"Takashi Fukuda, Masayuki Suzuki, Gakuto Kurata, Samuel Thomas, Jia Cui, and Bhuvana Ramabhadran. 2017. Efficient knowledge distillation from an ensemble of teachers. In Interspeech. 3697\u20133701."},{"key":"e_1_3_2_475_2","doi-asserted-by":"crossref","first-page":"247","DOI":"10.1007\/978-3-030-58558-7_15","volume-title":"Computer Vision\u2013ECCV 2020: 16th European Conference, Glasgow, UK, August 23\u201328, 2020, Proceedings, Part V 16","author":"Xiang Liuyu","year":"2020","unstructured":"Liuyu Xiang, Guiguang Ding, and Jungong Han. 2020. Learning from multiple experts: Self-paced knowledge distillation for long-tailed classification. In Computer Vision\u2013ECCV 2020: 16th European Conference, Glasgow, UK, August 23\u201328, 2020, Proceedings, Part V 16. 
Springer, 247\u2013263."},{"key":"e_1_3_2_476_2","first-page":"4320","volume-title":"Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition","author":"Zhang Ying","year":"2018","unstructured":"Ying Zhang, Tao Xiang, Timothy M. Hospedales, and Huchuan Lu. 2018. Deep mutual learning. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 4320\u20134328."},{"key":"e_1_3_2_477_2","article-title":"Moonshine: Distilling with cheap convolutions","volume":"31","author":"Crowley Elliot J.","year":"2018","unstructured":"Elliot J. Crowley, Gavin Gray, and Amos J. Storkey. 2018. Moonshine: Distilling with cheap convolutions. Advances in Neural Information Processing Systems 31 (2018).","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_2_478_2","first-page":"3713","volume-title":"Proceedings of the IEEE\/CVF International Conference on Computer Vision","author":"Zhang Linfeng","year":"2019","unstructured":"Linfeng Zhang, Jiebo Song, Anni Gao, Jingwei Chen, Chenglong Bao, and Kaisheng Ma. 2019. Be your own teacher: Improve the performance of convolutional neural networks via self distillation. In Proceedings of the IEEE\/CVF International Conference on Computer Vision. 3713\u20133722."},{"key":"e_1_3_2_479_2","first-page":"3351","article-title":"Self-distillation amplifies regularization in Hilbert space","volume":"33","author":"Mobahi Hossein","year":"2020","unstructured":"Hossein Mobahi, Mehrdad Farajtabar, and Peter Bartlett. 2020. Self-distillation amplifies regularization in Hilbert space. Advances in Neural Information Processing Systems 33 (2020), 3351\u20133361.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_2_480_2","first-page":"13876","volume-title":"Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition","author":"Yun Sukmin","year":"2020","unstructured":"Sukmin Yun, Jongjin Park, Kimin Lee, and Jinwoo Shin. 2020. 
Regularizing class-wise predictions via self-knowledge distillation. In Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition. 13876\u201313885."},{"key":"e_1_3_2_481_2","first-page":"10664","volume-title":"Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition","author":"Ji Mingi","year":"2021","unstructured":"Mingi Ji, Seungjae Shin, Seunghyun Hwang, Gibeom Park, and Il-Chul Moon. 2021. Refine myself by teaching myself: Feature refinement via self-knowledge distillation. In Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition. 10664\u201310673."},{"key":"e_1_3_2_482_2","article-title":"Self-distillation with batch knowledge ensembling improves ImageNet classification","author":"Ge Yixiao","year":"2021","unstructured":"Yixiao Ge, Xiao Zhang, Ching Lam Choi, Ka Chun Cheung, Peipei Zhao, Feng Zhu, Xiaogang Wang, Rui Zhao, and Hongsheng Li. 2021. Self-distillation with batch knowledge ensembling improves ImageNet classification. arXiv preprint arXiv:2104.13298 (2021).","journal-title":"arXiv preprint arXiv:2104.13298"},{"issue":"61","key":"e_1_3_2_483_2","first-page":"2023","article-title":"Learning using privileged information: Similarity control and knowledge transfer","volume":"16","author":"Vapnik Vladimir","year":"2015","unstructured":"Vladimir Vapnik and Rauf Izmailov. 2015. Learning using privileged information: Similarity control and knowledge transfer. Journal of Machine Learning Research 16, 61 (2015), 2023\u20132049.","journal-title":"Journal of Machine Learning Research"},{"key":"e_1_3_2_484_2","volume-title":"International Conference on Learning Representations","author":"Lopez-Paz David","year":"2016","unstructured":"David Lopez-Paz, L\u00e9on Bottou, Bernhard Sch\u00f6lkopf, and Vladimir Vapnik. 2016. Unifying distillation and privileged information. 
In International Conference on Learning Representations."},{"key":"e_1_3_2_485_2","doi-asserted-by":"crossref","first-page":"108741","DOI":"10.1016\/j.patcog.2022.108741","article-title":"Progressive privileged knowledge distillation for online action detection","volume":"129","author":"Zhao Peisen","year":"2022","unstructured":"Peisen Zhao, Lingxi Xie, Jiajie Wang, Ya Zhang, and Qi Tian. 2022. Progressive privileged knowledge distillation for online action detection. Pattern Recognition 129 (2022), 108741.","journal-title":"Pattern Recognition"},{"key":"e_1_3_2_486_2","doi-asserted-by":"crossref","first-page":"1369","DOI":"10.1145\/3292500.3330907","volume-title":"Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining","author":"Tang Fengyi","year":"2019","unstructured":"Fengyi Tang, Cao Xiao, Fei Wang, Jiayu Zhou, and Li-wei H. Lehman. 2019. Retaining privileged information for multi-task learning. In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. 1369\u20131377."},{"issue":"3","key":"e_1_3_2_487_2","doi-asserted-by":"crossref","first-page":"786","DOI":"10.1109\/TPAMI.2019.2942592","article-title":"Adversarial distillation for learning with privileged provisions","volume":"43","author":"Wang Xiaojie","year":"2019","unstructured":"Xiaojie Wang, Rui Zhang, Yu Sun, and Jianzhong Qi. 2019. Adversarial distillation for learning with privileged provisions. IEEE Transactions on Pattern Analysis and Machine Intelligence 43, 3 (2019), 786\u2013797.","journal-title":"IEEE Transactions on Pattern Analysis and Machine Intelligence"},{"key":"e_1_3_2_488_2","first-page":"2590","volume-title":"Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining","author":"Xu Chen","year":"2020","unstructured":"Chen Xu, Quan Li, Junfeng Ge, Jinyang Gao, Xiaoyong Yang, Changhua Pei, Fei Sun, Jian Wu, Hanxiao Sun, and Wenwu Ou. 2020. 
Privileged features distillation at Taobao recommendations. In Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. 2590\u20132598."},{"key":"e_1_3_2_489_2","first-page":"3514","volume-title":"Proceedings of the IEEE\/CVF International Conference on Computer Vision","author":"Chen Hanting","year":"2019","unstructured":"Hanting Chen, Yunhe Wang, Chang Xu, Zhaohui Yang, Chuanjian Liu, Boxin Shi, Chunjing Xu, Chao Xu, and Qi Tian. 2019. Data-free learning of student networks. In Proceedings of the IEEE\/CVF International Conference on Computer Vision. 3514\u20133522."},{"key":"e_1_3_2_490_2","article-title":"Data-free adversarial distillation","author":"Fang Gongfan","year":"2019","unstructured":"Gongfan Fang, Jie Song, Chengchao Shen, Xinchao Wang, Da Chen, and Mingli Song. 2019. Data-free adversarial distillation. arXiv preprint arXiv:1912.11006 (2019).","journal-title":"arXiv preprint arXiv:1912.11006"},{"key":"e_1_3_2_491_2","first-page":"3340","volume-title":"2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP\u201921)","author":"Qu Xiaoyang","year":"2021","unstructured":"Xiaoyang Qu, Jianzong Wang, and Jing Xiao. 2021. Enhancing data-free adversarial distillation with activation regularization and virtual interpolation. In 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP\u201921). IEEE, 3340\u20133344."},{"key":"e_1_3_2_492_2","first-page":"1","article-title":"Dual discriminator adversarial distillation for data-free model compression","author":"Zhao Haoran","year":"2022","unstructured":"Haoran Zhao, Xin Sun, Junyu Dong, Milos Manic, Huiyu Zhou, and Hui Yu. 2022. Dual discriminator adversarial distillation for data-free model compression. 
International Journal of Machine Learning and Cybernetics (2022), 1\u201318.","journal-title":"International Journal of Machine Learning and Cybernetics"},{"key":"e_1_3_2_493_2","article-title":"Data-free adversarial knowledge distillation for graph neural networks","author":"Zhuang Yuanxin","year":"2022","unstructured":"Yuanxin Zhuang, Lingjuan Lyu, Chuan Shi, Carl Yang, and Lichao Sun. 2022. Data-free adversarial knowledge distillation for graph neural networks. arXiv preprint arXiv:2205.03811 (2022).","journal-title":"arXiv preprint arXiv:2205.03811"},{"key":"e_1_3_2_494_2","first-page":"7852","volume-title":"Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition","author":"Zhang Yiman","year":"2021","unstructured":"Yiman Zhang, Hanting Chen, Xinghao Chen, Yiping Deng, Chunjing Xu, and Yunhe Wang. 2021. Data-free knowledge distillation for image super-resolution. In Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition. 7852\u20137861."},{"key":"e_1_3_2_495_2","article-title":"Contrastive model inversion for data-free knowledge distillation","author":"Fang Gongfan","year":"2021","unstructured":"Gongfan Fang, Jie Song, Xinchao Wang, Chengchao Shen, Xingen Wang, and Mingli Song. 2021. Contrastive model inversion for data-free knowledge distillation. arXiv preprint arXiv:2105.08584 (2021).","journal-title":"arXiv preprint arXiv:2105.08584"},{"key":"e_1_3_2_496_2","article-title":"Knowledge distillation using unlabeled mismatched images","author":"Kulkarni Mandar","year":"2017","unstructured":"Mandar Kulkarni, Kalpesh Patil, and Shirish Karande. 2017. Knowledge distillation using unlabeled mismatched images. 
arXiv preprint arXiv:1703.07131 (2017).","journal-title":"arXiv preprint arXiv:1703.07131"},{"key":"e_1_3_2_497_2","first-page":"3662","volume-title":"Proceedings of the IEEE\/CVF International Conference on Computer Vision","author":"Liu Qing","year":"2019","unstructured":"Qing Liu, Lingxi Xie, Huiyu Wang, and Alan L. Yuille. 2019. Semantic-aware knowledge preservation for zero-shot sketch-based image retrieval. In Proceedings of the IEEE\/CVF International Conference on Computer Vision. 3662\u20133671."},{"key":"e_1_3_2_498_2","first-page":"14639","volume-title":"Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition","author":"Li Tianhong","year":"2020","unstructured":"Tianhong Li, Jianguo Li, Zhuang Liu, and Changshui Zhang. 2020. Few sample knowledge distillation for efficient network compression. In Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition. 14639\u201314647."},{"key":"e_1_3_2_499_2","article-title":"Few-shot learning of neural networks from scratch by pseudo example optimization","author":"Kimura Akisato","year":"2018","unstructured":"Akisato Kimura, Zoubin Ghahramani, Koh Takeuchi, Tomoharu Iwata, and Naonori Ueda. 2018. Few-shot learning of neural networks from scratch by pseudo example optimization. arXiv preprint arXiv:1802.03039 (2018).","journal-title":"arXiv preprint arXiv:1802.03039"},{"key":"e_1_3_2_500_2","first-page":"3203","volume-title":"Proceedings of the AAAI Conference on Artificial Intelligence","author":"Bai Haoli","year":"2020","unstructured":"Haoli Bai, Jiaxiang Wu, Irwin King, and Michael Lyu. 2020. Few shot network compression via cross distillation. In Proceedings of the AAAI Conference on Artificial Intelligence. 
3203\u20133210."},{"key":"e_1_3_2_501_2","first-page":"701","volume-title":"Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition","author":"Wang Huanyu","year":"2022","unstructured":"Huanyu Wang, Junjie Liu, Xin Ma, Yang Yong, Zhenhua Chai, and Jianxin Wu. 2022. Compressing models with few samples: Mimicking then replacing. In Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition. 701\u2013710."},{"key":"e_1_3_2_502_2","doi-asserted-by":"crossref","first-page":"535","DOI":"10.1145\/1150402.1150464","volume-title":"Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining","author":"Bucilu\u01ce Cristian","year":"2006","unstructured":"Cristian Bucilu\u01ce, Rich Caruana, and Alexandru Niculescu-Mizil. 2006. Model compression. In Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. 535\u2013541."},{"key":"e_1_3_2_503_2","article-title":"Learning efficient object detection models with knowledge distillation","volume":"30","author":"Chen Guobin","year":"2017","unstructured":"Guobin Chen, Wongun Choi, Xiang Yu, Tony Han, and Manmohan Chandraker. 2017. Learning efficient object detection models with knowledge distillation. Advances in Neural Information Processing Systems 30 (2017).","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_2_504_2","first-page":"1","volume-title":"2020 IEEE International Conference on Multimedia and Expo (ICME\u201920)","author":"Luo Xiangzhong","year":"2020","unstructured":"Xiangzhong Luo, H. K. Luan Duong, and Weichen Liu. 2020. Person re-identification via pose-aware multi-semantic learning. In 2020 IEEE International Conference on Multimedia and Expo (ICME\u201920). 
IEEE, 1\u20136."},{"key":"e_1_3_2_505_2","volume-title":"Advances in Neural Information Processing Systems","author":"Goodfellow Ian","year":"2014","unstructured":"Ian Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, and Yoshua Bengio. 2014. Generative adversarial nets. In Advances in Neural Information Processing Systems."},{"key":"e_1_3_2_506_2","first-page":"7539","volume-title":"Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition","author":"Liu Yu","year":"2020","unstructured":"Yu Liu, Xuhui Jia, Mingxing Tan, Raviteja Vemulapalli, Yukun Zhu, Bradley Green, and Xiaogang Wang. 2020. Search to distill: Pearls are everywhere but not the eyes. In Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition. 7539\u20137548."},{"key":"e_1_3_2_507_2","article-title":"DisWOT: Student architecture search for distillation WithOut training","author":"Dong Peijie","year":"2023","unstructured":"Peijie Dong, Lujun Li, and Zimian Wei. 2023. DisWOT: Student architecture search for distillation WithOut training. arXiv preprint arXiv:2303.15678 (2023).","journal-title":"arXiv preprint arXiv:2303.15678"},{"issue":"1","key":"e_1_3_2_508_2","first-page":"75","article-title":"AutoML for architecting efficient and specialized neural networks","volume":"40","author":"Cai Han","year":"2019","unstructured":"Han Cai, Ji Lin, Yujun Lin, Zhijian Liu, Kuan Wang, Tianzhe Wang, Ligeng Zhu, and Song Han. 2019. AutoML for architecting efficient and specialized neural networks. 
IEEE Micro 40, 1 (2019), 75\u201382.","journal-title":"IEEE Micro"},{"key":"e_1_3_2_509_2","first-page":"374","article-title":"Towards federated learning at scale: System design","author":"Bonawitz Keith","year":"2019","unstructured":"Keith Bonawitz, Hubert Eichner, Wolfgang Grieskamp, Dzmitry Huba, Alex Ingerman, Vladimir Ivanov, Chloe Kiddon, Jakub Kone\u010dn\u00fd, Stefano Mazzocchi, Brendan McMahan, et\u00a0al. 2019. Towards federated learning at scale: System design. Proceedings of Machine Learning and Systems (2019), 374\u2013388.","journal-title":"Proceedings of Machine Learning and Systems"},{"key":"e_1_3_2_510_2","first-page":"2810","volume-title":"Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition","author":"Li Rundong","year":"2019","unstructured":"Rundong Li, Yan Wang, Feng Liang, Hongwei Qin, Junjie Yan, and Rui Fan. 2019. Fully quantized network for object detection. In Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition. 2810\u20132819."},{"key":"e_1_3_2_511_2","doi-asserted-by":"crossref","first-page":"400","DOI":"10.1016\/j.neucom.2020.03.056","article-title":"Localization-aware channel pruning for object detection","volume":"403","author":"Xie Zihao","year":"2020","unstructured":"Zihao Xie, Li Zhu, Lin Zhao, Bo Tao, Liman Liu, and Wenbing Tao. 2020. Localization-aware channel pruning for object detection. Neurocomputing 403 (2020), 400\u2013408.","journal-title":"Neurocomputing"},{"key":"e_1_3_2_512_2","unstructured":"PyTorch. 2021. Automatic Mixed Precision. 
Retrieved from https:\/\/pytorch.org\/blog\/accelerating-training-on-nvidia-gpus-with-pytorch-automatic-mixed-precision\/ (2021)."},{"key":"e_1_3_2_513_2","first-page":"517","article-title":"MicroNets: Neural network architectures for deploying TinyML applications on commodity microcontrollers","volume":"3","author":"Banbury Colby","year":"2021","unstructured":"Colby Banbury, Chuteng Zhou, Igor Fedorov, Ramon Matas, Urmish Thakker, Dibakar Gope, Vijay Janapa Reddi, Matthew Mattina, and Paul Whatmough. 2021. MicroNets: Neural network architectures for deploying TinyML applications on commodity microcontrollers. Proceedings of Machine Learning and Systems 3 (2021), 517\u2013532.","journal-title":"Proceedings of Machine Learning and Systems"},{"key":"e_1_3_2_514_2","first-page":"11711","article-title":"MCUNet: Tiny deep learning on IoT devices","volume":"33","author":"Lin Ji","year":"2020","unstructured":"Ji Lin, Wei-Ming Chen, Yujun Lin, Chuang Gan, Song Han, et\u00a0al. 2020. MCUNet: Tiny deep learning on IoT devices. Advances in Neural Information Processing Systems 33 (2020), 11711\u201311722.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_2_515_2","first-page":"2346","article-title":"Memory-efficient patch-based inference for tiny deep learning","volume":"34","author":"Lin Ji","year":"2021","unstructured":"Ji Lin, Wei-Ming Chen, Han Cai, Chuang Gan, and Song Han. 2021. Memory-efficient patch-based inference for tiny deep learning. Advances in Neural Information Processing Systems 34 (2021), 2346\u20132358.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_2_516_2","volume-title":"Proceedings of the AAAI Conference on Artificial Intelligence","author":"Xu Kunran","year":"2022","unstructured":"Kunran Xu, Yishi Li, Huawei Zhang, Rui Lai, and Lin Gu. 2022. EtinyNet: Extremely tiny network for TinyML. 
In Proceedings of the AAAI Conference on Artificial Intelligence."},{"key":"e_1_3_2_517_2","article-title":"Training deep nets with sublinear memory cost","author":"Chen Tianqi","year":"2016","unstructured":"Tianqi Chen, Bing Xu, Chiyuan Zhang, and Carlos Guestrin. 2016. Training deep nets with sublinear memory cost. arXiv preprint arXiv:1604.06174 (2016).","journal-title":"arXiv preprint arXiv:1604.06174"},{"key":"e_1_3_2_518_2","first-page":"11433","volume-title":"Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition","author":"Feng Jianwei","year":"2021","unstructured":"Jianwei Feng and Dong Huang. 2021. Optimal gradient checkpoint search for arbitrary computation graphs. In Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition. 11433\u201311442."},{"key":"e_1_3_2_519_2","first-page":"2930","article-title":"Sketch-GNN: Scalable graph neural networks with sublinear training complexity","volume":"35","author":"Ding Mucong","year":"2022","unstructured":"Mucong Ding, Tahseen Rabbani, Bang An, Evan Wang, and Furong Huang. 2022. Sketch-GNN: Scalable graph neural networks with sublinear training complexity. Advances in Neural Information Processing Systems 35 (2022), 2930\u20132943.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_2_520_2","first-page":"322","volume-title":"Computer Vision\u2013ECCV 2020: 16th European Conference, Glasgow, UK, August 23\u201328, 2020, Proceedings, Part XXV 16","author":"Ye Xucheng","year":"2020","unstructured":"Xucheng Ye, Pengcheng Dai, Junyu Luo, Xin Guo, Yingjie Qi, Jianlei Yang, and Yiran Chen. 2020. Accelerating CNN training by pruning activation gradients. In Computer Vision\u2013ECCV 2020: 16th European Conference, Glasgow, UK, August 23\u201328, 2020, Proceedings, Part XXV 16. 
Springer, 322\u2013338."},{"key":"e_1_3_2_521_2","article-title":"Efficient on-device training via gradient filtering","author":"Yang Yuedong","year":"2023","unstructured":"Yuedong Yang, Guihong Li, and Radu Marculescu. 2023. Efficient on-device training via gradient filtering. arXiv preprint arXiv:2301.00330 (2023).","journal-title":"arXiv preprint arXiv:2301.00330"},{"key":"e_1_3_2_522_2","article-title":"Dynamic sparse graph for efficient deep learning","author":"Liu Liu","year":"2019","unstructured":"Liu Liu, Lei Deng, Xing Hu, Maohua Zhu, Guoqi Li, Yufei Ding, and Yuan Xie. 2019. Dynamic sparse graph for efficient deep learning. International Conference on Learning Representations (2019).","journal-title":"International Conference on Learning Representations"},{"key":"e_1_3_2_523_2","first-page":"1737","volume-title":"International Conference on Machine Learning","author":"Gupta Suyog","year":"2015","unstructured":"Suyog Gupta, Ankur Agrawal, Kailash Gopalakrishnan, and Pritish Narayanan. 2015. Deep learning with limited numerical precision. In International Conference on Machine Learning. PMLR, 1737\u20131746."},{"key":"e_1_3_2_524_2","first-page":"177","volume-title":"USENIX Annual Technical Conference","author":"Zhou Qihua","year":"2021","unstructured":"Qihua Zhou, Song Guo, Zhihao Qu, Jingcai Guo, Zhenda Xu, Jiewei Zhang, Tao Guo, Boyuan Luo, and Jingren Zhou. 2021. Octo: INT8 training with loss-aware compensation and backward quantization for tiny on-device learning. In USENIX Annual Technical Conference. 177\u2013191."},{"key":"e_1_3_2_525_2","doi-asserted-by":"crossref","first-page":"789","DOI":"10.1109\/JETCAS.2021.3121554","article-title":"A TinyML platform for on-device continual learning with quantized latent replays","author":"Ravaglia Leonardo","year":"2021","unstructured":"Leonardo Ravaglia, Manuele Rusci, Davide Nadalini, Alessandro Capotondi, Francesco Conti, and Luca Benini. 2021. 
A TinyML platform for on-device continual learning with quantized latent replays. IEEE Journal on Emerging and Selected Topics in Circuits and Systems (2021), 789\u2013802.","journal-title":"IEEE Journal on Emerging and Selected Topics in Circuits and Systems"},{"key":"e_1_3_2_526_2","article-title":"Online continual learning for embedded devices","author":"Hayes Tyler L.","year":"2022","unstructured":"Tyler L. Hayes and Christopher Kanan. 2022. Online continual learning for embedded devices. arXiv preprint arXiv:2203.10681 (2022).","journal-title":"arXiv preprint arXiv:2203.10681"},{"key":"e_1_3_2_527_2","article-title":"Continual learning at the edge: Real-time training on smartphone devices","author":"Pellegrini Lorenzo","year":"2021","unstructured":"Lorenzo Pellegrini, Vincenzo Lomonaco, Gabriele Graffieti, and Davide Maltoni. 2021. Continual learning at the edge: Real-time training on smartphone devices. arXiv preprint arXiv:2105.13127 (2021).","journal-title":"arXiv preprint arXiv:2105.13127"},{"key":"e_1_3_2_528_2","article-title":"Continual learning on the edge with TensorFlow Lite","author":"Demosthenous Giorgos","year":"2021","unstructured":"Giorgos Demosthenous and Vassilis Vassiliades. 2021. Continual learning on the edge with TensorFlow Lite. arXiv preprint arXiv:2105.01946 (2021).","journal-title":"arXiv preprint arXiv:2105.01946"},{"key":"e_1_3_2_529_2","article-title":"Continual learning for on-device environmental sound classification","author":"Xiao Yang","year":"2022","unstructured":"Yang Xiao, Xubo Liu, James King, Arshdeep Singh, Eng Siong Chng, Mark D. Plumbley, and Wenwu Wang. 2022. Continual learning for on-device environmental sound classification. arXiv preprint arXiv:2207.07429 (2022).","journal-title":"arXiv preprint arXiv:2207.07429"},{"key":"e_1_3_2_530_2","first-page":"319","volume-title":"2021 IEEE\/ACM Symposium on Edge Computing (SEC\u201921)","author":"Kwon Young D.","year":"2021","unstructured":"Young D. 
Kwon, Jagmohan Chauhan, Abhishek Kumar, Pan Hui, and Cecilia Mascolo. 2021. Exploring system performance of continual learning for mobile and embedded sensing applications. In 2021 IEEE\/ACM Symposium on Edge Computing (SEC\u201921). IEEE, 319\u2013332."},{"key":"e_1_3_2_531_2","first-page":"1","volume-title":"2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP\u201923)","author":"Diwan Anuj","year":"2023","unstructured":"Anuj Diwan, Ching-Feng Yeh, Wei-Ning Hsu, Paden Tomasello, Eunsol Choi, David Harwath, and Abdelrahman Mohamed. 2023. Continual learning for on-device speech recognition using disentangled conformers. In 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP\u201923). IEEE, 1\u20135."},{"key":"e_1_3_2_532_2","first-page":"1","volume-title":"2022 IEEE 13th International Green and Sustainable Computing Conference (IGSC\u201922)","author":"Dequino Alberto","year":"2022","unstructured":"Alberto Dequino, Francesco Conti, and Luca Benini. 2022. ViT-LR: Pushing the envelope for transformer-based on-device embedded continual learning. In 2022 IEEE 13th International Green and Sustainable Computing Conference (IGSC\u201922). IEEE, 1\u20136."},{"key":"e_1_3_2_533_2","first-page":"1","volume-title":"2020 57th ACM\/IEEE Design Automation Conference (DAC\u201920)","author":"Shin Jaekang","year":"2020","unstructured":"Jaekang Shin, Seungkyu Choi, Yeongjae Choi, and Lee-Sup Kim. 2020. A pragmatic approach to on-device incremental learning system with selective weight updates. In 2020 57th ACM\/IEEE Design Automation Conference (DAC\u201920). IEEE, 1\u20136."},{"key":"e_1_3_2_534_2","first-page":"538","volume-title":"2022 27th Asia and South Pacific Design Automation Conference (ASP-DAC\u201922)","author":"Wang Ze-Han","year":"2022","unstructured":"Ze-Han Wang, Zhenli He, Hui Fang, Yi-Xiong Huang, Ying Sun, Yu Yang, Zhi-Yuan Zhang, and Di Liu. 2022. 
Efficient on-device incremental learning by weight freezing. In 2022 27th Asia and South Pacific Design Automation Conference (ASP-DAC\u201922). IEEE, 538\u2013543."},{"key":"e_1_3_2_535_2","doi-asserted-by":"crossref","first-page":"31","DOI":"10.1145\/3212725.3212728","volume-title":"Proceedings of the 2nd International Workshop on Embedded and Mobile Deep Learning","author":"Sundaramoorthy Prahalathan","year":"2018","unstructured":"Prahalathan Sundaramoorthy, Gautham Krishna Gudur, Manav Rajiv Moorthy, R. Nidhi Bhandari, and Vineeth Vijayaraghavan. 2018. HARNet: Towards on-device incremental learning using deep ensembles on constrained devices. In Proceedings of the 2nd International Workshop on Embedded and Mobile Deep Learning. 31\u201336."},{"key":"e_1_3_2_536_2","first-page":"4109","volume-title":"Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition","author":"Cui Yin","year":"2018","unstructured":"Yin Cui, Yang Song, Chen Sun, Andrew Howard, and Serge Belongie. 2018. Large scale fine-grained categorization and domain-specific transfer learning. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 4109\u20134118."},{"key":"e_1_3_2_537_2","first-page":"2661","volume-title":"Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition","author":"Kornblith Simon","year":"2019","unstructured":"Simon Kornblith, Jonathon Shlens, and Quoc V. Le. 2019. Do better ImageNet models transfer better?. In Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition. 2661\u20132671."},{"key":"e_1_3_2_538_2","article-title":"K for the price of 1: Parameter-efficient multi-task and transfer learning","author":"Mudrakarta Pramod Kaushik","year":"2019","unstructured":"Pramod Kaushik Mudrakarta, Mark Sandler, Andrey Zhmoginov, and Andrew Howard. 2019. K for the price of 1: Parameter-efficient multi-task and transfer learning. 
International Conference on Learning Representations (2019).","journal-title":"International Conference on Learning Representations"},{"key":"e_1_3_2_539_2","article-title":"Training BatchNorm and only BatchNorm: On the expressive power of random features in CNNs","author":"Frankle Jonathan","year":"2021","unstructured":"Jonathan Frankle, David J. Schwab, and Ari S. Morcos. 2021. Training BatchNorm and only BatchNorm: On the expressive power of random features in CNNs. International Conference on Learning Representations (2021).","journal-title":"International Conference on Learning Representations"},{"key":"e_1_3_2_540_2","first-page":"338","volume-title":"Medical Imaging with Deep Learning","author":"Kanavati Fahdi","year":"2021","unstructured":"Fahdi Kanavati and Masayuki Tsuneki. 2021. Partial transfusion: On the expressive influence of trainable batch norm parameters for transfer learning. In Medical Imaging with Deep Learning. PMLR, 338\u2013353."},{"key":"e_1_3_2_541_2","first-page":"9109","volume-title":"Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition","author":"Yazdanpanah Moslem","year":"2022","unstructured":"Moslem Yazdanpanah, Aamer Abdul Rahman, Muawiz Chaudhary, Christian Desrosiers, Mohammad Havaei, Eugene Belilovsky, and Samira Ebrahimi Kahou. 2022. Revisiting learnable affines for batch norm in few-shot transfer learning. In Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition. 9109\u20139118."},{"key":"e_1_3_2_542_2","first-page":"1273","volume-title":"Artificial Intelligence and Statistics","author":"McMahan Brendan","year":"2017","unstructured":"Brendan McMahan, Eider Moore, Daniel Ramage, Seth Hampson, and Blaise Aguera y Arcas. 2017. Communication-efficient learning of deep networks from decentralized data. In Artificial Intelligence and Statistics. 
PMLR, 1273\u20131282."},{"key":"e_1_3_2_543_2","article-title":"Expanding the reach of federated learning by reducing client resource requirements","author":"Caldas Sebastian","year":"2018","unstructured":"Sebastian Caldas, Jakub Kone\u010dny, H. Brendan McMahan, and Ameet Talwalkar. 2018. Expanding the reach of federated learning by reducing client resource requirements. arXiv preprint arXiv:1812.07210 (2018).","journal-title":"arXiv preprint arXiv:1812.07210"},{"key":"e_1_3_2_544_2","article-title":"Deep gradient compression: Reducing the communication bandwidth for distributed training","author":"Lin Yujun","year":"2018","unstructured":"Yujun Lin, Song Han, Huizi Mao, Yu Wang, and William J. Dally. 2018. Deep gradient compression: Reducing the communication bandwidth for distributed training. International Conference on Learning Representations (2018).","journal-title":"International Conference on Learning Representations"},{"key":"e_1_3_2_545_2","first-page":"29995","article-title":"Delayed gradient averaging: Tolerate the communication latency for federated learning","volume":"34","author":"Zhu Ligeng","year":"2021","unstructured":"Ligeng Zhu, Hongzhou Lin, Yao Lu, Yujun Lin, and Song Han. 2021. Delayed gradient averaging: Tolerate the communication latency for federated learning. Advances in Neural Information Processing Systems 34 (2021), 29995\u201330007.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_2_546_2","first-page":"4348","volume-title":"2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP\u201922)","author":"Yang Tien-Ju","year":"2022","unstructured":"Tien-Ju Yang, Dhruv Guliani, Fran\u00e7oise Beaufays, and Giovanni Motta. 2022. Partial variable training for efficient on-device federated learning. In 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP\u201922). 
IEEE, 4348\u20134352."},{"key":"e_1_3_2_547_2","article-title":"On-device training: A first overview on existing systems","author":"Zhu Shuai","year":"2022","unstructured":"Shuai Zhu, Thiemo Voigt, JeongGil Ko, and Fatemeh Rahimian. 2022. On-device training: A first overview on existing systems. arXiv preprint arXiv:2212.00824 (2022).","journal-title":"arXiv preprint arXiv:2212.00824"},{"key":"e_1_3_2_548_2","volume-title":"International Conference on Learning Representations","author":"Cai Han","year":"2022","unstructured":"Han Cai, Chuang Gan, Ji Lin, and Song Han. 2022. Network augmentation for tiny deep learning. In International Conference on Learning Representations."},{"key":"e_1_3_2_549_2","article-title":"Return of the devil in the details: Delving deep into convolutional nets","author":"Chatfield Ken","year":"2014","unstructured":"Ken Chatfield, Karen Simonyan, Andrea Vedaldi, and Andrew Zisserman. 2014. Return of the devil in the details: Delving deep into convolutional nets. arXiv preprint arXiv:1405.3531 (2014).","journal-title":"arXiv preprint arXiv:1405.3531"},{"key":"e_1_3_2_550_2","first-page":"627","volume-title":"2022 IEEE 40th International Conference on Computer Design (ICCD\u201922)","author":"Huai Shuo","year":"2022","unstructured":"Shuo Huai, Di Liu, Hao Kong, Xiangzhong Luo, Weichen Liu, Ravi Subramaniam, Christian Makaya, and Qian Lin. 2022. Collate: Collaborative neural network learning for latency-critical edge systems. In 2022 IEEE 40th International Conference on Computer Design (ICCD\u201922). IEEE, 627\u2013634."},{"key":"e_1_3_2_551_2","article-title":"Federated learning for mobile keyboard prediction","author":"Hard Andrew","year":"2018","unstructured":"Andrew Hard, Kanishka Rao, Rajiv Mathews, Swaroop Ramaswamy, Fran\u00e7oise Beaufays, Sean Augenstein, Hubert Eichner, Chlo\u00e9 Kiddon, and Daniel Ramage. 2018. Federated learning for mobile keyboard prediction. 
arXiv preprint arXiv:1811.03604 (2018).","journal-title":"arXiv preprint arXiv:1811.03604"},{"issue":"1","key":"e_1_3_2_552_2","first-page":"1","article-title":"Federated learning and differential privacy for medical image analysis","volume":"12","author":"Adnan Mohammed","year":"2022","unstructured":"Mohammed Adnan, Shivam Kalra, Jesse C. Cresswell, Graham W. Taylor, and Hamid R. Tizhoosh. 2022. Federated learning and differential privacy for medical image analysis. Scientific Reports 12, 1 (2022), 1\u201310.","journal-title":"Scientific Reports"},{"issue":"4","key":"e_1_3_2_553_2","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/3501813","article-title":"Federated learning for healthcare: Systematic review and architecture proposal","volume":"13","author":"Antunes Rodolfo Stoffel","year":"2022","unstructured":"Rodolfo Stoffel Antunes, Cristiano Andr\u00e9 da Costa, Arne K\u00fcderle, Imrana Abdullahi Yari, and Bj\u00f6rn Eskofier. 2022. Federated learning for healthcare: Systematic review and architecture proposal. ACM Transactions on Intelligent Systems and Technology (TIST) 13, 4 (2022), 1\u201323.","journal-title":"ACM Transactions on Intelligent Systems and Technology (TIST)"},{"key":"e_1_3_2_554_2","first-page":"101","volume-title":"2021 IEEE\/ACM 43rd International Conference on Software Engineering: Software Engineering in Practice (ICSE-SEIP\u201921)","author":"Huang Yujin","year":"2021","unstructured":"Yujin Huang, Han Hu, and Chunyang Chen. 2021. Robustness of on-device models: Adversarial attack to deep learning models on Android apps. In 2021 IEEE\/ACM 43rd International Conference on Software Engineering: Software Engineering in Practice (ICSE-SEIP\u201921). 
IEEE, 101\u2013110."},{"issue":"1","key":"e_1_3_2_555_2","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/3469032","article-title":"DeepMTD: Moving target defense for deep visual sensing against adversarial examples","volume":"18","author":"Song Qun","year":"2021","unstructured":"Qun Song, Zhenyu Yan, and Rui Tan. 2021. DeepMTD: Moving target defense for deep visual sensing against adversarial examples. ACM Transactions on Sensor Networks (TOSN) 18, 1 (2021), 1\u201332.","journal-title":"ACM Transactions on Sensor Networks (TOSN)"},{"key":"e_1_3_2_556_2","doi-asserted-by":"crossref","first-page":"124","DOI":"10.1145\/3356250.3360025","volume-title":"Proceedings of the 17th Conference on Embedded Networked Sensor Systems","author":"Song Qun","year":"2019","unstructured":"Qun Song, Zhenyu Yan, and Rui Tan. 2019. Moving target defense for embedded deep visual sensing against adversarial examples. In Proceedings of the 17th Conference on Embedded Networked Sensor Systems. 124\u2013137."},{"issue":"140","key":"e_1_3_2_557_2","first-page":"1","article-title":"Exploring the limits of transfer learning with a unified text-to-text transformer","volume":"21","author":"Raffel Colin","year":"2020","unstructured":"Colin Raffel, Noam Shazeer, Adam Roberts, Katherine Lee, Sharan Narang, Michael Matena, Yanqi Zhou, Wei Li, and Peter J. Liu. 2020. Exploring the limits of transfer learning with a unified text-to-text transformer. Journal of Machine Learning Research 21, 140 (2020), 1\u201367.","journal-title":"Journal of Machine Learning Research"},{"key":"e_1_3_2_558_2","article-title":"BART: Denoising sequence-to-sequence pre-training for natural language generation, translation, and comprehension","author":"Lewis Mike","year":"2019","unstructured":"Mike Lewis, Yinhan Liu, Naman Goyal, Marjan Ghazvininejad, Abdelrahman Mohamed, Omer Levy, Ves Stoyanov, and Luke Zettlemoyer. 2019. 
BART: Denoising sequence-to-sequence pre-training for natural language generation, translation, and comprehension. arXiv preprint arXiv:1910.13461 (2019).","journal-title":"arXiv preprint arXiv:1910.13461"},{"key":"e_1_3_2_559_2","first-page":"21665","article-title":"Fast transformers with clustered attention","volume":"33","author":"Vyas Apoorv","year":"2020","unstructured":"Apoorv Vyas, Angelos Katharopoulos, and Fran\u00e7ois Fleuret. 2020. Fast transformers with clustered attention. Advances in Neural Information Processing Systems 33 (2020), 21665\u201321674.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_2_560_2","first-page":"14138","volume-title":"Proceedings of the AAAI Conference on Artificial Intelligence","volume":"35","author":"Xiong Yunyang","year":"2021","unstructured":"Yunyang Xiong, Zhanpeng Zeng, Rudrasis Chakraborty, Mingxing Tan, Glenn Fung, Yin Li, and Vikas Singh. 2021. Nystr\u00f6mformer: A Nystr\u00f6m-based algorithm for approximating self-attention. In Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 35. 14138\u201314148."},{"key":"e_1_3_2_561_2","first-page":"40605","volume-title":"International Conference on Machine Learning","author":"Zandieh Amir","year":"2023","unstructured":"Amir Zandieh, Insu Han, Majid Daliri, and Amin Karbasi. 2023. KDEformer: Accelerating transformers via kernel density estimation. In International Conference on Machine Learning. PMLR, 40605\u201340623."},{"key":"e_1_3_2_562_2","article-title":"Mega: Moving average equipped gated attention","author":"Ma Xuezhe","year":"2022","unstructured":"Xuezhe Ma, Chunting Zhou, Xiang Kong, Junxian He, Liangke Gui, Graham Neubig, Jonathan May, and Luke Zettlemoyer. 2022. Mega: Moving average equipped gated attention. 
arXiv preprint arXiv:2209.10655 (2022).","journal-title":"arXiv preprint arXiv:2209.10655"},{"key":"e_1_3_2_563_2","first-page":"72","volume-title":"Topological, Algebraic and Geometric Learning Workshops 2023","author":"Alberti Silas","year":"2023","unstructured":"Silas Alberti, Niclas Dern, Laura Thesing, and Gitta Kutyniok. 2023. Sumformer: Universal approximation for efficient transformers. In Topological, Algebraic and Geometric Learning Workshops 2023. PMLR, 72\u201386."},{"key":"e_1_3_2_564_2","article-title":"FLuRKA: Fast fused Low-Rank & Kernel Attention","author":"Gupta Ahan","year":"2023","unstructured":"Ahan Gupta, Yueming Yuan, Yanqi Zhou, and Charith Mendis. 2023. FLuRKA: Fast fused Low-Rank & Kernel Attention. arXiv preprint arXiv:2306.15799 (2023).","journal-title":"arXiv preprint arXiv:2306.15799"},{"key":"e_1_3_2_565_2","first-page":"328","volume-title":"2020 IEEE International Symposium on High Performance Computer Architecture (HPCA\u201920)","author":"Ham Tae Jun","year":"2020","unstructured":"Tae Jun Ham, Sung Jun Jung, Seonghak Kim, Young H. Oh, Yeonhong Park, Yoonho Song, Jung-Hun Park, Sanghee Lee, Kyoung Park, Jae W. Lee, et\u00a0al. 2020. A^3: Accelerating attention mechanisms in neural networks with approximation. In 2020 IEEE International Symposium on High Performance Computer Architecture (HPCA\u201920). IEEE, 328\u2013341."},{"key":"e_1_3_2_566_2","first-page":"692","volume-title":"2021 ACM\/IEEE 48th Annual International Symposium on Computer Architecture (ISCA\u201921)","author":"Ham Tae Jun","year":"2021","unstructured":"Tae Jun Ham, Yejin Lee, Seong Hoon Seo, Soosung Kim, Hyunji Choi, Sung Jun Jung, and Jae W. Lee. 2021. ELSA: Hardware-software co-design for efficient, lightweight self-attention mechanism in neural networks. In 2021 ACM\/IEEE 48th Annual International Symposium on Computer Architecture (ISCA\u201921). 
IEEE, 692\u2013705."},{"key":"e_1_3_2_567_2","article-title":"Gated linear attention transformers with hardware-efficient training","author":"Yang Songlin","year":"2023","unstructured":"Songlin Yang, Bailin Wang, Yikang Shen, Rameswar Panda, and Yoon Kim. 2023. Gated linear attention transformers with hardware-efficient training. arXiv preprint arXiv:2312.06635 (2023).","journal-title":"arXiv preprint arXiv:2312.06635"},{"key":"e_1_3_2_568_2","first-page":"10323","volume-title":"International Conference on Machine Learning","author":"Frantar Elias","year":"2023","unstructured":"Elias Frantar and Dan Alistarh. 2023. SparseGPT: Massive language models can be accurately pruned in one-shot. In International Conference on Machine Learning. PMLR, 10323\u201310337."},{"key":"e_1_3_2_569_2","article-title":"SliceGPT: Compress large language models by deleting rows and columns","author":"Ashkboos Saleh","year":"2024","unstructured":"Saleh Ashkboos, Maximilian L. Croci, Marcelo Gennari do Nascimento, Torsten Hoefler, and James Hensman. 2024. SliceGPT: Compress large language models by deleting rows and columns. arXiv preprint arXiv:2401.15024 (2024).","journal-title":"arXiv preprint arXiv:2401.15024"},{"key":"e_1_3_2_570_2","article-title":"ReLU strikes back: Exploiting activation sparsity in large language models","author":"Mirzadeh Iman","year":"2023","unstructured":"Iman Mirzadeh, Keivan Alizadeh, Sachin Mehta, Carlo C. Del Mundo, Oncel Tuzel, Golnoosh Samei, Mohammad Rastegari, and Mehrdad Farajtabar. 2023. ReLU strikes back: Exploiting activation sparsity in large language models. arXiv preprint arXiv:2310.04564 (2023).","journal-title":"arXiv preprint arXiv:2310.04564"},{"key":"e_1_3_2_571_2","first-page":"2134","volume-title":"Uncertainty in Artificial Intelligence","author":"Thangarasa Vithursan","year":"2023","unstructured":"Vithursan Thangarasa, Abhay Gupta, William Marshall, Tianda Li, Kevin Leong, Dennis DeCoste, Sean Lie, and Shreyas Saxena. 2023. 
SPDF: Sparse pre-training and dense fine-tuning for large language models. In Uncertainty in Artificial Intelligence. PMLR, 2134\u20132146."},{"key":"e_1_3_2_572_2","article-title":"Scaling sparse fine-tuning to large language models","author":"Ansell Alan","year":"2024","unstructured":"Alan Ansell, Ivan Vuli\u0107, Hannah Sterz, Anna Korhonen, and Edoardo M. Ponti. 2024. Scaling sparse fine-tuning to large language models. arXiv preprint arXiv:2401.16405 (2024).","journal-title":"arXiv preprint arXiv:2401.16405"},{"key":"e_1_3_2_573_2","article-title":"Sparse finetuning for inference acceleration of large language models","author":"Kurtic Eldar","year":"2023","unstructured":"Eldar Kurtic, Denis Kuznedelev, Elias Frantar, Michael Goin, and Dan Alistarh. 2023. Sparse finetuning for inference acceleration of large language models. arXiv preprint arXiv:2310.06927 (2023).","journal-title":"arXiv preprint arXiv:2310.06927"},{"key":"e_1_3_2_574_2","first-page":"10865","volume-title":"Proceedings of the AAAI Conference on Artificial Intelligence","volume":"38","author":"An Yongqi","year":"2024","unstructured":"Yongqi An, Xu Zhao, Tao Yu, Ming Tang, and Jinqiao Wang. 2024. Fluctuation-based adaptive structured pruning for large language models. In Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 38. 10865\u201310873."},{"key":"e_1_3_2_575_2","article-title":"ZipLM: Inference-aware structured pruning of language models","volume":"36","author":"Kurti\u0107 Eldar","year":"2024","unstructured":"Eldar Kurti\u0107, Elias Frantar, and Dan Alistarh. 2024. ZipLM: Inference-aware structured pruning of language models. 
Advances in Neural Information Processing Systems 36 (2024).","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_2_576_2","article-title":"LoRAShear: Efficient large language model structured pruning and knowledge recovery","author":"Chen Tianyi","year":"2023","unstructured":"Tianyi Chen, Tianyu Ding, Badal Yadav, Ilya Zharkov, and Luming Liang. 2023. LoRAShear: Efficient large language model structured pruning and knowledge recovery. arXiv preprint arXiv:2310.18356 (2023).","journal-title":"arXiv preprint arXiv:2310.18356"},{"key":"e_1_3_2_577_2","article-title":"Sheared LLaMA: Accelerating language model pre-training via structured pruning","author":"Xia Mengzhou","year":"2023","unstructured":"Mengzhou Xia, Tianyu Gao, Zhiyuan Zeng, and Danqi Chen. 2023. Sheared LLaMA: Accelerating language model pre-training via structured pruning. arXiv preprint arXiv:2310.06694 (2023).","journal-title":"arXiv preprint arXiv:2310.06694"},{"key":"e_1_3_2_578_2","article-title":"Compressing large language models by streamlining the unimportant layer","author":"Chen Xiaodong","year":"2024","unstructured":"Xiaodong Chen, Yuxuan Hu, and Jing Zhang. 2024. Compressing large language models by streamlining the unimportant layer. arXiv preprint arXiv:2403.19135 (2024).","journal-title":"arXiv preprint arXiv:2403.19135"},{"key":"e_1_3_2_579_2","article-title":"ShortGPT: Layers in large language models are more redundant than you expect","author":"Men Xin","year":"2024","unstructured":"Xin Men, Mingyu Xu, Qingyu Zhang, Bingning Wang, Hongyu Lin, Yaojie Lu, Xianpei Han, and Weipeng Chen. 2024. ShortGPT: Layers in large language models are more redundant than you expect. 
arXiv preprint arXiv:2403.03853 (2024).","journal-title":"arXiv preprint arXiv:2403.03853"},{"key":"e_1_3_2_580_2","article-title":"Shortened LLaMA: A simple depth pruning for large language models","author":"Kim Bo-Kyeong","year":"2024","unstructured":"Bo-Kyeong Kim, Geonmin Kim, Tae-Ho Kim, Thibault Castells, Shinkook Choi, Junho Shin, and Hyoung-Kyu Song. 2024. Shortened LLaMA: A simple depth pruning for large language models. arXiv preprint arXiv:2402.02834 (2024).","journal-title":"arXiv preprint arXiv:2402.02834"},{"key":"e_1_3_2_581_2","article-title":"SPQR: A sparse-quantized representation for near-lossless LLM weight compression","author":"Dettmers Tim","year":"2023","unstructured":"Tim Dettmers, Ruslan Svirschevski, Vage Egiazarian, Denis Kuznedelev, Elias Frantar, Saleh Ashkboos, Alexander Borzunov, Torsten Hoefler, and Dan Alistarh. 2023. SPQR: A sparse-quantized representation for near-lossless LLM weight compression. arXiv preprint arXiv:2306.03078 (2023).","journal-title":"arXiv preprint arXiv:2306.03078"},{"key":"e_1_3_2_582_2","article-title":"Outlier suppression+: Accurate quantization of large language models by equivalent and optimal shifting and scaling","author":"Wei Xiuying","year":"2023","unstructured":"Xiuying Wei, Yunchen Zhang, Yuhang Li, Xiangguo Zhang, Ruihao Gong, Jinyang Guo, and Xianglong Liu. 2023. Outlier suppression+: Accurate quantization of large language models by equivalent and optimal shifting and scaling. arXiv preprint arXiv:2304.09145 (2023).","journal-title":"arXiv preprint arXiv:2304.09145"},{"key":"e_1_3_2_583_2","article-title":"OWQ: Lessons learned from activation outliers for weight quantization in large language models","author":"Lee Changhun","year":"2023","unstructured":"Changhun Lee, Jungyu Jin, Taesu Kim, Hyungjun Kim, and Eunhyeok Park. 2023. OWQ: Lessons learned from activation outliers for weight quantization in large language models. 
arXiv preprint arXiv:2306.02272 (2023).","journal-title":"arXiv preprint arXiv:2306.02272"},{"key":"e_1_3_2_584_2","article-title":"QuIP: 2-bit quantization of large language models with guarantees","volume":"36","author":"Chee Jerry","year":"2024","unstructured":"Jerry Chee, Yaohui Cai, Volodymyr Kuleshov, and Christopher M. De Sa. 2024. QuIP: 2-bit quantization of large language models with guarantees. Advances in Neural Information Processing Systems 36 (2024).","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_2_585_2","article-title":"OmniQuant: Omnidirectionally calibrated quantization for large language models","author":"Shao Wenqi","year":"2023","unstructured":"Wenqi Shao, Mengzhao Chen, Zhaoyang Zhang, Peng Xu, Lirui Zhao, Zhiqian Li, Kaipeng Zhang, Peng Gao, Yu Qiao, and Ping Luo. 2023. OmniQuant: Omnidirectionally calibrated quantization for large language models. arXiv preprint arXiv:2308.13137 (2023).","journal-title":"arXiv preprint arXiv:2308.13137"},{"key":"e_1_3_2_586_2","article-title":"Evaluating quantized large language models","author":"Li Shiyao","year":"2024","unstructured":"Shiyao Li, Xuefei Ning, Luning Wang, Tengxuan Liu, Xiangsheng Shi, Shengen Yan, Guohao Dai, Huazhong Yang, and Yu Wang. 2024. Evaluating quantized large language models. arXiv preprint arXiv:2402.18158 (2024).","journal-title":"arXiv preprint arXiv:2402.18158"},{"key":"e_1_3_2_587_2","first-page":"1","volume-title":"Proceedings of the 50th Annual International Symposium on Computer Architecture","author":"Guo Cong","year":"2023","unstructured":"Cong Guo, Jiaming Tang, Weiming Hu, Jingwen Leng, Chen Zhang, Fan Yang, Yunxin Liu, Minyi Guo, and Yuhao Zhu. 2023. OliVe: Accelerating large language models via hardware-friendly outlier-victim pair quantization. In Proceedings of the 50th Annual International Symposium on Computer Architecture. 
1\u201315."},{"key":"e_1_3_2_588_2","article-title":"Instruction tuning with GPT-4","author":"Peng Baolin","year":"2023","unstructured":"Baolin Peng, Chunyuan Li, Pengcheng He, Michel Galley, and Jianfeng Gao. 2023. Instruction tuning with GPT-4. arXiv preprint arXiv:2304.03277 (2023).","journal-title":"arXiv preprint arXiv:2304.03277"},{"key":"e_1_3_2_589_2","article-title":"LaMini-LM: A diverse herd of distilled models from large-scale instructions","author":"Wu Minghao","year":"2023","unstructured":"Minghao Wu, Abdul Waheed, Chiyu Zhang, Muhammad Abdul-Mageed, and Alham Fikri Aji. 2023. LaMini-LM: A diverse herd of distilled models from large-scale instructions. arXiv preprint arXiv:2304.14402 (2023).","journal-title":"arXiv preprint arXiv:2304.14402"},{"key":"e_1_3_2_590_2","article-title":"Lion: Adversarial distillation of closed-source large language model","author":"Jiang Yuxin","year":"2023","unstructured":"Yuxin Jiang, Chunkit Chan, Mingyang Chen, and Wei Wang. 2023. Lion: Adversarial distillation of closed-source large language model. arXiv preprint arXiv:2305.12870 (2023).","journal-title":"arXiv preprint arXiv:2305.12870"},{"key":"e_1_3_2_591_2","first-page":"20852","volume-title":"International Conference on Machine Learning","author":"Liang Chen","year":"2023","unstructured":"Chen Liang, Simiao Zuo, Qingru Zhang, Pengcheng He, Weizhu Chen, and Tuo Zhao. 2023. Less is more: Task-aware layer-wise distillation for language model compression. In International Conference on Machine Learning. PMLR, 20852\u201320867."},{"key":"e_1_3_2_592_2","volume-title":"The 12th International Conference on Learning Representations","author":"Agarwal Rishabh","year":"2024","unstructured":"Rishabh Agarwal, Nino Vieillard, Yongchao Zhou, Piotr Stanczyk, Sabela Ramos Garea, Matthieu Geist, and Olivier Bachem. 2024. On-policy distillation of language models: Learning from self-generated mistakes. 
In The 12th International Conference on Learning Representations."},{"key":"e_1_3_2_593_2","article-title":"Token-scaled logit distillation for ternary weight generative language models","volume":"36","author":"Kim Minsoo","year":"2024","unstructured":"Minsoo Kim, Sihwa Lee, Janghwan Lee, Sukjin Hong, Du-Seong Chang, Wonyong Sung, and Jungwook Choi. 2024. Token-scaled logit distillation for ternary weight generative language models. Advances in Neural Information Processing Systems 36 (2024).","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_2_594_2","first-page":"1","volume-title":"SC22: International Conference for High Performance Computing, Networking, Storage and Analysis","author":"Aminabadi Reza Yazdani","year":"2022","unstructured":"Reza Yazdani Aminabadi, Samyam Rajbhandari, Ammar Ahmad Awan, Cheng Li, Du Li, Elton Zheng, Olatunji Ruwase, Shaden Smith, Minjia Zhang, Jeff Rasley, et\u00a0al. 2022. DeepSpeed-inference: Enabling efficient inference of transformer models at unprecedented scale. In SC22: International Conference for High Performance Computing, Networking, Storage and Analysis. IEEE, 1\u201315."},{"key":"e_1_3_2_595_2","article-title":"Fast distributed inference serving for large language models","author":"Wu Bingyang","year":"2023","unstructured":"Bingyang Wu, Yinmin Zhong, Zili Zhang, Gang Huang, Xuanzhe Liu, and Xin Jin. 2023. Fast distributed inference serving for large language models. arXiv preprint arXiv:2305.05920 (2023).","journal-title":"arXiv preprint arXiv:2305.05920"},{"key":"e_1_3_2_596_2","article-title":"\\(S^3\\) : Increasing GPU utilization during generative inference for higher throughput","volume":"36","author":"Jin Yunho","year":"2024","unstructured":"Yunho Jin, Chun-Feng Wu, David Brooks, and Gu-Yeon Wei. 2024. \\(S^3\\) : Increasing GPU utilization during generative inference for higher throughput. 
Advances in Neural Information Processing Systems 36 (2024).","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_2_597_2","article-title":"Splitwise: Efficient generative LLM inference using phase splitting","author":"Patel Pratyush","year":"2023","unstructured":"Pratyush Patel, Esha Choukse, Chaojie Zhang, \u00cd\u00f1igo Goiri, Aashaka Shah, Saeed Maleki, and Ricardo Bianchini. 2023. Splitwise: Efficient generative LLM inference using phase splitting. arXiv preprint arXiv:2311.18677 (2023).","journal-title":"arXiv preprint arXiv:2311.18677"},{"key":"e_1_3_2_598_2","article-title":"DistServe: Disaggregating prefill and decoding for goodput-optimized large language model serving","author":"Zhong Yinmin","year":"2024","unstructured":"Yinmin Zhong, Shengyu Liu, Junda Chen, Jianbo Hu, Yibo Zhu, Xuanzhe Liu, Xin Jin, and Hao Zhang. 2024. DistServe: Disaggregating prefill and decoding for goodput-optimized large language model serving. arXiv preprint arXiv:2401.09670 (2024).","journal-title":"arXiv preprint arXiv:2401.09670"},{"key":"e_1_3_2_599_2","first-page":"42","volume-title":"Proceedings of the 29th ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming","author":"Du Jiangsu","year":"2024","unstructured":"Jiangsu Du, Jinhui Wei, Jiazhi Jiang, Shenggan Cheng, Dan Huang, Zhiguang Chen, and Yutong Lu. 2024. Liger: Interleaving intra-and inter-operator parallelism for distributed large model inference. In Proceedings of the 29th ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming. 42\u201354."},{"key":"e_1_3_2_600_2","first-page":"597","volume-title":"International Conference on Algorithmic Learning Theory","author":"Keles Feyza Duman","year":"2023","unstructured":"Feyza Duman Keles, Pruthuvi Mahesakya Wijewardena, and Chinmay Hegde. 2023. On the computational complexity of self-attention. In International Conference on Algorithmic Learning Theory. 
PMLR, 597\u2013619."},{"key":"e_1_3_2_601_2","article-title":"LoRA: Low-rank adaptation of large language models","author":"Hu Edward J.","year":"2021","unstructured":"Edward J. Hu, Yelong Shen, Phillip Wallis, Zeyuan Allen-Zhu, Yuanzhi Li, Shean Wang, Lu Wang, and Weizhu Chen. 2021. LoRA: Low-rank adaptation of large language models. arXiv preprint arXiv:2106.09685 (2021).","journal-title":"arXiv preprint arXiv:2106.09685"},{"key":"e_1_3_2_602_2","article-title":"RWKV: Reinventing RNNs for the transformer era","author":"Peng Bo","year":"2023","unstructured":"Bo Peng, Eric Alcaide, Quentin Anthony, Alon Albalak, Samuel Arcadinho, Huanqi Cao, Xin Cheng, Michael Chung, Matteo Grella, Kranthi Kiran GV, et\u00a0al. 2023. RWKV: Reinventing RNNs for the transformer era. arXiv preprint arXiv:2305.13048 (2023).","journal-title":"arXiv preprint arXiv:2305.13048"},{"key":"e_1_3_2_603_2","article-title":"Mamba: Linear-time sequence modeling with selective state spaces","author":"Gu Albert","year":"2023","unstructured":"Albert Gu and Tri Dao. 2023. Mamba: Linear-time sequence modeling with selective state spaces. arXiv preprint arXiv:2312.00752 (2023).","journal-title":"arXiv preprint arXiv:2312.00752"},{"key":"e_1_3_2_604_2","article-title":"Retentive network: A successor to transformer for large language models","author":"Sun Yutao","year":"2023","unstructured":"Yutao Sun, Li Dong, Shaohan Huang, Shuming Ma, Yuqing Xia, Jilong Xue, Jianyong Wang, and Furu Wei. 2023. Retentive network: A successor to transformer for large language models. arXiv preprint arXiv:2307.08621 (2023).","journal-title":"arXiv preprint arXiv:2307.08621"},{"key":"e_1_3_2_605_2","article-title":"Fully neural network based speech recognition on mobile and embedded devices","volume":"31","author":"Park Jinhwan","year":"2018","unstructured":"Jinhwan Park, Yoonho Boo, Iksoo Choi, Sungho Shin, and Wonyong Sung. 2018. Fully neural network based speech recognition on mobile and embedded devices. 
Advances in Neural Information Processing Systems 31 (2018).","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_2_606_2","doi-asserted-by":"crossref","first-page":"103058","DOI":"10.1016\/j.micpro.2020.103058","article-title":"Real time speech recognition algorithm on embedded system based on continuous Markov model","volume":"75","author":"He Yongqiang","year":"2020","unstructured":"Yongqiang He and Xiguang Dong. 2020. Real time speech recognition algorithm on embedded system based on continuous Markov model. Microprocessors and Microsystems 75 (2020), 103058.","journal-title":"Microprocessors and Microsystems"},{"issue":"2","key":"e_1_3_2_607_2","first-page":"392","article-title":"DAC-SDC low power object detection challenge for UAV applications","volume":"43","author":"Xu Xiaowei","year":"2019","unstructured":"Xiaowei Xu, Xinyi Zhang, Bei Yu, Xiaobo Sharon Hu, Christopher Rowen, Jingtong Hu, and Yiyu Shi. 2019. DAC-SDC low power object detection challenge for UAV applications. IEEE Transactions on Pattern Analysis and Machine Intelligence 43, 2 (2019), 392\u2013403.","journal-title":"IEEE Transactions on Pattern Analysis and Machine Intelligence"},{"key":"e_1_3_2_608_2","first-page":"216","article-title":"SkyNet: A hardware-efficient method for object detection and tracking on embedded systems","volume":"2","author":"Zhang Xiaofan","year":"2020","unstructured":"Xiaofan Zhang, Haoming Lu, Cong Hao, Jiachen Li, Bowen Cheng, Yuhong Li, Kyle Rupnow, Jinjun Xiong, Thomas Huang, Honghui Shi, et\u00a0al. 2020. SkyNet: A hardware-efficient method for object detection and tracking on embedded systems. 
Proceedings of Machine Learning and Systems 2 (2020), 216\u2013229.","journal-title":"Proceedings of Machine Learning and Systems"},{"key":"e_1_3_2_609_2","first-page":"1","volume-title":"2020 57th ACM\/IEEE Design Automation Conference (DAC\u201920)","author":"Baidya Sabur","year":"2020","unstructured":"Sabur Baidya, Yu-Jen Ku, Hengyu Zhao, Jishen Zhao, and Sujit Dey. 2020. Vehicular and edge computing for emerging connected and autonomous vehicle applications. In 2020 57th ACM\/IEEE Design Automation Conference (DAC\u201920). IEEE, 1\u20136."},{"key":"e_1_3_2_610_2","doi-asserted-by":"crossref","first-page":"283","DOI":"10.1145\/3489517.3530444","volume-title":"Proceedings of the 59th ACM\/IEEE Design Automation Conference","author":"Zeng Xiaoming","year":"2022","unstructured":"Xiaoming Zeng, Zhendong Wang, and Yang Hu. 2022. Enabling efficient deep convolutional neural network-based sensor fusion for autonomous driving. In Proceedings of the 59th ACM\/IEEE Design Automation Conference. 283\u2013288."},{"key":"e_1_3_2_611_2","first-page":"675","volume-title":"Proceedings of the 22nd ACM International Conference on Multimedia","author":"Jia Yangqing","year":"2014","unstructured":"Yangqing Jia, Evan Shelhamer, Jeff Donahue, Sergey Karayev, Jonathan Long, Ross Girshick, Sergio Guadarrama, and Trevor Darrell. 2014. Caffe: Convolutional architecture for fast feature embedding. In Proceedings of the 22nd ACM International Conference on Multimedia. 675\u2013678."},{"key":"e_1_3_2_612_2","article-title":"MXNet: A flexible and efficient machine learning library for heterogeneous distributed systems","author":"Chen Tianqi","year":"2015","unstructured":"Tianqi Chen, Mu Li, Yutian Li, Min Lin, Naiyan Wang, Minjie Wang, Tianjun Xiao, Bing Xu, Chiyuan Zhang, and Zheng Zhang. 2015. MXNet: A flexible and efficient machine learning library for heterogeneous distributed systems. 
arXiv preprint arXiv:1512.01274 (2015).","journal-title":"arXiv preprint arXiv:1512.01274"},{"key":"e_1_3_2_613_2","doi-asserted-by":"crossref","first-page":"97","DOI":"10.1007\/978-1-4842-2766-4_7","article-title":"Introduction to Keras","author":"Ketkar Nikhil","year":"2017","unstructured":"Nikhil Ketkar. 2017. Introduction to Keras. Deep Learning with Python: A Hands-on Introduction (2017), 97\u2013111.","journal-title":"Deep Learning with Python: A Hands-on Introduction"},{"key":"e_1_3_2_614_2","doi-asserted-by":"crossref","DOI":"10.1007\/978-1-4842-4297-1","volume-title":"Beginning Machine Learning in iOS","author":"Thakkar Mohit","year":"2019","unstructured":"Mohit Thakkar. 2019. Introduction to core ML framework. In Beginning Machine Learning in iOS."},{"issue":"1","key":"e_1_3_2_615_2","first-page":"105","article-title":"PaddlePaddle: An open-source deep learning platform from industrial practice","volume":"1","author":"Ma Yanjun","year":"2019","unstructured":"Yanjun Ma, Dianhai Yu, Tian Wu, and Haifeng Wang. 2019. PaddlePaddle: An open-source deep learning platform from industrial practice. Frontiers of Data and Computing 1, 1 (2019), 105\u2013115.","journal-title":"Frontiers of Data and Computing"},{"key":"e_1_3_2_616_2","first-page":"50","volume-title":"Proceedings of the ACM Symposium on Cloud Computing (SoCC\u201919)","author":"Dai Jason Jinquan","year":"2019","unstructured":"Jason Jinquan Dai, Yiheng Wang, Xin Qiu, Ding Ding, Yao Zhang, Yanzhang Wang, Xianyan Jia, Cherry Li Zhang, Yan Wan, Zhichao Li, Jiao Wang, Shengsheng Huang, Zhongyuan Wu, Yang Wang, Yuhao Yang, Bowen She, Dongjie Shi, Qi Lu, Kai Huang, and Guoqiong Song. 2019. BigDL: A distributed deep learning framework for big data. In Proceedings of the ACM Symposium on Cloud Computing (SoCC\u201919). 50\u201360. DOI:10.1145\/3357223.3362707"},{"key":"e_1_3_2_617_2","unstructured":"Google. Google Coral Dev Board. Retrieved from https:\/\/coral.ai\/products\/dev-board\/ ([n. 
d.])."},{"key":"e_1_3_2_618_2","unstructured":"Huawei. Huawei HiKey 970. Retrieved from https:\/\/www.96boards.org\/product\/hikey970 ([n. d.])."},{"key":"e_1_3_2_619_2","unstructured":"Shenzhen Xunlong Software Co., Ltd. Orange Pi AI Stick Lite. Retrieved from http:\/\/www.orangepi.org\/html\/hardWare\/computerAndMicrocontrollers\/details\/Orange-Pi-AI-Stick-Lite.html ([n. d.])."},{"key":"e_1_3_2_620_2","first-page":"1","volume-title":"Proceedings of the 59th ACM\/IEEE Design Automation Conference","author":"Wang Hanrui","year":"2022","unstructured":"Hanrui Wang, Jiaqi Gu, Yongshan Ding, Zirui Li, Frederic T. Chong, David Z. Pan, and Song Han. 2022. QuantumNAT: Quantum noise-aware training with noise injection, quantization and normalization. In Proceedings of the 59th ACM\/IEEE Design Automation Conference. 1\u20136."},{"key":"e_1_3_2_621_2","article-title":"QuEst: Graph transformer for quantum circuit reliability estimation","author":"Wang Hanrui","year":"2022","unstructured":"Hanrui Wang, Pengyu Liu, Jinglei Cheng, Zhiding Liang, Jiaqi Gu, Zirui Li, Yongshan Ding, Weiwen Jiang, Yiyu Shi, Xuehai Qian, et\u00a0al. 2022. QuEst: Graph transformer for quantum circuit reliability estimation. arXiv preprint arXiv:2210.16724 (2022).","journal-title":"arXiv preprint arXiv:2210.16724"},{"key":"e_1_3_2_622_2","doi-asserted-by":"crossref","first-page":"655","DOI":"10.1145\/3489517.3530495","volume-title":"Proceedings of the 59th ACM\/IEEE Design Automation Conference","author":"Wang Hanrui","year":"2022","unstructured":"Hanrui Wang, Zirui Li, Jiaqi Gu, Yongshan Ding, David Z. Pan, and Song Han. 2022. QOC: Quantum on-chip training with parameter shift and gradient pruning. In Proceedings of the 59th ACM\/IEEE Design Automation Conference. 
655\u2013660."},{"key":"e_1_3_2_623_2","article-title":"The de-democratization of AI: Deep learning and the compute divide in artificial intelligence research","author":"Ahmed Nur","year":"2020","unstructured":"Nur Ahmed and Muntasir Wahed. 2020. The de-democratization of AI: Deep learning and the compute divide in artificial intelligence research. arXiv preprint arXiv:2010.15581 (2020).","journal-title":"arXiv preprint arXiv:2010.15581"},{"key":"e_1_3_2_624_2","first-page":"11976","volume-title":"Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition","author":"Liu Zhuang","year":"2022","unstructured":"Zhuang Liu, Hanzi Mao, Chao-Yuan Wu, Christoph Feichtenhofer, Trevor Darrell, and Saining Xie. 2022. A ConvNet for the 2020s. In Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition. 11976\u201311986."},{"key":"e_1_3_2_625_2","article-title":"ConvNeXt V2: Co-designing and scaling ConvNets with masked autoencoders","author":"Woo Sanghyun","year":"2023","unstructured":"Sanghyun Woo, Shoubhik Debnath, Ronghang Hu, Xinlei Chen, Zhuang Liu, In So Kweon, and Saining Xie. 2023. ConvNeXt V2: Co-designing and scaling ConvNets with masked autoencoders. arXiv preprint arXiv:2301.00808 (2023).","journal-title":"arXiv preprint arXiv:2301.00808"},{"key":"e_1_3_2_626_2","article-title":"More ConvNets in the 2020s: Scaling up kernels beyond 51x51 using sparsity","author":"Liu Shiwei","year":"2022","unstructured":"Shiwei Liu, Tianlong Chen, Xiaohan Chen, Xuxi Chen, Qiao Xiao, Boqian Wu, Mykola Pechenizkiy, Decebal Mocanu, and Zhangyang Wang. 2022. More ConvNets in the 2020s: Scaling up kernels beyond 51x51 using sparsity. 
arXiv preprint arXiv:2207.03620 (2022).","journal-title":"arXiv preprint arXiv:2207.03620"},{"issue":"12","key":"e_1_3_2_627_2","doi-asserted-by":"crossref","first-page":"2295","DOI":"10.1109\/JPROC.2017.2761740","article-title":"Efficient processing of deep neural networks: A tutorial and survey","volume":"105","author":"Sze Vivienne","year":"2017","unstructured":"Vivienne Sze, Yu-Hsin Chen, Tien-Ju Yang, and Joel S. Emer. 2017. Efficient processing of deep neural networks: A tutorial and survey. Proc. IEEE 105, 12 (2017), 2295\u20132329.","journal-title":"Proc. IEEE"},{"key":"e_1_3_2_628_2","first-page":"0","volume-title":"Proceedings of the IEEE\/CVF International Conference on Computer Vision Workshops","author":"Cai Han","year":"2019","unstructured":"Han Cai, Tianzhe Wang, Zhanghao Wu, Kuan Wang, Ji Lin, and Song Han. 2019. On-device image classification with proxyless neural architecture search and quantization-aware fine-tuning. In Proceedings of the IEEE\/CVF International Conference on Computer Vision Workshops. 0\u20130."},{"key":"e_1_3_2_629_2","first-page":"1","volume-title":"2023 IEEE International Conference on Consumer Electronics (ICCE\u201923)","author":"Boragule Abhijeet","year":"2023","unstructured":"Abhijeet Boragule, Kin Choong Yow, and Moongu Jeon. 2023. On-device face authentication system for ATMs and privacy preservation. In 2023 IEEE International Conference on Consumer Electronics (ICCE\u201923). IEEE, 1\u20134."},{"key":"e_1_3_2_630_2","article-title":"On-device real-time hand gesture recognition","author":"Sung George","year":"2021","unstructured":"George Sung, Kanstantsin Sokal, Esha Uboweja, Valentin Bazarevsky, Jonathan Baccash, Eduard Gabriel Bazavan, Chuo-Ling Chang, and Matthias Grundmann. 2021. On-device real-time hand gesture recognition. 
arXiv preprint arXiv:2111.00038 (2021).","journal-title":"arXiv preprint arXiv:2111.00038"},{"key":"e_1_3_2_631_2","first-page":"2245","volume-title":"Proceedings of the AAAI Conference on Artificial Intelligence","volume":"37","author":"Shi Xiangsheng","year":"2023","unstructured":"Xiangsheng Shi, Xuefei Ning, Lidong Guo, Tianchen Zhao, Enshu Liu, Yi Cai, Yuhan Dong, Huazhong Yang, and Yu Wang. 2023. Memory-oriented structural pruning for efficient image restoration. In Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 37. 2245\u20132253."},{"key":"e_1_3_2_632_2","article-title":"Blazepose ghum holistic: Real-time 3D human landmarks and pose estimation","author":"Grishchenko Ivan","year":"2022","unstructured":"Ivan Grishchenko, Valentin Bazarevsky, Andrei Zanfir, Eduard Gabriel Bazavan, Mihai Zanfir, Richard Yee, Karthik Raveendran, Matsvei Zhdanovich, Matthias Grundmann, and Cristian Sminchisescu. 2022. Blazepose ghum holistic: Real-time 3D human landmarks and pose estimation. arXiv preprint arXiv:2206.11678 (2022).","journal-title":"arXiv preprint arXiv:2206.11678"},{"key":"e_1_3_2_633_2","first-page":"4894","volume-title":"Proceedings of the IEEE International Conference on Computer Vision","author":"Yao Ting","year":"2017","unstructured":"Ting Yao, Yingwei Pan, Yehao Li, Zhaofan Qiu, and Tao Mei. 2017. Boosting image captioning with attributes. In Proceedings of the IEEE International Conference on Computer Vision. 4894\u20134902."},{"key":"e_1_3_2_634_2","first-page":"4651","volume-title":"Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition","author":"You Quanzeng","year":"2016","unstructured":"Quanzeng You, Hailin Jin, Zhaowen Wang, Chen Fang, and Jiebo Luo. 2016. Image captioning with semantic attention. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 
4651\u20134659."},{"key":"e_1_3_2_635_2","first-page":"1","volume-title":"Proceedings of the 50th Annual International Symposium on Computer Architecture","author":"Fu Yonggan","year":"2023","unstructured":"Yonggan Fu, Zhifan Ye, Jiayi Yuan, Shunyao Zhang, Sixu Li, Haoran You, and Yingyan Lin. 2023. Gen-NeRF: Efficient and generalizable neural radiance fields via algorithm-hardware co-design. In Proceedings of the 50th Annual International Symposium on Computer Architecture. 1\u201312."},{"key":"e_1_3_2_636_2","first-page":"1207","volume-title":"Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition","author":"Tang Yansong","year":"2019","unstructured":"Yansong Tang, Dajun Ding, Yongming Rao, Yu Zheng, Danyang Zhang, Lili Zhao, Jiwen Lu, and Jie Zhou. 2019. COIN: A large-scale dataset for comprehensive instructional video analysis. In Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition. 1207\u20131216."},{"key":"e_1_3_2_637_2","doi-asserted-by":"crossref","first-page":"187","DOI":"10.1146\/annurev-control-060117-105157","article-title":"Planning and decision-making for autonomous vehicles","volume":"1","author":"Schwarting Wilko","year":"2018","unstructured":"Wilko Schwarting, Javier Alonso-Mora, and Daniela Rus. 2018. Planning and decision-making for autonomous vehicles. Annual Review of Control, Robotics, and Autonomous Systems 1 (2018), 187\u2013210.","journal-title":"Annual Review of Control, Robotics, and Autonomous Systems"},{"key":"e_1_3_2_638_2","doi-asserted-by":"crossref","first-page":"103116","DOI":"10.1016\/j.jvcir.2021.103116","article-title":"A review of video surveillance systems","volume":"77","author":"Elharrouss Omar","year":"2021","unstructured":"Omar Elharrouss, Noor Almaadeed, and Somaya Al-Maadeed. 2021. A review of video surveillance systems. 
Journal of Visual Communication and Image Representation 77 (2021), 103116.","journal-title":"Journal of Visual Communication and Image Representation"},{"issue":"11","key":"e_1_3_2_639_2","doi-asserted-by":"crossref","first-page":"922","DOI":"10.1038\/s42256-022-00549-6","article-title":"Development of metaverse for intelligent healthcare","volume":"4","author":"Wang Ge","year":"2022","unstructured":"Ge Wang, Andreu Badal, Xun Jia, Jonathan S. Maltz, Klaus Mueller, Kyle J. Myers, Chuang Niu, Michael Vannier, Pingkun Yan, Zhou Yu, et\u00a0al. 2022. Development of metaverse for intelligent healthcare. Nature Machine Intelligence 4, 11 (2022), 922\u2013929.","journal-title":"Nature Machine Intelligence"},{"key":"e_1_3_2_640_2","first-page":"1","volume-title":"Proceedings of the 2nd ACM\/IEEE Symposium on Edge Computing","author":"Drolia Utsav","year":"2017","unstructured":"Utsav Drolia, Katherine Guo, and Priya Narasimhan. 2017. Precog: Prefetching for image recognition applications at the edge. In Proceedings of the 2nd ACM\/IEEE Symposium on Edge Computing. 1\u201313."},{"key":"e_1_3_2_641_2","article-title":"YOLOv3: An incremental improvement","author":"Redmon Joseph","year":"2018","unstructured":"Joseph Redmon and Ali Farhadi. 2018. YOLOv3: An incremental improvement. arXiv preprint arXiv:1804.02767 (2018).","journal-title":"arXiv preprint arXiv:1804.02767"},{"key":"e_1_3_2_642_2","doi-asserted-by":"crossref","first-page":"7","DOI":"10.1145\/3384613.3384619","volume-title":"Proceedings of the 2020 12th International Conference on Computer and Automation Engineering","author":"Anwar Muhammad Waseem","year":"2020","unstructured":"Muhammad Waseem Anwar, Imran Ahsan, Farooque Azam, Wasi Haider Butt, and Muhammad Rashid. 2020. A natural language processing (NLP) framework for embedded systems to automatically extract verification aspects from textual design requirements. In Proceedings of the 2020 12th International Conference on Computer and Automation Engineering. 
7\u201312."},{"key":"e_1_3_2_643_2","doi-asserted-by":"crossref","first-page":"360","DOI":"10.1007\/978-3-030-80091-8_42","volume-title":"Advances in Usability, User Experience, Wearable and Assistive Technology: Proceedings of the AHFE 2021 Virtual Conferences on Usability and User Experience, Human Factors and Wearable Technologies, Human Factors in Virtual Environments and Game Design, and Human Factors and Assistive Technology, July 25\u201329, 2021, USA","author":"Zhou Jin","year":"2021","unstructured":"Jin Zhou and Meiyu Zhou. 2021. Sentiment analysis of elderly wearable device users based on text mining. In Advances in Usability, User Experience, Wearable and Assistive Technology: Proceedings of the AHFE 2021 Virtual Conferences on Usability and User Experience, Human Factors and Wearable Technologies, Human Factors in Virtual Environments and Game Design, and Human Factors and Assistive Technology, July 25\u201329, 2021, USA. Springer, 360\u2013365."},{"issue":"07","key":"e_1_3_2_644_2","article-title":"Mental health monitoring using sentiment analysis","volume":"7","author":"Shah Aagam","year":"2020","unstructured":"Aagam Shah, Rohan Shah, Praneeta Desai, and Chirag Desai. 2020. Mental health monitoring using sentiment analysis. International Research Journal of Engineering and Technology (IRJET) 7, 07 (2020), 2395\u20130056.","journal-title":"International Research Journal of Engineering and Technology (IRJET)"},{"key":"e_1_3_2_645_2","first-page":"1","volume-title":"2020 57th ACM\/IEEE Design Automation Conference (DAC\u201920)","author":"Dong Peiyan","year":"2020","unstructured":"Peiyan Dong, Siyue Wang, Wei Niu, Chengming Zhang, Sheng Lin, Zhengang Li, Yifan Gong, Bin Ren, Xue Lin, and Dingwen Tao. 2020. RTMobile: Beyond real-time mobile acceleration of RNNs for speech recognition. In 2020 57th ACM\/IEEE Design Automation Conference (DAC\u201920). 
IEEE, 1\u20136."},{"issue":"3","key":"e_1_3_2_646_2","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/3510028","article-title":"Reduced memory Viterbi decoding for hardware-accelerated speech recognition","volume":"21","author":"Raj Pani Prithvi","year":"2022","unstructured":"Pani Prithvi Raj, Pakala Akhil Reddy, and Nitin Chandrachoodan. 2022. Reduced memory Viterbi decoding for hardware-accelerated speech recognition. ACM Transactions on Embedded Computing Systems (TECS) 21, 3 (2022), 1\u201318.","journal-title":"ACM Transactions on Embedded Computing Systems (TECS)"},{"key":"e_1_3_2_647_2","first-page":"1557","volume-title":"Proceedings of the 2019 on Designing Interactive Systems Conference","author":"Cho Minji","year":"2019","unstructured":"Minji Cho, Sang-su Lee, and Kun-Pyo Lee. 2019. Once a kind friend is now a thing: Understanding how conversational agents at home are forgotten. In Proceedings of the 2019 on Designing Interactive Systems Conference. 1557\u20131569."},{"key":"e_1_3_2_648_2","unstructured":"Apple Incorporation. 2010. Siri. (2010). Retrieved from https:\/\/www.apple.com\/siri\/"},{"key":"e_1_3_2_649_2","article-title":"Fairseq S2T: Fast speech-to-text modeling with fairseq","author":"Wang Changhan","year":"2020","unstructured":"Changhan Wang, Yun Tang, Xutai Ma, Anne Wu, Sravya Popuri, Dmytro Okhonko, and Juan Pino. 2020. Fairseq S2T: Fast speech-to-text modeling with fairseq. arXiv preprint arXiv:2010.05171 (2020).","journal-title":"arXiv preprint arXiv:2010.05171"},{"key":"e_1_3_2_650_2","first-page":"1","volume-title":"The 25th Annual International Conference on Mobile Computing and Networking","author":"Hou Jiahui","year":"2019","unstructured":"Jiahui Hou, Xiang-Yang Li, Peide Zhu, Zefan Wang, Yu Wang, Jianwei Qian, and Panlong Yang. 2019. Signspeaker: A real-time, high-precision smartwatch-based sign language translator. In The 25th Annual International Conference on Mobile Computing and Networking. 
1\u201315."},{"key":"e_1_3_2_651_2","doi-asserted-by":"crossref","first-page":"3787","DOI":"10.18653\/v1\/2020.acl-main.350","volume-title":"Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics","author":"Ren Yi","year":"2020","unstructured":"Yi Ren, Jinglin Liu, Xu Tan, Chen Zhang, Tao Qin, Zhou Zhao, and Tie-Yan Liu. 2020. SimulSpeech: End-to-end simultaneous speech to text translation. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics. 3787\u20133796."},{"key":"e_1_3_2_652_2","doi-asserted-by":"crossref","first-page":"1496","DOI":"10.1109\/WACV.2019.00164","volume-title":"2019 IEEE Winter Conference on Applications of Computer Vision (WACV\u201919)","author":"Chowdhuri Sauhaarda","year":"2019","unstructured":"Sauhaarda Chowdhuri, Tushar Pankaj, and Karl Zipser. 2019. MultiNet: Multi-modal multi-task learning for autonomous driving. In 2019 IEEE Winter Conference on Applications of Computer Vision (WACV\u201919). 
IEEE, 1496\u20131504."}],"container-title":["ACM Transactions on Embedded Computing Systems"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3701728","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3701728","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,19]],"date-time":"2025-06-19T01:57:16Z","timestamp":1750298236000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3701728"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,12,10]]},"references-count":651,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2025,1,31]]}},"alternative-id":["10.1145\/3701728"],"URL":"https:\/\/doi.org\/10.1145\/3701728","relation":{},"ISSN":["1539-9087","1558-3465"],"issn-type":[{"value":"1539-9087","type":"print"},{"value":"1558-3465","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,12,10]]},"assertion":[{"value":"2023-10-30","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2024-10-17","order":2,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2024-12-10","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}