{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,6,17]],"date-time":"2026-06-17T05:11:37Z","timestamp":1781673097612,"version":"3.54.5"},"reference-count":159,"publisher":"Association for Computing Machinery (ACM)","issue":"3","license":[{"start":{"date-parts":[[2024,11,22]],"date-time":"2024-11-22T00:00:00Z","timestamp":1732233600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Comput. Surv."],"published-print":{"date-parts":[[2025,3,31]]},"abstract":"<jats:p>Deep neural networks (DNNs) typically have a single exit point that makes predictions by running the entire stack of neural layers. Since not all inputs require the same amount of computation to reach a confident prediction, recent research has focused on incorporating multiple \u201cexits\u201d into the conventional DNN architecture. Early-exit DNNs are multi-exit neural networks that attach many side branches to the conventional DNN, enabling inference to stop early at intermediate points. This approach offers several advantages, including speeding up the inference process, mitigating the vanishing gradients problems, reducing overfitting and overthinking tendencies. It also supports DNN partitioning across devices and is ideal for multi-tier computation platforms such as edge computing. This article decomposes the early-exit DNN architecture and reviews the recent advances in the field. The study explores its benefits, designs, training strategies, and adaptive inference mechanisms. 
Various design challenges, application scenarios, and future directions are also extensively discussed.<\/jats:p>","DOI":"10.1145\/3698767","type":"journal-article","created":{"date-parts":[[2024,10,7]],"date-time":"2024-10-07T10:18:29Z","timestamp":1728296309000},"page":"1-37","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":54,"title":["Early-Exit Deep Neural Network - A Comprehensive Survey"],"prefix":"10.1145","volume":"57","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-1128-5964","authenticated-orcid":false,"given":"Haseena","family":"Rahmath P","sequence":"first","affiliation":[{"name":"Bennett University, Noida, India"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-2064-5805","authenticated-orcid":false,"given":"Vishal","family":"Srivastava","sequence":"additional","affiliation":[{"name":"Computer Science and Engineering, Motilal Nehru National Institute of Technology, Allahabad, India"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-3471-782X","authenticated-orcid":false,"given":"Kuldeep","family":"Chaurasia","sequence":"additional","affiliation":[{"name":"Bennett University, Noida, India"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-6763-7255","authenticated-orcid":false,"given":"Roberto G.","family":"Pacheco","sequence":"additional","affiliation":[{"name":"Universidade Federal Fluminense, Rio das Ostras, Brazil"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-6921-7756","authenticated-orcid":false,"given":"Rodrigo S.","family":"Couto","sequence":"additional","affiliation":[{"name":"GTA\/PEE-COPPE, Universidade Federal do Rio de Janeiro, Rio de Janeiro, 
Brazil"}],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"320","published-online":{"date-parts":[[2024,11,22]]},"reference":[{"key":"e_1_3_1_2_2","unstructured":"Adrian Galdran Aitor Alvarez-Gila Maria Ines Meyer Cristina L. Saratxaga Teresa Ara\u00fajo Estibaliz Garrote Guilherme Aresta Pedro Costa A. M. Mendon\u00e7a and Aur\u00e9lio Campilho. 2017. Data-driven color augmentation techniques for deep skin image analysis. arXiv preprint arXiv:1703.03702 (2017)."},{"key":"e_1_3_1_3_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.ins.2020.02.041"},{"key":"e_1_3_1_4_2","volume-title":"Proceedings of the NeurIPS 2022 Workshop on Score-Based Methods","author":"Bao Fan","year":"2022","unstructured":"Fan Bao, Chongxuan Li, Yue Cao, and Jun Zhu. 2022. All are worth words: A ViT backbone for score-based diffusion models. In Proceedings of the NeurIPS 2022 Workshop on Score-Based Methods. Neural Information Processing Systems Foundation, New Orleans, LA, USA."},{"key":"e_1_3_1_5_2","doi-asserted-by":"publisher","DOI":"10.1007\/s10462-021-09975-1"},{"key":"e_1_3_1_6_2","unstructured":"Eugene Belilovsky Michael Eickenberg and Edouard Oyallon. 2019. Shallow learning for deep networks. Submitted to the International Conference on Learning Representations (ICLR'19)."},{"key":"e_1_3_1_7_2","first-page":"583","volume-title":"Proceedings of the International Conference on Machine Learning","author":"Belilovsky Eugene","year":"2019","unstructured":"Eugene Belilovsky, Michael Eickenberg, and Edouard Oyallon. 2019. Greedy layerwise learning can scale to imagenet. In Proceedings of the International Conference on Machine Learning. PMLR, Long Beach, CA, USA, 583\u2013593."},{"key":"e_1_3_1_8_2","first-page":"153","volume-title":"Proceedings of the Advances in Neural Information Processing Systems 19","author":"Bengio Yoshua","year":"2006","unstructured":"Yoshua Bengio, Pascal Lamblin, Dan Popovici, and Hugo Larochelle. 2006. Greedy layer-wise training of deep networks. 
In Proceedings of the Advances in Neural Information Processing Systems 19. MIT Press, Vancouver, British Columbia, Canada, 153\u2013160."},{"key":"e_1_3_1_9_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-30484-3_26"},{"key":"e_1_3_1_10_2","first-page":"527","volume-title":"Proceedings of the International Conference on Machine Learning","author":"Bolukbasi Tolga","year":"2017","unstructured":"Tolga Bolukbasi, Joseph Wang, Ofer Dekel, and Venkatesh Saligrama. 2017. Adaptive neural networks for efficient inference. In Proceedings of the International Conference on Machine Learning. PMLR, Sydney, Australia, 527\u2013536."},{"key":"e_1_3_1_11_2","volume-title":"Proceedings of the NIPS 2017 Workshop on Optimization","author":"Brock Andrew","year":"2017","unstructured":"Andrew Brock, Theodore Lim, James Millar Ritchie, and Nicholas J. Weston. 2017. FreezeOut: Accelerate training by progressively freezing layers. In Proceedings of the NIPS 2017 Workshop on Optimization. Online, Long Beach, CA, United States, 5."},{"key":"e_1_3_1_12_2","volume-title":"Proceedings of the Advances in Neural Information Processing Systems","volume":"13","author":"Caruana Rich","year":"2000","unstructured":"Rich Caruana, Steve Lawrence, and C. Giles. 2000. Overfitting in neural nets: Backpropagation, conjugate gradient, and early stopping. In Proceedings of the Advances in Neural Information Processing Systems, T. Leen, T. Dietterich, and V. Tresp (Eds.). Vol. 13. MIT Press, Denver, CO, USA."},{"key":"e_1_3_1_13_2","first-page":"2","volume-title":"Proceedings of the 11th International Workshop on Spoken Language Translation: Evaluation Campaign","author":"Cettolo Mauro","year":"2014","unstructured":"Mauro Cettolo, Jan Niehues, Sebastian St\u00fcker, Luisa Bentivogli, and Marcello Federico. 2014. Report on the 11th IWSLT evaluation campaign. In Proceedings of the 11th International Workshop on Spoken Language Translation: Evaluation Campaign. 
International Workshop on Spoken Language Translation (IWSLT), Lake Tahoe, CA, USA, 2\u201317. IWSLT 2014."},{"key":"e_1_3_1_14_2","doi-asserted-by":"publisher","DOI":"10.1109\/JPROC.2019.2921977"},{"key":"e_1_3_1_15_2","first-page":"1520","volume-title":"Proceedings of the International Conference on Machine Learning","author":"Chen Xinshi","year":"2020","unstructured":"Xinshi Chen, Hanjun Dai, Yu Li, Xin Gao, and Le Song. 2020. Learning to stop while learning to predict. In Proceedings of the International Conference on Machine Learning. PMLR, Vienna, Austria, 1520\u20131530. ICML 2020."},{"key":"e_1_3_1_16_2","unstructured":"Ching-Hao Chiu Hao-Wei Chung Yu-Jen Chen Yiyu Shi and Tsung-Yi Ho. 2023. Fair multi-exit framework for facial attribute classification. arXiv preprint arXiv:2301.02989 (2023)."},{"key":"e_1_3_1_17_2","doi-asserted-by":"publisher","DOI":"10.1145\/3340531.3411973"},{"key":"e_1_3_1_18_2","volume-title":"Early-exit Convolutional Neural Networks","author":"Demir Edanur","year":"2019","unstructured":"Edanur Demir. 2019. Early-exit Convolutional Neural Networks. Master\u2019s thesis. Middle East Technical University."},{"key":"e_1_3_1_19_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICASSP.2013.6639344"},{"key":"e_1_3_1_20_2","doi-asserted-by":"publisher","DOI":"10.1109\/QoMEX.2016.7498955"},{"key":"e_1_3_1_21_2","doi-asserted-by":"publisher","DOI":"10.1145\/3517206.3526270"},{"key":"e_1_3_1_22_2","first-page":"1","volume-title":"Proceedings of the ICLR 2020-8th International Conference on Learning Representations","author":"Elbayad Maha","year":"2020","unstructured":"Maha Elbayad, Jiatao Gu, Edouard Grave, and Michael Auli. 2020. Depth-adaptive Transformer. In Proceedings of the ICLR 2020-8th International Conference on Learning Representations. 
OpenReview.net, Addis Ababa, Ethiopia, 1\u201314."},{"key":"e_1_3_1_23_2","doi-asserted-by":"publisher","DOI":"10.1109\/SEC50012.2020.00014"},{"key":"e_1_3_1_24_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2017.194"},{"key":"e_1_3_1_25_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.neucom.2023.126690"},{"key":"e_1_3_1_26_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2021.emnlp-main.583"},{"key":"e_1_3_1_27_2","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v34i05.6282"},{"key":"e_1_3_1_28_2","doi-asserted-by":"publisher","DOI":"10.1145\/3587135.3592204"},{"key":"e_1_3_1_29_2","doi-asserted-by":"publisher","DOI":"10.1109\/LATINCOM59467.2023.10361853"},{"key":"e_1_3_1_30_2","first-page":"249","volume-title":"Proceedings of the 13th International Conference on Artificial Intelligence and Statistics","author":"Glorot Xavier","year":"2010","unstructured":"Xavier Glorot and Yoshua Bengio. 2010. Understanding the difficulty of training deep feedforward neural networks. In Proceedings of the 13th International Conference on Artificial Intelligence and Statistics. JMLR Workshop and Conference Proceedings, Chia Laguna Resort, Sardinia, Italy, 249\u2013256."},{"key":"e_1_3_1_31_2","doi-asserted-by":"publisher","DOI":"10.1109\/IJCNN55064.2022.9891952"},{"key":"e_1_3_1_32_2","volume-title":"Proceedings of the NIPS 2016 Deep Learning Symposium","author":"Graves Alex","year":"2016","unstructured":"Alex Graves. 2016. Adaptive computation time for recurrent neural networks. In Proceedings of the NIPS 2016 Deep Learning Symposium. 
Neural Information Processing Systems Foundation, Barcelona, Spain, 19."},{"key":"e_1_3_1_33_2","doi-asserted-by":"publisher","DOI":"10.5555\/3304889.3304963"},{"key":"e_1_3_1_34_2","doi-asserted-by":"publisher","DOI":"10.1109\/IC2E.2018.00042"},{"key":"e_1_3_1_35_2","doi-asserted-by":"publisher","DOI":"10.1109\/COMST.2015.2495297"},{"key":"e_1_3_1_36_2","doi-asserted-by":"publisher","DOI":"10.1109\/TNNLS.2023.3328643"},{"key":"e_1_3_1_37_2","volume-title":"Proceedings of the ICLR 2016-4th International Conference on Learning Representations","author":"Han Song","year":"2016","unstructured":"Song Han, Huizi Mao, and William J. Dally. 2016. Deep compression: Compressing deep neural networks with pruning, trained quantization and Huffman coding. In Proceedings of the ICLR 2016-4th International Conference on Learning Representations. International Conference on Learning Representations (ICLR), San Juan, Puerto Rico."},{"key":"e_1_3_1_38_2","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2021.3117837"},{"key":"e_1_3_1_39_2","first-page":"444","volume-title":"Proceedings of the International Conference on Intelligent Systems Design and Applications","author":"Rahmath P. Haseena","year":"2023","unstructured":"P. Haseena Rahmath and Kuldeep Chaurasia. 2023. Adaptive early-exit inference in graph neural networks based hyperspectral image classification. In Proceedings of the International Conference on Intelligent Systems Design and Applications. Springer, 444\u2013453."},{"key":"e_1_3_1_40_2","doi-asserted-by":"publisher","DOI":"10.1080\/01431161.2024.2370501"},{"key":"e_1_3_1_41_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-981-19-7615-5_5"},{"key":"e_1_3_1_42_2","unstructured":"Jianing He Qi Zhang Weiping Ding Duoqian Miao Jun Zhao Liang Hu and Longbing Cao. 2024. DE3-BERT: Distance-enhanced early exiting for BERT based on prototypical networks. 
arXiv preprint arXiv:2402.05948 (2024)."},{"key":"e_1_3_1_43_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2015.123"},{"key":"e_1_3_1_44_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2016.90"},{"key":"e_1_3_1_45_2","doi-asserted-by":"publisher","DOI":"10.1007\/s10710-017-9314-z"},{"key":"e_1_3_1_46_2","doi-asserted-by":"publisher","DOI":"10.3389\/frsc.2021.675889"},{"key":"e_1_3_1_47_2","unstructured":"Chris Hettinger Tanner Christensen Ben Ehlert Jeffrey Humpherys Tyler J. Jarvis and Sean Wade. 2017. Forward thinking: Building and training neural networks one layer at a time. arXiv preprint arXiv:1706.02480 (2017)."},{"key":"e_1_3_1_48_2","doi-asserted-by":"crossref","unstructured":"Geoffrey Hinton Li Deng Dong Yu George E. Dahl Abdel-rahman Mohamed Navdeep Jaitly Andrew Senior Vincent Vanhoucke Patrick Nguyen Tara N. Sainath and Brian Kingsbury. 2012. Deep neural networks for acoustic modeling in speech recognition: The shared views of four research groups. IEEE Signal Processing Magazine 29 6 (2012) 82--97.","DOI":"10.1109\/MSP.2012.2205597"},{"key":"e_1_3_1_49_2","unstructured":"Geoffrey E. Hinton Oriol Vinyals and Jeffrey Dean. 2015. Distilling the knowledge in a neural network. arXiv preprint arXiv:1503.02531 (2015)."},{"key":"e_1_3_1_50_2","unstructured":"Andrew G. Howard Menglong Zhu Bo Chen Dmitry Kalenichenko Weijun Wang Tobias Weyand Marco Andreetto and Hartwig Adam. 2017. MobileNets: Efficient convolutional neural networks for mobile vision applications. arXiv preprint arXiv:1704.04861 (2017)."},{"key":"e_1_3_1_51_2","unstructured":"Gao Huang Danlu Chen Tianhong Li Felix Wu Laurens Van Der Maaten and Kilian Q. Weinberger. 2017. Multi-scale dense networks for resource efficient image classification. 
arXiv preprint arXiv:1703.09844 (2017)."},{"key":"e_1_3_1_52_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2017.243"},{"key":"e_1_3_1_53_2","volume-title":"Proceedings of the 2024 IEEE Winter Conference on Applications of Computer Vision, WACV 2024","unstructured":"Fatih Ilhan, Ka-Ho Chow, Sihao Hu, Tiansheng Huang, Selim Tekin, Wenqi Wei, Yanzhao Wu, Myungjin Lee, Ramana Kompella, Hugo Latapie, Gaowen Liu, and Ling Liu. 2024. Adaptive deep neural network inference optimization with EENet. In Proceedings of the 2024 IEEE Winter Conference on Applications of Computer Vision, WACV 2024. IEEE\/CVF, Waikoloa, Hawaii, 10."},{"key":"e_1_3_1_54_2","doi-asserted-by":"publisher","DOI":"10.1145\/3543873.3587370"},{"key":"e_1_3_1_55_2","first-page":"448","volume-title":"Proceedings of the International Conference on Machine Learning","author":"Ioffe Sergey","year":"2015","unstructured":"Sergey Ioffe and Christian Szegedy. 2015. Batch normalization: Accelerating deep network training by reducing internal covariate shift. In Proceedings of the International Conference on Machine Learning. 
PMLR, Lille, France, 448\u2013456."},{"key":"e_1_3_1_56_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2018.00286"},{"key":"e_1_3_1_57_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.adhoc.2019.101913"},{"key":"e_1_3_1_58_2","doi-asserted-by":"publisher","DOI":"10.1145\/3394171.3413701"},{"key":"e_1_3_1_59_2","doi-asserted-by":"publisher","DOI":"10.1109\/TCAD.2023.3282579"},{"key":"e_1_3_1_60_2","doi-asserted-by":"publisher","DOI":"10.1145\/3459637.3482335"},{"key":"e_1_3_1_61_2","doi-asserted-by":"publisher","DOI":"10.1145\/3576915.3623069"},{"key":"e_1_3_1_62_2","first-page":"12032","volume-title":"Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR)","author":"Karpikova Polina","year":"2023","unstructured":"Polina Karpikova, Ekaterina Radionova, Anastasia Yaschenko, Andrei Spiridonov, Leonid Kostyushko, Riccardo Fabbricatore, and Aleksei Ivakhnenko. 2023. FIANCEE: Faster inference of adversarial networks via conditional early exits. In Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR). IEEE, New Orleans, LA, USA, 12032\u201312043."},{"key":"e_1_3_1_63_2","first-page":"3301","volume-title":"Proceedings of the International Conference on Machine Learning","author":"Kaya Yigitcan","year":"2019","unstructured":"Yigitcan Kaya, Sanghyun Hong, and Tudor Dumitras. 2019. Shallow-deep networks: Understanding and mitigating network overthinking. In Proceedings of the International Conference on Machine Learning. PMLR, Long Beach, CA, USA, 3301\u20133310."},{"key":"e_1_3_1_64_2","first-page":"2","volume-title":"Proceedings of NAACL-HLT","volume":"1","author":"Devlin Jacob","year":"2019","unstructured":"Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2019. BERT: Pre-training of deep bidirectional transformers for language understanding. In Proceedings of NAACL-HLT, Vol. 1. 
ACL, Minneapolis, MN, USA, 2."},{"key":"e_1_3_1_65_2","volume-title":"Proceedings of the ICLR 2016-4th International Conference on Learning Representations","author":"Kim Yong-Deok","year":"2016","unstructured":"Yong-Deok Kim, Eunhyeok Park, Sungjoo Yoo, Taelim Choi, Lu Yang, and Dongjun Shin. 2016. Compression of deep convolutional neural networks for fast and low power mobile applications. In Proceedings of the ICLR 2016-4th International Conference on Learning Representations. International Conference on Learning Representations (ICLR), San Juan, Puerto Rico."},{"key":"e_1_3_1_66_2","doi-asserted-by":"publisher","DOI":"10.1145\/3065386"},{"key":"e_1_3_1_67_2","doi-asserted-by":"publisher","DOI":"10.1109\/ACCESS.2022.3187002"},{"key":"e_1_3_1_68_2","volume-title":"Proceedings of the International Conference on Learning Representations","author":"Lan Zhenzhong","year":"2020","unstructured":"Zhenzhong Lan, Mingda Chen, Sebastian Goodman, Kevin Gimpel, Piyush Sharma, and Radu Soricut. 2020. ALBERT: A lite BERT for self-supervised learning of language representations. In Proceedings of the International Conference on Learning Representations. OpenReview.net, 10."},{"key":"e_1_3_1_69_2","unstructured":"Hugo Larochelle Yoshua Bengio J\u00e9r\u00f4me Louradour and Pascal Lamblin. 2009. Exploring strategies for training deep neural networks. 
Journal of Machine Learning Research 10 1 (2009) 1--40."},{"key":"e_1_3_1_70_2","doi-asserted-by":"publisher","DOI":"10.1145\/3469116.3470012"},{"key":"e_1_3_1_71_2","doi-asserted-by":"publisher","DOI":"10.1145\/3372224.3419194"},{"key":"e_1_3_1_72_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.engappai.2023.106035"},{"key":"e_1_3_1_73_2","doi-asserted-by":"publisher","DOI":"10.1109\/5.726791"},{"key":"e_1_3_1_74_2","first-page":"9","volume-title":"Proceedings of the Neural Networks: Tricks of the Trade","author":"LeCun Yann","year":"2002","unstructured":"Yann LeCun, L\u00e9on Bottou, Genevieve B Orr, and Klaus-Robert M\u00fcller. 2002. Efficient backprop. In Proceedings of the Neural Networks: Tricks of the Trade. Springer, New York, NY, USA, 9\u201350."},{"key":"e_1_3_1_75_2","series-title":"Proceedings of Machine Learning Research","first-page":"562","volume-title":"Proceedings of the 18th International Conference on Artificial Intelligence and Statistics","volume":"38","author":"Lee Chen-Yu","year":"2015","unstructured":"Chen-Yu Lee, Saining Xie, Patrick Gallagher, Zhengyou Zhang, and Zhuowen Tu. 2015. Deeply-supervised nets. In Proceedings of the 18th International Conference on Artificial Intelligence and Statistics, Guy Lebanon and S. V. N. Vishwanathan (Eds.). Proceedings of Machine Learning Research, Vol. 38, PMLR, San Diego, CA, USA, 562\u2013570."},{"key":"e_1_3_1_76_2","doi-asserted-by":"publisher","DOI":"10.1145\/3446382.3448359"},{"key":"e_1_3_1_77_2","doi-asserted-by":"crossref","unstructured":"Sam Leroux Steven Bohez Elias De Coninck Tim Verbelen Bert Vankeirsbilck Pieter Simoens and Bart Dhoedt. 2017. The cascading neural network: Building the internet of smart things. 
Knowledge and Information Systems 52 3 (2017) 791--814.","DOI":"10.1007\/s10115-017-1029-1"},{"key":"e_1_3_1_78_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2019.00198"},{"key":"e_1_3_1_79_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR46437.2021.00774"},{"key":"e_1_3_1_80_2","doi-asserted-by":"publisher","DOI":"10.1109\/DAC56929.2023.10247701"},{"key":"e_1_3_1_81_2","doi-asserted-by":"crossref","unstructured":"Shaoshan Liu Liangkai Liu Jie Tang Bo Yu Yifan Wang and Weisong Shi. 2019. Edge computing for autonomous driving: Opportunities and challenges. Proc. IEEE 107 8 (2019) 1697--1716.","DOI":"10.1109\/JPROC.2019.2915983"},{"key":"e_1_3_1_82_2","doi-asserted-by":"publisher","DOI":"10.1109\/TVLSI.2022.3171308"},{"key":"e_1_3_1_83_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-031-02380-4_7"},{"key":"e_1_3_1_84_2","volume-title":"Proceedings of the International Conference on Learning Representations","author":"Liu Hanxiao","year":"2019","unstructured":"Hanxiao Liu, Karen Simonyan, and Yiming Yang. 2019. DARTS: Differentiable architecture search. In Proceedings of the International Conference on Learning Representations. OpenReview, New Orleans, LA, USA, 13."},{"key":"e_1_3_1_85_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2020.acl-main.537"},{"key":"e_1_3_1_86_2","first-page":"1952","volume-title":"Proceedings of the 27th International Conference on Computational Linguistics","author":"Liu Xin","year":"2018","unstructured":"Xin Liu, Qingcai Chen, Chong Deng, Huajun Zeng, Jing Chen, Dongfang Li, and Buzhou Tang. 2018. Lcqmc: A large-scale chinese question matching corpus. In Proceedings of the 27th International Conference on Computational Linguistics. ACL, Santa Fe, NM, USA, 1952\u20131962."},{"key":"e_1_3_1_87_2","doi-asserted-by":"crossref","unstructured":"Xianggen Liu Lili Mou Haotian Cui Zhengdong Lu and Sen Song. 2020. Finding decision jumps in text classification. 
Neurocomputing 371 C (2020) 177--187.","DOI":"10.1016\/j.neucom.2019.08.082"},{"key":"e_1_3_1_88_2","doi-asserted-by":"publisher","DOI":"10.1145\/3587464"},{"key":"e_1_3_1_89_2","unstructured":"Yinhan Liu Myle Ott Naman Goyal Jingfei Du Mandar Joshi Danqi Chen Omer Levy Mike Lewis Luke Zettlemoyer and Veselin Stoyanov. 2019. Roberta: A robustly optimized BERT pretraining approach. arXiv preprint arXiv:1907.11692 (2019)."},{"key":"e_1_3_1_90_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICCD.2017.49"},{"key":"e_1_3_1_91_2","doi-asserted-by":"publisher","DOI":"10.1109\/COMST.2017.2682318"},{"key":"e_1_3_1_92_2","doi-asserted-by":"publisher","DOI":"10.1109\/TNNLS.2018.2805098"},{"key":"e_1_3_1_93_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICPR48806.2021.9412388"},{"key":"e_1_3_1_94_2","doi-asserted-by":"publisher","DOI":"10.1145\/3527155"},{"key":"e_1_3_1_95_2","volume-title":"Proceedings of the ICML 2023 Workshop on Structured Probabilistic Inference and Generative Modeling","author":"Moon T.","year":"2023","unstructured":"T. Moon, M. Choi, E. Yun, J. Yoon, G. Lee, and J. Lee. 2023. Early exiting for accelerated inference in diffusion models. In Proceedings of the ICML 2023 Workshop on Structured Probabilistic Inference and Generative Modeling. PMLR, Honolulu, HI, USA, 14."},{"key":"e_1_3_1_96_2","first-page":"8080","volume-title":"Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition","author":"Mullapudi Ravi Teja","year":"2018","unstructured":"Ravi Teja Mullapudi, William R. Mark, Noam Shazeer, and Kayvon Fatahalian. 2018. Hydranets: Specialized dynamic architectures for efficient inference. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. IEEE, Salt Lake City, UT, USA, 8080\u20138089."},{"key":"e_1_3_1_97_2","volume-title":"Proceedings of the Advances in Neural Information Processing Systems","volume":"30","author":"Nan Feng","year":"2017","unstructured":"Feng Nan and Venkatesh Saligrama. 2017. 
Adaptive classification for prediction under a budget. In Proceedings of the Advances in Neural Information Processing Systems, I. Guyon, U. Von Luxburg, S. Bengio, H. Wallach, R. Fergus, S. Vishwanathan, and R. Garnett (Eds.). Vol. 30, Curran Associates, Inc., Long Beach, CA, USA."},{"key":"e_1_3_1_98_2","doi-asserted-by":"publisher","DOI":"10.1109\/ISQED.2019.8697497"},{"key":"e_1_3_1_99_2","doi-asserted-by":"publisher","DOI":"10.3390\/info12100431"},{"key":"e_1_3_1_100_2","doi-asserted-by":"publisher","DOI":"10.1109\/ISCC50000.2020.9219647"},{"key":"e_1_3_1_101_2","doi-asserted-by":"publisher","DOI":"10.1109\/GLOBECOM46510.2021.9685469"},{"key":"e_1_3_1_102_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICC45041.2023.10279243"},{"key":"e_1_3_1_103_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.procs.2018.07.099"},{"key":"e_1_3_1_104_2","doi-asserted-by":"publisher","DOI":"10.5555\/2971808.2971918"},{"key":"e_1_3_1_105_2","doi-asserted-by":"publisher","DOI":"10.1145\/3007192"},{"key":"e_1_3_1_106_2","doi-asserted-by":"publisher","DOI":"10.1109\/JPROC.2019.2941458"},{"key":"e_1_3_1_107_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2019.00144"},{"key":"e_1_3_1_108_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-01716-3_18"},{"key":"e_1_3_1_109_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/D16-1264"},{"key":"e_1_3_1_110_2","doi-asserted-by":"publisher","DOI":"10.1109\/MCOM.001.2000243"},{"key":"e_1_3_1_111_2","doi-asserted-by":"crossref","unstructured":"Amin Sabet Jonathon Hare Bashir Al-Hashimi and Geoff V. Merrett. 2021. Temporal early exits for efficient video object detection. 
arXiv preprint arXiv:2106.11208 (2021).","DOI":"10.2139\/ssrn.4001359"},{"key":"e_1_3_1_112_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICASSP40776.2020.9054209"},{"key":"e_1_3_1_113_2","doi-asserted-by":"publisher","DOI":"10.1007\/s12559-020-09734-4"},{"key":"e_1_3_1_114_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2020.acl-main.593"},{"key":"e_1_3_1_115_2","article-title":"Hierarchical training of deep neural networks using early exiting","volume":"2024","author":"Sepehri Y.","year":"2024","unstructured":"Y. Sepehri, P. Pad, A. C. Y\u00fcz\u00fcg\u00fcler, P. Frossard, and L. A. Dunbar. 2024. Hierarchical training of deep neural networks using early exiting. IEEE Transactions on Neural Networks and Learning Systems TNNLS.2024.3396628 (2024), 15.","journal-title":"IEEE Transactions on Neural Networks and Learning Systems"},{"key":"e_1_3_1_116_2","doi-asserted-by":"publisher","DOI":"10.1145\/3097983.3098177"},{"key":"e_1_3_1_117_2","unstructured":"Karen Simonyan and Andrew Zisserman. 2014. Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014)."},{"key":"e_1_3_1_118_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2020.acl-main.504"},{"key":"e_1_3_1_119_2","doi-asserted-by":"publisher","DOI":"10.5555\/2627435.2670313"},{"key":"e_1_3_1_120_2","unstructured":"Rupesh Kumar Srivastava Klaus Greff and J\u00fcrgen Schmidhuber. 2015. Highway networks. arXiv preprint arXiv:1505.00387 (2015)."},{"key":"e_1_3_1_121_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2019.00293"},{"key":"e_1_3_1_122_2","unstructured":"Shengkun Tang Yaqing Wang Caiwen Ding Yi Liang Yao Li and Dongkuan Xu. 2023. DeeDiff: Dynamic uncertainty-aware early exiting for accelerating diffusion model generation. 
arXiv preprint arXiv:2309.17074 (2023)."},{"key":"e_1_3_1_123_2","first-page":"6166","volume-title":"Proceedings of the International Conference on Machine Learning","author":"Tanno Ryutaro","year":"2019","unstructured":"Ryutaro Tanno, Kai Arulkumaran, Daniel Alexander, Antonio Criminisi, and Aditya Nori. 2019. Adaptive neural trees. In Proceedings of the International Conference on Machine Learning. PMLR, Long Beach, CA, USA, 6166\u20136175."},{"key":"e_1_3_1_124_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICPR.2016.7900006"},{"key":"e_1_3_1_125_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICDCS.2017.226"},{"key":"e_1_3_1_126_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICCE53296.2022.9730182"},{"key":"e_1_3_1_127_2","volume-title":"Proceedings of the Advances in Neural Information Processing Systems 30","author":"Vaswani Ashish","year":"2017","unstructured":"Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, \u0141ukasz Kaiser, and Illia Polosukhin. 2017. Attention is all you need. In Proceedings of the Advances in Neural Information Processing Systems 30. Curran Associates, Inc., Long Beach, CA, USA."},{"key":"e_1_3_1_128_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-01246-5_1"},{"key":"e_1_3_1_129_2","doi-asserted-by":"publisher","DOI":"10.1145\/2744769.2744904"},{"issue":"1","key":"e_1_3_1_130_2","first-page":"7068349","article-title":"Deep learning for computer vision: A brief review","volume":"2018","author":"Voulodimos Athanasios","year":"2018","unstructured":"Athanasios Voulodimos, Nikolaos Doulamis, Anastasios Doulamis, and Eftychios Protopapadakis. 2018. Deep learning for computer vision: A brief review. Computational Intelligence and Neuroscience 2018, 1 (2018), 7068349.","journal-title":"Computational Intelligence and Neuroscience"},{"key":"e_1_3_1_131_2","unstructured":"Tai Vu Emily Wen and Roy Nehoran. 2020. 
How not to give a FLOP: Combining regularization and pruning for efficient inference. arXiv preprint arXiv:2003.13593 (2020)."},{"key":"e_1_3_1_132_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2017.306"},{"key":"e_1_3_1_133_2","doi-asserted-by":"publisher","DOI":"10.1109\/SiPS47522.2019.9020551"},{"key":"e_1_3_1_134_2","doi-asserted-by":"publisher","DOI":"10.1109\/JIOT.2023.3293506"},{"key":"e_1_3_1_135_2","volume-title":"Proceedings of the 34th Conference on Uncertainty in Artificial Intelligence (UAI)","author":"Wang Xin","year":"2018","unstructured":"Xin Wang, Yujia Luo, Daniel Crankshaw, Alexey Tumanov, Fisher Yu, and Joseph E. Gonzalez. 2018. Idk cascades: Fast deep learning by learning not to overthink. In Proceedings of the 34th Conference on Uncertainty in Artificial Intelligence (UAI). AUAI Press, Monterey, CA, USA, 11."},{"key":"e_1_3_1_136_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-01261-8_25"},{"key":"e_1_3_1_137_2","doi-asserted-by":"publisher","DOI":"10.1109\/JSTSP.2020.2979669"},{"key":"e_1_3_1_138_2","doi-asserted-by":"publisher","DOI":"10.3389\/fphys.2023.1171467"},{"key":"e_1_3_1_139_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2018.00919"},{"key":"e_1_3_1_140_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2020.sustainlp-1.11"},{"key":"e_1_3_1_141_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2020.acl-main.204"},{"key":"e_1_3_1_142_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2021.eacl-main.8"},{"key":"e_1_3_1_143_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-58517-4_17"},{"key":"e_1_3_1_144_2","doi-asserted-by":"publisher","DOI":"10.1145\/3581783.3611762"},{"key":"e_1_3_1_145_2","doi-asserted-by":"crossref","unstructured":"Alexandru Rancea Ionut Anghel and Tudor Cioara. 2024. Edge Computing in Healthcare: Innovations Opportunities and Challenges. 
Future Internet 16 9 (2024) 329.","DOI":"10.3390\/fi16090329"},{"key":"e_1_3_1_146_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR42600.2020.00244"},{"key":"e_1_3_1_147_2","doi-asserted-by":"publisher","DOI":"10.1109\/TCDS.2023.3274214"},{"key":"e_1_3_1_148_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/P17-1172"},{"key":"e_1_3_1_149_2","first-page":"1","article-title":"A dynamic transformer network with early exit mechanism for fast detection of multiscale surface defects","volume":"72","author":"Yu Haitao","year":"2023","unstructured":"Haitao Yu, Dongliang Liu, Zhen Zhang, and Jiang Wang. 2023. A dynamic transformer network with early exit mechanism for fast detection of multiscale surface defects. IEEE Transactions on Instrumentation and Measurement 72 (2023), 1\u201310.","journal-title":"IEEE Transactions on Instrumentation and Measurement"},{"key":"e_1_3_1_150_2","doi-asserted-by":"publisher","DOI":"10.1109\/MNET.001.1800506"},{"key":"e_1_3_1_151_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2023.acl-srw.43"},{"key":"e_1_3_1_152_2","doi-asserted-by":"publisher","DOI":"10.1109\/MCOM.2018.1701370"},{"key":"e_1_3_1_153_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2018.00068"},{"key":"e_1_3_1_154_2","volume-title":"Proceedings of the Advances in Neural Information Processing Systems","volume":"28","author":"Zhang Xiang","year":"2015","unstructured":"Xiang Zhang, Junbo Zhao, and Yann LeCun. 2015. Character-level convolutional networks for text classification. In Proceedings of the Advances in Neural Information Processing Systems, C. Cortes, N. Lawrence, D. Lee, M. Sugiyama, and R. Garnett (Eds.). Vol. 28. Curran Associates, Inc., Montreal, Quebec, Canada."},{"key":"e_1_3_1_155_2","doi-asserted-by":"crossref","unstructured":"Shaojun Zhang Wei Li Yongwei Wu Paul Watson and Albert Zomaya. 2018. Enabling edge intelligence for activity recognition in smart homes. 
In 2018 IEEE 15th International Conference on Mobile Ad Hoc and Sensor Systems (MASS). IEEE 228--236.","DOI":"10.1109\/MASS.2018.00044"},{"key":"e_1_3_1_156_2","doi-asserted-by":"publisher","DOI":"10.1145\/3450268.3453520"},{"key":"e_1_3_1_157_2","first-page":"18330","article-title":"Bert loses patience: Fast and robust inference with early exit","volume":"33","author":"Zhou Wangchunshu","year":"2020","unstructured":"Wangchunshu Zhou, Canwen Xu, Tao Ge, Julian McAuley, Ke Xu, and Furu Wei. 2020. Bert loses patience: Fast and robust inference with early exit. Advances in Neural Information Processing Systems 33 (2020), 18330\u201318341.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_1_158_2","doi-asserted-by":"publisher","DOI":"10.1109\/JPROC.2019.2918951"},{"key":"e_1_3_1_159_2","first-page":"4510","volume-title":"Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)","author":"Zhu Menglong","year":"2018","unstructured":"Menglong Zhu and Andrey Zhmoginov Liang-Chieh Chen. 2018. Mobilenetv2: Inverted residuals and linear bottlenecks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 
IEEE, Salt Lake City, UT, USA, 4510\u20134520."},{"key":"e_1_3_1_160_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2021.emnlp-main.242"}],"container-title":["ACM Computing Surveys"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3698767","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3698767","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,19]],"date-time":"2025-06-19T01:09:44Z","timestamp":1750295384000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3698767"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,11,22]]},"references-count":159,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2025,3,31]]}},"alternative-id":["10.1145\/3698767"],"URL":"https:\/\/doi.org\/10.1145\/3698767","relation":{},"ISSN":["0360-0300","1557-7341"],"issn-type":[{"value":"0360-0300","type":"print"},{"value":"1557-7341","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,11,22]]},"assertion":[{"value":"2022-11-27","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2024-09-25","order":2,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2024-11-22","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}