{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,15]],"date-time":"2026-04-15T18:33:06Z","timestamp":1776277986022,"version":"3.50.1"},"reference-count":69,"publisher":"Association for Computing Machinery (ACM)","issue":"3","content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Evol. Learn. Optim."],"published-print":{"date-parts":[[2025,9,30]]},"abstract":"<jats:p>Deploying deep neural networks (DNNs) on microcontrollers (TinyML) is a common trend to process the increasing amount of sensor data generated at the edge, but in practice, resource and latency constraints make it difficult to find optimal DNN candidates. Neural architecture search (NAS) is an excellent approach to automate this search and can easily be combined with DNN compression techniques commonly used in TinyML. However, many NAS techniques are not only computationally expensive, especially hyperparameter optimization (HPO), but also often focus on optimizing only a single objective, e.g., maximizing accuracy, without considering additional objectives such as memory requirements or computational complexity of a DNN, which are key to making deployment at the edge feasible. In this article, we propose a novel NAS strategy for TinyML based on multi-objective Bayesian optimization (MOBOpt) and an ensemble of competing parametric policies trained using augmented random search (ARS) reinforcement learning (RL) agents. Our methodology aims at efficiently finding tradeoffs between a DNN\u2019s predictive accuracy, memory requirements on a given target system, and computational complexity. Our experiments show that we consistently outperform existing MOBOpt approaches on different datasets and architectures such as ResNet-18 and MobileNetv3.<\/jats:p>","DOI":"10.1145\/3715012","type":"journal-article","created":{"date-parts":[[2025,1,23]],"date-time":"2025-01-23T08:48:01Z","timestamp":1737622081000},"page":"1-21","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":3,"title":["Combining Multi-Objective Bayesian Optimization with Reinforcement Learning for TinyML"],"prefix":"10.1145","volume":"5","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-8932-5212","authenticated-orcid":false,"given":"Mark","family":"Deutel","sequence":"first","affiliation":[{"name":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg, Erlangen, Germany and Fraunhofer IIS, Fraunhofer Institute for Integrated Circuits IIS, Erlangen, Germany"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-7051-0960","authenticated-orcid":false,"given":"Georgios","family":"Kontes","sequence":"additional","affiliation":[{"name":"Fraunhofer IIS, Fraunhofer Institute for Integrated Circuits IIS, Erlangen, Germany"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-8108-0230","authenticated-orcid":false,"given":"Christopher","family":"Mutschler","sequence":"additional","affiliation":[{"name":"Fraunhofer IIS, Fraunhofer Institute for Integrated Circuits IIS, Erlangen, Germany"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-6285-5862","authenticated-orcid":false,"given":"J\u00fcrgen","family":"Teich","sequence":"additional","affiliation":[{"name":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg, Erlangen, Germany"}]}],"member":"320","published-online":{"date-parts":[[2025,8,29]]},"reference":[{"key":"e_1_3_2_2_1","doi-asserted-by":"publisher","DOI":"10.1145\/3292500.3330701"},{"key":"e_1_3_2_3_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR52688.2022.01203"},{"key":"e_1_3_2_4_1","volume-title":"International Conference on Learning Representations","author":"Ashok Anubhav","year":"2018","unstructured":"Anubhav Ashok, Nicholas Rhinehart, Fares Beainy, and Kris M. Kitani. 2018. N2N learning: Network to network compression via policy gradient reinforcement learning. In International Conference on Learning Representations."},{"key":"e_1_3_2_5_1","doi-asserted-by":"crossref","unstructured":"A. Bagnall J. Lines A. Bostrom J. Large and E. Keogh. 2017. The great time series classification bake off: A review and experimental evaluation of recent algorithmic advances. Data Mining and Knowledge Discovery 31 3 (2017) 606\u2013660.","DOI":"10.1007\/s10618-016-0483-9"},{"key":"e_1_3_2_6_1","first-page":"21524","volume-title":"Conference on Neural Information Processing Systems","author":"Balandat Maximilian","year":"2020","unstructured":"Maximilian Balandat, Brian Karrer, Daniel Jiang, Samuel Daulton, Ben Letham, Andrew G. Wilson, and Eytan Bakshy. 2020. BoTorch: A framework for efficient Monte-Carlo Bayesian optimization. In Conference on Neural Information Processing Systems, 21524\u201321538."},{"key":"e_1_3_2_7_1","volume-title":"Conference on Neural Information Processing Systems","author":"Banbury Colby","year":"2021","unstructured":"Colby Banbury, Vijay Janapa Reddi, Peter Torelli, Jeremy Holleman, Nat Jeffries, Csaba Kiraly, Pietro Montino, David Kanter, Sebastian Ahmed, Danilo Pau, et\u00a0al. 2021. MLPerf tiny benchmark. In Conference on Neural Information Processing Systems."},{"key":"e_1_3_2_8_1","unstructured":"James Bergstra Daniel Yamins and David Cox. 2013. Making a science of model search: Hyperparameter optimization in hundreds of dimensions for vision architectures. In International Conference on Machine Learning. PMLR 115\u2013123."},{"key":"e_1_3_2_9_1","first-page":"527","volume-title":"International Conference on Machine Learning","author":"Bolukbasi Tolga","year":"2017","unstructured":"Tolga Bolukbasi, Joseph Wang, Ofer Dekel, and Venkatesh Saligrama. 2017. Adaptive neural networks for efficient inference. In International Conference on Machine Learning. PMLR, 527\u2013536."},{"key":"e_1_3_2_10_1","volume-title":"International Conference on Learning Representations","author":"Cai Han","year":"2020","unstructured":"Han Cai, Chuang Gan, Tianzhe Wang, Zhekai Zhang, and Song Han. 2020. Once-for-all: Train one network and specialize it for efficient deployment. In International Conference on Learning Representations."},{"key":"e_1_3_2_11_1","volume-title":"International Conference on Learning Representations","author":"Cai Han","year":"2018","unstructured":"Han Cai, Ligeng Zhu, and Song Han. 2018. ProxylessNAS: Direct neural architecture search on target task and hardware. In International Conference on Learning Representations."},{"key":"e_1_3_2_12_1","volume-title":"International Conference on Learning Representations","author":"Cao Shengcao","year":"2018","unstructured":"Shengcao Cao, Xiaofang Wang, and Kris M. Kitani. 2018. Learnable embedding space for efficient neural architecture compression. In International Conference on Learning Representations."},{"key":"e_1_3_2_13_1","doi-asserted-by":"crossref","unstructured":"Krishna Teja Chitty-Venkata and Arun K. Somani. 2022. Neural architecture search survey: A hardware perspective. ACM Computing Surveys 55 4 (2022) 1\u201336.","DOI":"10.1145\/3524500"},{"key":"e_1_3_2_14_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2019.00489"},{"key":"e_1_3_2_15_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2017.195"},{"key":"e_1_3_2_16_1","unstructured":"Samuel Daulton David Eriksson Maximilian Balandat and Eytan Bakshy. 2022. Multi-objective Bayesian optimization over high-dimensional search spaces. In Uncertainty in Artificial Intelligence. PMLR 507\u2013517."},{"key":"e_1_3_2_17_1","first-page":"800","volume-title":"Proceedings of Machine Learning and Systems","volume":"3","author":"David Robert","year":"2021","unstructured":"Robert David, Jared Duke, Advait Jain, Vijay Janapa Reddi, Nat Jeffries, Jian Li, Nick Kreeger, Ian Nappier, Meghna Natraj, Tiezhen Wang, et\u00a0al. 2021. Tensorflow lite micro: Embedded machine learning for TinyML systems. Proceedings of Machine Learning and Systems 3 (2021), 800\u2013811."},{"key":"e_1_3_2_18_1","doi-asserted-by":"crossref","unstructured":"K. Deb A. Pratap S. Agarwal and T. Meyarivan. 2002. A fast and elitist multiobjective genetic algorithm: NSGA-II. IEEE Transactions on Evolutionary Computation 6 2 (2002) 182\u2013197.","DOI":"10.1109\/4235.996017"},{"key":"e_1_3_2_19_1","unstructured":"Mark Deutel Philipp Woller Christopher Mutschler and Juergen Teich. 2023. Energy-efficient deployment of deep learning applications on cortex-M based microcontrollers using deep compression. In 26th Workshop on MBMV 2023 1\u201312."},{"key":"e_1_3_2_20_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-01252-6_32"},{"key":"e_1_3_2_21_1","volume-title":"International Conference on Learning Representations","author":"Elsken Thomas","year":"2018","unstructured":"Thomas Elsken, Jan Hendrik Metzen, and Frank Hutter. 2018. Efficient multi-objective neural architecture search via Lamarckian evolution. In International Conference on Learning Representations."},{"key":"e_1_3_2_22_1","volume-title":"Conference on Neural Information Processing Systems","author":"Eriksson David","year":"2019","unstructured":"David Eriksson, Michael Pearce, Jacob R. Gardner, Ryan Turner, and Matthias Poloczek. 2019. Scalable global optimization via local bayesian optimization. In Conference on Neural Information Processing Systems."},{"key":"e_1_3_2_23_1","volume-title":"International Conference on Learning Representations","author":"Gao Xitong","year":"2019","unstructured":"Xitong Gao, Yiren Zhao, \u0141ukasz Dudziak, Robert Mullins, and Cheng-zhong Xu. 2019. Dynamic channel pruning: Feature boosting and suppression. In International Conference on Learning Representations."},{"key":"e_1_3_2_24_1","doi-asserted-by":"publisher","DOI":"10.1201\/9781003162810-13"},{"key":"e_1_3_2_25_1","volume-title":"International Conference on Learning Representations","author":"Han Song","year":"2016","unstructured":"Song Han, Huizi Mao, and William J. Dally. 2016. Deep compression: Compressing deep neural networks with pruning, trained quantization and Huffman coding. In International Conference on Learning Representations."},{"key":"e_1_3_2_26_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2016.90"},{"key":"e_1_3_2_27_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-01234-2_48"},{"key":"e_1_3_2_28_1","unstructured":"Geoffrey Hinton Oriol Vinyals and Jeff Dean. 2015. Distilling the knowledge in a neural network. arXiv:1503.02531. Retrieved from https:\/\/arxiv.org\/abs\/1503.02531"},{"key":"e_1_3_2_29_1","doi-asserted-by":"crossref","unstructured":"Lior Hirsch and Gilad Katz. 2022. Multi-objective pruning of dense neural networks using deep reinforcement learning. Information Sciences 610 (2022) 381\u2013400.","DOI":"10.1016\/j.ins.2022.07.134"},{"key":"e_1_3_2_30_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2019.00140"},{"key":"e_1_3_2_31_1","unstructured":"Andrew G. Howard Menglong Zhu Bo Chen Dmitry Kalenichenko Weijun Wang Tobias Weyand Marco Andreetto and Hartwig Adam. 2017. MobileNets: Efficient convolutional neural networks for mobile vision applications. arXiv:1704.04861. Retrieved from https:\/\/arxiv.org\/abs\/1704.04861"},{"key":"e_1_3_2_32_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2018.00745"},{"key":"e_1_3_2_33_1","doi-asserted-by":"crossref","unstructured":"Hassan Ismail Fawaz Benjamin Lucas Germain Forestier Charlotte Pelletier Daniel F. Schmidt Jonathan Weber Geoffrey I. Webb Lhassane Idoumghar Pierre-Alain Muller and Fran\u00e7ois Petitjean. 2020. InceptionTime: Finding AlexNet for time series classification. Data Mining and Knowledge Discovery 34 6 (2020) 1936\u20131962.","DOI":"10.1007\/s10618-020-00710-y"},{"key":"e_1_3_2_34_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2018.00286"},{"key":"e_1_3_2_35_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.jcp.2022.111263"},{"key":"e_1_3_2_36_1","doi-asserted-by":"crossref","unstructured":"J. Knowles. 2006. ParEGO: A hybrid algorithm with on-line landscape approximation for expensive multiobjective optimization problems. IEEE Transactions on Evolutionary Computation 10 1 (2006) 50\u201366.","DOI":"10.1109\/TEVC.2005.851274"},{"key":"e_1_3_2_37_1","unstructured":"Alex Krizhevsky. 2012. Learning multiple layers of features from tiny images. Technical Report. University of Toronto."},{"key":"e_1_3_2_38_1","doi-asserted-by":"crossref","unstructured":"Heike Leutheuser Dominik Schuldhaus and Bjoern M. Eskofier. 2013. Hierarchical multi-sensor based classification of daily life activities: Comparison with state-of-the-art algorithms using a benchmark dataset. PLoS One 8 10 (2013) e75196.","DOI":"10.1371\/journal.pone.0075196"},{"key":"e_1_3_2_39_1","unstructured":"Hao Li Asim Kadav Igor Durdanovic Hanan Samet and Hans Peter Graf. 2017. Pruning filters for efficient ConvNets. arXiv:1608.08710. Retrieved from https:\/\/arxiv.org\/abs\/1608.08710"},{"key":"e_1_3_2_40_1","doi-asserted-by":"crossref","unstructured":"Kaiwen Li Tao Zhang and Rui Wang. 2020. Deep reinforcement learning for multiobjective optimization. IEEE Transactions on Cybernetics 51 6 (2020) 3103\u20133114.","DOI":"10.1109\/TCYB.2020.2977661"},{"key":"e_1_3_2_41_1","unstructured":"Timothy P. Lillicrap Jonathan J. Hunt Alexander Pritzel Nicolas Heess Tom Erez Yuval Tassa David Silver and Daan Wierstra. 2015. Continuous control with deep reinforcement learning. arXiv:1509.02971. Retrieved from https:\/\/arxiv.org\/abs\/1509.02971"},{"key":"e_1_3_2_42_1","volume-title":"Conference on Neural Information Processing Systems","author":"Lin Ji","year":"2021","unstructured":"Ji Lin, Wei-Ming Chen, Han Cai, Chuang Gan, and Song Han. 2021. MCUNetV2: Memory-efficient patch-based inference for tiny deep learning. In Conference on Neural Information Processing Systems."},{"key":"e_1_3_2_43_1","volume-title":"Conference on Neural Information Processing Systems","author":"Lin Ji","year":"2020","unstructured":"Ji Lin, Wei-Ming Chen, Yujun Lin, John Cohn, Chuang Gan, and Song Han. 2020. MCUNet: Tiny deep learning on IoT devices. In Conference on Neural Information Processing Systems."},{"key":"e_1_3_2_44_1","volume-title":"International Conference on Learning Representations","author":"Liu Hanxiao","year":"2019","unstructured":"Hanxiao Liu, Karen Simonyan, and Yiming Yang. 2019. DARTS: Differentiable architecture search. In International Conference on Learning Representations."},{"key":"e_1_3_2_45_1","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v32i1.11630"},{"key":"e_1_3_2_46_1","doi-asserted-by":"publisher","DOI":"10.1145\/3321707.3321729"},{"key":"e_1_3_2_47_1","first-page":"1805","volume-title":"Conference on Neural Information Processing Systems","author":"Mania Horia","year":"2018","unstructured":"Horia Mania, Aurelia Guy, and Benjamin Recht. 2018. Simple random search of static linear policies is competitive for reinforcement learning. In Conference on Neural Information Processing Systems, 1805\u20131814."},{"key":"e_1_3_2_48_1","doi-asserted-by":"crossref","unstructured":"Michael D. McKay Richard J. Beckman and William J. Conover. 2000. A comparison of three methods for selecting values of input variables in the analysis of output from a computer code. Technometrics 42 1 (2000) 55\u201361.","DOI":"10.1080\/00401706.2000.10485979"},{"key":"e_1_3_2_49_1","first-page":"107","article-title":"Multi-objective reinforcement learning using sets of pareto dominating policies","volume":"15","author":"Moffaert Kristof Van","year":"2014","unstructured":"Kristof Van Moffaert and Ann Now\u00e9. 2014. Multi-objective reinforcement learning using sets of pareto dominating policies. Journal of Machine Learning Research 15, 107 (2014), 3663\u20133692.","journal-title":"Journal of Machine Learning Research"},{"key":"e_1_3_2_50_1","doi-asserted-by":"publisher","DOI":"10.5555\/646296.687872"},{"key":"e_1_3_2_51_1","unstructured":"Biswajit Paria Kirthevasan Kandasamy and Barnab\u00e1s P\u00f3czos. 2020. A flexible framework for multi-objective Bayesian optimization using random scalarizations. In Uncertainty in Artificial Intelligence. PMLR 766\u2013776."},{"key":"e_1_3_2_52_1","unstructured":"Kevin M. Passino. 2005. Biomimicry for Optimization Control and Automation. Springer Science & Business Media."},{"key":"e_1_3_2_53_1","first-page":"268","article-title":"Stable-Baselines3: Reliable reinforcement learning implementations","volume":"22","author":"Raffin Antonin","year":"2021","unstructured":"Antonin Raffin, Ashley Hill, Adam Gleave, Anssi Kanervisto, Maximilian Ernestus, and Noah Dormann. 2021. Stable-Baselines3: Reliable reinforcement learning implementations. Journal of Machine Learning Research 22, 268 (2021), 1\u20138.","journal-title":"Journal of Machine Learning Research"},{"key":"e_1_3_2_54_1","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v33i01.33014780"},{"key":"e_1_3_2_55_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2018.00474"},{"key":"e_1_3_2_56_1","first-page":"1889","volume-title":"International Conference on Machine Learning","author":"Schulman John","year":"2015","unstructured":"John Schulman, Sergey Levine, Pieter Abbeel, Michael Jordan, and Philipp Moritz. 2015. Trust region policy optimization. In International Conference on Machine Learning. PMLR, 1889\u20131897."},{"key":"e_1_3_2_57_1","unstructured":"John Schulman Filip Wolski Prafulla Dhariwal Alec Radford and Oleg Klimov. 2017. Proximal policy optimization algorithms. arXiv:1707.06347. Retrieved from https:\/\/arxiv.org\/abs\/1707.06347"},{"key":"e_1_3_2_58_1","first-page":"6906","volume-title":"Conference on Neural Information Processing Systems","author":"Stanton Samuel","year":"2021","unstructured":"Samuel Stanton, Pavel Izmailov, Polina Kirichenko, Alexander A. Alemi, and Andrew G. Wilson. 2021. Does knowledge distillation really work? In Conference on Neural Information Processing Systems, 6906\u20136919."},{"key":"e_1_3_2_59_1","doi-asserted-by":"crossref","unstructured":"Richard S. Sutton and Andrew G. Barto. 1999. Reinforcement learning: An introduction. Robotica 17 2 (1999) 229\u2013235.","DOI":"10.1017\/S0263574799211174"},{"key":"e_1_3_2_60_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2019.00293"},{"key":"e_1_3_2_61_1","unstructured":"Haibin Wang Ce Ge Hesen Chen and Xiuyu Sun. 2023. PreNAS: Preferred one-shot learning towards efficient neural architecture search. arXiv:2304.14636. Retrieved from https:\/\/arxiv.org\/abs\/2304.14636"},{"key":"e_1_3_2_62_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-01261-8_25"},{"key":"e_1_3_2_63_1","unstructured":"Colin White Mikhail Khodak Renbo Tu Shital Shah S\u00e9bastien Bubeck and Debadeepta Dey. 2022. A deeper look at zero-cost proxies for lightweight NAS. ICLR Blog Track (2022). Retrieved from https:\/\/iclr-blog-track.github.io\/2022\/03\/25\/zero-cost-proxies\/"},{"key":"e_1_3_2_64_1","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v35i12.17233"},{"key":"e_1_3_2_65_1","unstructured":"Colin White Mahmoud Safari Rhea Sukthanker Binxin Ru Thomas Elsken Arber Zela Debadeepta Dey and Frank Hutter. 2023. Neural architecture search: Insights from 1000 papers. arXiv:2301.08727. Retrieved from https:\/\/arxiv.org\/abs\/2301.08727"},{"key":"e_1_3_2_66_1","volume-title":"Conference on Neural Information Processing Systems","volume":"8","author":"Williams Christopher","year":"1995","unstructured":"Christopher Williams and Carl Rasmussen. 1995. Gaussian processes for regression. In Conference on Neural Information Processing Systems. D. Touretzky, M. C. Mozer, and M. Hasselmo (Eds.), Vol. 8, MIT Press."},{"key":"e_1_3_2_67_1","unstructured":"James T. Wilson Riccardo Moriconi Frank Hutter and Marc Peter Deisenroth. 2017. The reparameterization trick for acquisition functions. In NIPS 2017 Workshop on Bayesian Optimization."},{"key":"e_1_3_2_68_1","volume-title":"International Conference on Learning Representations","author":"Zela Arber","year":"2020","unstructured":"Arber Zela, Thomas Elsken, Tonmoy Saikia, Yassine Marrakchi, Thomas Brox, and Frank Hutter. 2020. Understanding and robustifying differentiable architecture search. In International Conference on Learning Representations."},{"key":"e_1_3_2_69_1","unstructured":"Michael Zhu and Suyog Gupta. 2017. To prune or not to prune: Exploring the efficacy of pruning for model compression. arXiv:1710.01878. Retrieved from https:\/\/arxiv.org\/abs\/1710.01878"},{"key":"e_1_3_2_70_1","volume-title":"International Conference on Learning Representations","author":"Zoph Barret","year":"2017","unstructured":"Barret Zoph and Quoc Le. 2017. Neural architecture search with reinforcement learning. In International Conference on Learning Representations."}],"container-title":["ACM Transactions on Evolutionary Learning and Optimization"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3715012","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,8,29]],"date-time":"2025-08-29T15:23:03Z","timestamp":1756480983000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3715012"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,8,29]]},"references-count":69,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2025,9,30]]}},"alternative-id":["10.1145\/3715012"],"URL":"https:\/\/doi.org\/10.1145\/3715012","relation":{},"ISSN":["2688-3007"],"issn-type":[{"value":"2688-3007","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,8,29]]},"assertion":[{"value":"2023-11-30","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2025-01-07","order":2,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2025-08-29","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}