{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,11,27]],"date-time":"2025-11-27T20:57:41Z","timestamp":1764277061935,"version":"3.37.3"},"reference-count":50,"publisher":"Springer Science and Business Media LLC","issue":"6","license":[{"start":{"date-parts":[[2021,6,1]],"date-time":"2021-06-01T00:00:00Z","timestamp":1622505600000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2021,6,4]],"date-time":"2021-06-04T00:00:00Z","timestamp":1622764800000},"content-version":"vor","delay-in-days":3,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/501100001659","name":"Deutsche Forschungsgemeinschaft","doi-asserted-by":"publisher","award":["SFB 876"],"award-info":[{"award-number":["SFB 876"]}],"id":[{"id":"10.13039\/501100001659","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100016378","name":"Technische Universit\u00e4t Dortmund","doi-asserted-by":"crossref","id":[{"id":"10.13039\/501100016378","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Mach Learn"],"published-print":{"date-parts":[[2021,6]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>The predictive performance of a machine learning model highly depends on the corresponding hyper-parameter setting. Hence, hyper-parameter tuning is often indispensable. Normally such tuning requires the dedicated machine learning model to be trained and evaluated on centralized data to obtain a performance estimate. However, in a distributed machine learning scenario, it is not always possible to collect all the data from all nodes due to privacy concerns or storage limitations. Moreover, if data has to be transferred through low bandwidth connections it reduces the time available for tuning. 
Model-Based Optimization (MBO) is one state-of-the-art method for tuning hyper-parameters but the application on distributed machine learning models or federated learning lacks research. This work proposes a framework<jats:inline-formula><jats:alternatives><jats:tex-math>$$\\textit{MODES}$$<\/jats:tex-math><mml:math xmlns:mml=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"><mml:mi>MODES<\/mml:mi><\/mml:math><\/jats:alternatives><\/jats:inline-formula>that allows to deploy MBO on resource-constrained distributed embedded systems. Each node trains an individual model based on its local data. The goal is to optimize the combined prediction accuracy. The presented framework offers two optimization modes: (1)<jats:inline-formula><jats:alternatives><jats:tex-math>$$\\textit{MODES}$$<\/jats:tex-math><mml:math xmlns:mml=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"><mml:mi>MODES<\/mml:mi><\/mml:math><\/jats:alternatives><\/jats:inline-formula>-B considers the whole ensemble as a single black box and optimizes the hyper-parameters of each individual model jointly, and (2)<jats:inline-formula><jats:alternatives><jats:tex-math>$$\\textit{MODES}$$<\/jats:tex-math><mml:math xmlns:mml=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"><mml:mi>MODES<\/mml:mi><\/mml:math><\/jats:alternatives><\/jats:inline-formula>-I considers all models as clones of the same black box which allows it to efficiently parallelize the optimization in a distributed setting. We evaluate<jats:inline-formula><jats:alternatives><jats:tex-math>$$\\textit{MODES}$$<\/jats:tex-math><mml:math xmlns:mml=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"><mml:mi>MODES<\/mml:mi><\/mml:math><\/jats:alternatives><\/jats:inline-formula>by conducting experiments on the optimization for the hyper-parameters of a random forest and a multi-layer perceptron. 
The experimental results demonstrate that, with an improvement in terms of mean accuracy (<jats:inline-formula><jats:alternatives><jats:tex-math>$$\\textit{MODES}$$<\/jats:tex-math><mml:math xmlns:mml=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"><mml:mi>MODES<\/mml:mi><\/mml:math><\/jats:alternatives><\/jats:inline-formula>-B), run-time efficiency (<jats:inline-formula><jats:alternatives><jats:tex-math>$$\\textit{MODES}$$<\/jats:tex-math><mml:math xmlns:mml=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"><mml:mi>MODES<\/mml:mi><\/mml:math><\/jats:alternatives><\/jats:inline-formula>-I), and statistical stability for both modes,<jats:inline-formula><jats:alternatives><jats:tex-math>$$\\textit{MODES}$$<\/jats:tex-math><mml:math xmlns:mml=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"><mml:mi>MODES<\/mml:mi><\/mml:math><\/jats:alternatives><\/jats:inline-formula>outperforms the baseline, i.e., carry out tuning with MBO on each node individually with its local sub-data set.<\/jats:p>","DOI":"10.1007\/s10994-021-06014-6","type":"journal-article","created":{"date-parts":[[2021,6,4]],"date-time":"2021-06-04T16:03:02Z","timestamp":1622822582000},"page":"1527-1547","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":6,"title":["MODES: model-based optimization on distributed embedded 
systems"],"prefix":"10.1007","volume":"110","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-9879-1394","authenticated-orcid":false,"given":"Junjie","family":"Shi","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Jiang","family":"Bian","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Jakob","family":"Richter","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Kuan-Hsun","family":"Chen","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"J\u00f6rg","family":"Rahnenf\u00fchrer","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Haoyi","family":"Xiong","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Jian-Jia","family":"Chen","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"297","published-online":{"date-parts":[[2021,6,4]]},"reference":[{"key":"6014_CR2","unstructured":"Anguita, D., Ghio, A., et al. (2013). A public domain dataset for human activity recognition using smartphones. In Esann"},{"key":"6014_CR3","unstructured":"Baek, O. K. (2011). Data-centric distributed computing. US Patent 8060464."},{"key":"6014_CR4","unstructured":"Balandat, M., Karrer, B., Jiang, D. R., Daulton, S., Letham, B., Wilson, A. G., & Bakshy, E. (2020). BoTorch: A framework for efficient Monte\u2013Carlo Bayesian optimization. In Advances in neural information processing systems"},{"key":"6014_CR5","unstructured":"Bergstra, J., & Bengio, Y. (2012). Random search for hyper-parameter optimization. Journal of Machine Learning Research."},{"key":"6014_CR6","unstructured":"Bergstra, J., Yamins, D., & Cox, D. (2013). Making a science of model search: Hyperparameter optimization in hundreds of dimensions for vision architectures. 
In ICML (vol. 28, pp. 115\u2013123), Atlanta, Georgia, USA, 17\u201319 June PMLR."},{"key":"6014_CR7","doi-asserted-by":"crossref","unstructured":"Berk, J., Nguyen, V., Gupta, S., Rana, S., & Venkatesh, S. (2018). Exploration enhanced expected improvement for Bayesian optimization. In Machine learning and knowledge discovery in databases\u2014ECML\/PKDD proceedings, volume 11052 of lecture notes in computer science (pp. 621\u2013637). Springer.","DOI":"10.1007\/978-3-030-10928-8_37"},{"key":"6014_CR8","doi-asserted-by":"crossref","unstructured":"Bian, J., Xiong, H., Fu, Y., & Das, S. K. (2018). Cswa: Aggregation-free spatial-temporal community sensing. In AAAI conference on artificial intelligence (pp. 2087\u20132094).","DOI":"10.1609\/aaai.v32i1.11850"},{"key":"6014_CR9","unstructured":"Bischl, B., Richter, J., Bossek, J., Horn, D., Thomas, J., & Lang, M. (2017). mlrMBO: A modular framework for model-based optimization of expensive black-box functions. arXiv:1703.03373 [stat]"},{"issue":"3","key":"6014_CR10","doi-asserted-by":"publisher","first-page":"131","DOI":"10.1016\/S0168-1699(99)00046-0","volume":"24","author":"JA Blackard","year":"1999","unstructured":"Blackard, J. A., & Dean, D. J. (1999). Comparative accuracies of artificial neural networks and discriminant analysis in predicting forest cover types from cartographic variables. Computers and Electronics in Agriculture, 24(3), 131\u2013151.","journal-title":"Computers and Electronics in Agriculture"},{"key":"6014_CR11","doi-asserted-by":"crossref","unstructured":"Buschjager, S., Chen, K.-H., Chen, J.-J., & Morik, K. (2018). Realization of random forest for real-time evaluation through tree framing. In ICDM, IEEE.","DOI":"10.1109\/ICDM.2018.00017"},{"issue":"2","key":"6014_CR12","doi-asserted-by":"publisher","first-page":"469","DOI":"10.1093\/biomet\/asp007","volume":"96","author":"Y-B Chan","year":"2009","unstructured":"Chan, Y.-B., & Hall, P. (2009). 
Scale adjustments for classifiers in high-dimensional, low sample size settings. Biometrika, 96(2), 469\u2013478.","journal-title":"Biometrika"},{"key":"6014_CR13","volume-title":"Model selection and model averaging","author":"G Claeskens","year":"2008","unstructured":"Claeskens, G., Hjort, N. L., et al. (2008). Model selection and model averaging. Cambridge: Cambridge Books."},{"key":"6014_CR14","unstructured":"Coy, M. A. R., Rehbach, F., Eiben, A. E., & Bartz-Beielstein, T. (2020). Parallelized Bayesian optimization for problems with expensive evaluation functions. In Coello, C. A. C. (ed.), GECCO (pp. 231\u2013232). ACM."},{"key":"6014_CR15","unstructured":"Gal, Y., & Ghahramani, Z. (2016). Dropout as a Bayesian approximation: Representing model uncertainty in deep learning. In ICML (pp. 1050\u20131059). PMLR."},{"issue":"14\u201315","key":"6014_CR16","doi-asserted-by":"publisher","first-page":"2627","DOI":"10.1016\/S1352-2310(97)00447-0","volume":"32","author":"MW Gardner","year":"1998","unstructured":"Gardner, M. W., & Dorling, S. (1998). Artificial neural networks (the multilayer perceptron)\u2014A review of applications in the atmospheric sciences. Atmospheric Environment, 32(14\u201315), 2627\u20132636.","journal-title":"Atmospheric Environment"},{"key":"6014_CR17","unstructured":"Garg, A., Saha, A. K., & Dutta, D. (2020). Direct federated neural architecture search. arXiv:2010.06223"},{"key":"6014_CR18","doi-asserted-by":"crossref","unstructured":"Ginsbourger, D., Le Riche, R., & Carraro, L. (2010). Kriging is well-suited to parallelize optimization. In Computational intelligence in expensive optimization problems (pp. 131\u2013162). Springer.","DOI":"10.1007\/978-3-642-10701-6_6"},{"key":"6014_CR19","unstructured":"Graves, A. (2011). Practical variational inference for neural networks. In Advances in neural information processing systems (pp. 2348\u20132356). 
Citeseer."},{"key":"6014_CR20","doi-asserted-by":"crossref","unstructured":"Gu, Y., Do, H., Ou, Y., & Sheng, W. (2012). Human gesture recognition through a kinect sensor. In ROBIO (pp. 1379\u20131384). IEEE.","DOI":"10.1109\/ROBIO.2012.6491161"},{"issue":"2","key":"6014_CR21","doi-asserted-by":"publisher","first-page":"159","DOI":"10.1162\/106365601750190398","volume":"9","author":"N Hansen","year":"2001","unstructured":"Hansen, N., & Ostermeier, A. (2001). Completely derandomized self-adaptation in evolution strategies. Evolutionary Computation, 9(2), 159\u2013195.","journal-title":"Evolutionary Computation"},{"key":"6014_CR22","doi-asserted-by":"crossref","unstructured":"Hastie, T., Tibshirani, R., & Friedman, J. (2009). The elements of statistical learning: data mining, inference, and prediction. Springer.","DOI":"10.1007\/978-0-387-84858-7"},{"key":"6014_CR23","unstructured":"He, C., Annavaram, M., & Avestimehr, S. (2020). Fednas: Federated deep learning via neural architecture search. arXiv:2004.08546"},{"key":"6014_CR24","doi-asserted-by":"crossref","unstructured":"Hutter, F., Hoos, H., & Leyton-Brown, K. (2013). An evaluation of sequential model-based optimization for expensive blackbox functions. In GECCO (pp. 1209\u20131216).","DOI":"10.1145\/2464576.2501592"},{"key":"6014_CR25","doi-asserted-by":"crossref","unstructured":"Hutter, F., Hoos, H. H., & Leyton-Brown, K. (2011). Sequential model-based optimization for general algorithm configuration. In LION. Springer.","DOI":"10.1007\/978-3-642-25566-3_40"},{"key":"6014_CR26","doi-asserted-by":"crossref","unstructured":"Hutter, F., Hoos, H. H., & Leyton-Brown, K. (2012). Parallel algorithm configuration. Number 7219 in lecture notes in computer science. In Y. Hamadi & M. Schoenauer (Eds.), Learning and intelligent optimization (pp. 55\u201370). 
Springer.","DOI":"10.1007\/978-3-642-34413-8_5"},{"key":"6014_CR27","doi-asserted-by":"crossref","unstructured":"Janusevskis, J., Le Riche, R., Ginsbourger, D., & Girdziusas, R. (2012). Expected improvements for the asynchronous parallel global optimization of expensive functions: Potentials and challenges. In LION. Springer.","DOI":"10.1007\/978-3-642-34413-8_37"},{"issue":"4","key":"6014_CR28","doi-asserted-by":"publisher","first-page":"455","DOI":"10.1023\/A:1008306431147","volume":"13","author":"DR Jones","year":"1998","unstructured":"Jones, D. R., Schonlau, M., & Welch, W. J. (1998). Efficient global optimization of expensive black-box functions. Journal of Global Optimization, 13(4), 455\u2013492.","journal-title":"Journal of Global Optimization"},{"key":"6014_CR29","unstructured":"Kone\u010dn\u1ef3, J., McMahan, H. B., Yu, F. X., Richt\u00e1rik, P., Suresh, A. T., & Bacon, D. (2016). Federated learning: Strategies for improving communication efficiency. arXiv:1610.05492"},{"key":"6014_CR30","unstructured":"Kotthaus, H. (2018). Methods for efficient resource utilization in statistical machine learning algorithms. Ph.D. thesis, Technical University of Dortmund, Germany"},{"key":"6014_CR31","doi-asserted-by":"crossref","unstructured":"Kotthaus, H., Richter, J., Lang, A., Thomas, J., Bischl, B., Marwedel, P., et al. (2017). RAMBO: Resource-aware model-based optimization with scheduling for heterogeneous runtimes and a comparison with asynchronous model-based optimization. Lecture notes in computer science. In Learning and intelligent optimization (pp. 180\u2013195). Cham: Springer.","DOI":"10.1007\/978-3-319-69404-7_13"},{"key":"6014_CR32","doi-asserted-by":"crossref","unstructured":"Kotthaus, H., Sch\u00f6nberger, L., Lang, A., Chen, J., & Marwedel, P. (2019). Can flexible multi-core scheduling help to execute machine learning algorithms resource-efficiently? In SCOPES (pp. 59\u201362). 
ACM.","DOI":"10.1145\/3323439.3323986"},{"issue":"2","key":"6014_CR33","first-page":"341","volume":"52","author":"H-P Kriegel","year":"2017","unstructured":"Kriegel, H.-P., Schubert, E., & Zimek, A. (2017). The (black) art of runtime evaluation: Are we comparing algorithms or implementations? KAIS, 52(2), 341\u2013378.","journal-title":"KAIS"},{"key":"6014_CR34","unstructured":"LeCun, Y., Cortes, C., & Burges, C. J. (1998). The mnist database of handwritten digits. http:\/\/yann.lecun.com\/exdb\/mnist"},{"key":"6014_CR35","doi-asserted-by":"crossref","unstructured":"LeCun, Y. A., Bottou, L., Orr, G. B., & M\u00fcller, K.-R. (2012). Efficient backprop. In Neural networks: Tricks of the trade (pp. 9\u201348). Springer.","DOI":"10.1007\/978-3-642-35289-8_3"},{"key":"6014_CR36","doi-asserted-by":"crossref","unstructured":"Levinson, J., Askeland, J., Becker, J., Dolson, J., Held, D., Kammel, S., Kolter, J. Z., Langer, D., Pink, O., Pratt, V., et al. (2011). Towards fully autonomous driving: Systems and algorithms. In 2011 IEEE intelligent vehicles symposium (IV) (pp. 163\u2013168). IEEE.","DOI":"10.1109\/IVS.2011.5940562"},{"key":"6014_CR37","doi-asserted-by":"crossref","unstructured":"Li, L., Xiong, H., Wang, J., Xu, C.-Z., & Guo, Z. (2019). Smartpc: Hierarchical pace control in real-time federated learning system.","DOI":"10.1109\/RTSS46320.2019.00043"},{"issue":"3","key":"6014_CR38","first-page":"18","volume":"2","author":"A Liaw","year":"2002","unstructured":"Liaw, A., Wiener, M., et al. (2002). Classification and regression by randomforest. R News, 2(3), 18\u201322.","journal-title":"R News"},{"key":"6014_CR39","unstructured":"Loosli, G., Canu, S., & Bottou, L. (2007). Training invariant support vector machines using selective sampling. In L. Bottou, O. Chapelle, D. DeCoste, & J. Weston (Eds.), Large scale kernel machines (pp. 301\u2013320). Cambridge: MIT Press."},{"key":"6014_CR40","unstructured":"Nijssen, S., & Kok, J. (2006). 
Frequent subgraph miners: Runtimes don\u2019t say everything. In Proceedings of the workshop on mining and learning with graphs (pp. 173\u2013180)."},{"key":"6014_CR1","unstructured":"ODROID-N2. https:\/\/www.hardkernel.com\/shop\/odroid-n2-with-4gbyte-ram\/. Retrieved October 25, 2019."},{"key":"6014_CR41","first-page":"2825","volume":"12","author":"F Pedregosa","year":"2011","unstructured":"Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., et al. (2011). Scikit-learn: Machine learning in python. The Journal of Machine Learning Research, 12, 2825\u20132830.","journal-title":"The Journal of Machine Learning Research"},{"key":"6014_CR42","unstructured":"Rezende, D. J., Mohamed, S., & Wierstra, D. (2014). Stochastic backpropagation and approximate inference in deep generative models. In ICML (pp. 1278\u20131286). PMLR."},{"key":"6014_CR43","doi-asserted-by":"crossref","unstructured":"Richter, J., Kotthaus, H., Bischl, B., Marwedel, P., Rahnenf\u00fchrer, J., & Lang, M. (2016). Faster model-based optimization through resource-aware scheduling strategies. In Learning and intelligent optimization (pp. 267\u2013273). Springer.","DOI":"10.1007\/978-3-319-50349-3_22"},{"key":"6014_CR44","doi-asserted-by":"crossref","unstructured":"Shi, J., Bian, J., & Richter, J. (2021). Model-based optimization on distributed embedded system. https:\/\/github.com\/Strange369\/MODES-public","DOI":"10.1007\/s10994-021-06014-6"},{"key":"6014_CR45","doi-asserted-by":"crossref","unstructured":"Singh, I., Zhou, H., Yang, K., Ding, M., Lin, B., & Xie, P. (2020). Differentially-private federated neural architecture search. arXiv:2006.10559","DOI":"10.36227\/techrxiv.12503420.v1"},{"key":"6014_CR46","unstructured":"Snoek, J., Larochelle, H., & Adams, R. P. (2012). Practical Bayesian optimization of machine learning algorithms. In Advances in neural information processing systems (pp. 
2951\u20132959)."},{"issue":"1","key":"6014_CR47","first-page":"1929","volume":"15","author":"N Srivastava","year":"2014","unstructured":"Srivastava, N., Hinton, G., Krizhevsky, A., Sutskever, I., & Salakhutdinov, R. (2014). Dropout: A simple way to prevent neural networks from overfitting. The Journal of Machine Learning Research, 15(1), 1929\u20131958.","journal-title":"The Journal of Machine Learning Research"},{"key":"6014_CR48","unstructured":"Xiao, H., Rasul, K., & Vollgraf, R. (2017). Fashion-mnist: A novel image dataset for benchmarking machine learning algorithms. arXiv:1708.07747"},{"issue":"2","key":"6014_CR49","doi-asserted-by":"publisher","first-page":"49","DOI":"10.1109\/TBDATA.2015.2472014","volume":"1","author":"EP Xing","year":"2015","unstructured":"Xing, E. P., Ho, Q., Dai, W., Kim, J. K., Wei, J., Lee, S., et al. (2015). A new platform for distributed machine learning on big data. IEEE Transactions on Big Data, 1(2), 49\u201367.","journal-title":"IEEE Transactions on Big Data"},{"key":"6014_CR50","unstructured":"Zhu, H., & Jin, Y. (2020). Real-time federated evolutionary neural architecture search. 
arXiv:2003.02793"}],"container-title":["Machine Learning"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s10994-021-06014-6.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s10994-021-06014-6\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s10994-021-06014-6.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,12,29]],"date-time":"2022-12-29T22:43:52Z","timestamp":1672353832000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s10994-021-06014-6"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,6]]},"references-count":50,"journal-issue":{"issue":"6","published-print":{"date-parts":[[2021,6]]}},"alternative-id":["6014"],"URL":"https:\/\/doi.org\/10.1007\/s10994-021-06014-6","relation":{},"ISSN":["0885-6125","1573-0565"],"issn-type":[{"type":"print","value":"0885-6125"},{"type":"electronic","value":"1573-0565"}],"subject":[],"published":{"date-parts":[[2021,6]]},"assertion":[{"value":"20 September 2020","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"25 May 2021","order":2,"name":"revised","label":"Revised","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"27 May 2021","order":3,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"4 June 2021","order":4,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}}]}}