{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,16]],"date-time":"2026-04-16T04:40:22Z","timestamp":1776314422031,"version":"3.50.1"},"reference-count":96,"publisher":"Springer Science and Business Media LLC","issue":"3","license":[{"start":{"date-parts":[[2020,9,30]],"date-time":"2020-09-30T00:00:00Z","timestamp":1601424000000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2020,9,30]],"date-time":"2020-09-30T00:00:00Z","timestamp":1601424000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/100010663","name":"H2020 European Research Council","doi-asserted-by":"publisher","award":["ERC-StG-2015-ERC"],"award-info":[{"award-number":["ERC-StG-2015-ERC"]}],"id":[{"id":"10.13039\/100010663","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100010663","name":"H2020 European Research Council","doi-asserted-by":"publisher","award":["Project ID: 678082"],"award-info":[{"award-number":["Project ID: 678082"]}],"id":[{"id":"10.13039\/100010663","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Neural Process Lett"],"published-print":{"date-parts":[[2020,12]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Joining multiple decision-makers together is a powerful way to obtain more sophisticated decision-making systems, but requires to address the questions of division of labor and specialization. We investigate in how far information constraints in hierarchies of experts not only provide a principled method for regularization but also to enforce specialization. 
In particular, we devise an information-theoretically motivated on-line learning rule that allows partitioning of the problem space into multiple sub-problems that can be solved by the individual experts. We demonstrate two different ways to apply our method: (i) partitioning problems based on individual data samples and (ii) based on sets of data samples representing tasks. Approach (i) equips the system with the ability to solve complex decision-making problems by finding an optimal combination of local expert decision-makers. Approach (ii) leads to decision-makers specialized in solving families of tasks, which equips the system with the ability to solve meta-learning problems. We show the broad applicability of our approach on a range of problems including classification, regression, density estimation, and reinforcement learning problems, both in the standard machine learning setup and in a meta-learning setting.<\/jats:p>","DOI":"10.1007\/s11063-020-10351-3","type":"journal-article","created":{"date-parts":[[2020,9,30]],"date-time":"2020-09-30T07:03:56Z","timestamp":1601449436000},"page":"2319-2352","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":16,"title":["Specialization in Hierarchical Learning Systems"],"prefix":"10.1007","volume":"52","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-3244-3661","authenticated-orcid":false,"given":"Heinke","family":"Hihn","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-8637-6652","authenticated-orcid":false,"given":"Daniel A.","family":"Braun","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"297","published-online":{"date-parts":[[2020,9,30]]},"reference":[{"key":"10351_CR1","unstructured":"Abadi M, Barham P, Chen J, Chen Z, Davis A, Dean J, Devin M, Ghemawat S, Irving G, Isard M et al (2016) Tensorflow: a system for 
large-scale machine learning. In: 12th USENIX symposium on operating systems design and implementation (OSDI 16), pp 265\u2013283"},{"key":"10351_CR2","unstructured":"Abramova E, Dickens L, Kuhn D, Faisal A (2012) Hierarchical, heterogeneous control of non-linear dynamical systems using reinforcement learning. In: European workshop on reinforcement learning at ICML"},{"key":"10351_CR3","volume-title":"Organizations evolving","author":"H Aldrich","year":"1999","unstructured":"Aldrich H (1999) Organizations evolving. Sage, London"},{"key":"10351_CR4","doi-asserted-by":"crossref","unstructured":"Allamraju R, Chowdhary G (2017) Communication efficient decentralized Gaussian process fusion for multi-UAS path planning. In: Proceedings of the 2017 American control conference (ACC). IEEE, pp 4442\u20134447","DOI":"10.23919\/ACC.2017.7963639"},{"issue":"6","key":"10351_CR5","doi-asserted-by":"crossref","first-page":"26","DOI":"10.1109\/MSP.2017.2743240","volume":"34","author":"K Arulkumaran","year":"2017","unstructured":"Arulkumaran K, Deisenroth MP, Brundage M, Bharath AA (2017) Deep reinforcement learning: a brief survey. IEEE Signal Process Mag 34(6):26\u201338","journal-title":"IEEE Signal Process Mag"},{"key":"10351_CR6","doi-asserted-by":"crossref","unstructured":"Atkeson CG, Moore AW, Schaal S (1997) Locally weighted learning for control. In: Lazy learning. Springer, Berlin, pp 75\u2013113","DOI":"10.1007\/978-94-017-2053-3_3"},{"issue":"3","key":"10351_CR7","doi-asserted-by":"crossref","first-page":"1399","DOI":"10.1007\/s11063-018-9875-8","volume":"49","author":"S Balasundaram","year":"2019","unstructured":"Balasundaram S, Meena Y (2019) Robust support vector regression in primal with asymmetric huber loss. 
Neural Process Lett 49(3):1399\u20131431","journal-title":"Neural Process Lett"},{"issue":"3","key":"10351_CR8","doi-asserted-by":"crossref","first-page":"295","DOI":"10.1162\/neco.1989.1.3.295","volume":"1","author":"HB Barlow","year":"1989","unstructured":"Barlow HB (1989) Unsupervised learning. Neural Comput 1(3):295\u2013311","journal-title":"Neural Comput"},{"key":"10351_CR9","doi-asserted-by":"crossref","unstructured":"Bellmann P, Thiam P, Schwenker F (2018) Multi-classifier-systems: architectures, algorithms and applications. In: Computational intelligence for pattern recognition, Springer, Berlin, pp 83\u2013113","DOI":"10.1007\/978-3-319-89629-8_4"},{"issue":"7","key":"10351_CR10","doi-asserted-by":"crossref","first-page":"719","DOI":"10.1109\/34.865189","volume":"22","author":"C Biernacki","year":"2000","unstructured":"Biernacki C, Celeux G, Govaert G (2000) Assessing a mixture model for clustering with the integrated completed likelihood. IEEE Trans Pattern Anal Mach Intell 22(7):719\u2013725","journal-title":"IEEE Trans Pattern Anal Mach Intell"},{"key":"10351_CR11","doi-asserted-by":"crossref","unstructured":"Botvinick M, Ritter S, Wang JX, Kurth-Nelson Z, Blundell C, Hassabis D (2019) Reinforcement learning, fast and slow. Trends in cognitive sciences","DOI":"10.1016\/j.tics.2019.02.006"},{"issue":"2","key":"10351_CR12","doi-asserted-by":"crossref","first-page":"157","DOI":"10.1016\/j.bbr.2009.08.031","volume":"206","author":"DA Braun","year":"2010","unstructured":"Braun DA, Mehring C, Wolpert DM (2010) Structure learning in action. Behav Brain Res 206(2):157\u2013165","journal-title":"Behav Brain Res"},{"key":"10351_CR13","volume-title":"Metalearning: applications to data mining","author":"P Brazdil","year":"2008","unstructured":"Brazdil P, Carrier CG, Soares C, Vilalta R (2008) Metalearning: applications to data mining. 
Springer, Berlin"},{"key":"10351_CR14","unstructured":"Brockman G, Cheung V, Pettersson L, Schneider J, Schulman J, Tang J, Zaremba W (2016) Openai gym. arXiv preprint arXiv:1606.01540"},{"issue":"1","key":"10351_CR15","doi-asserted-by":"crossref","first-page":"41","DOI":"10.1023\/A:1007379606734","volume":"28","author":"R Caruana","year":"1997","unstructured":"Caruana R (1997) Multitask learning. Mach Learn 28(1):41\u201375","journal-title":"Mach Learn"},{"key":"10351_CR16","doi-asserted-by":"crossref","unstructured":"Damasio A (2009) Neuroscience and the emergence of neuroeconomics. In: Neuroeconomics. Elsevier, pp 207\u2013213","DOI":"10.1016\/B978-0-12-374176-9.00014-2"},{"key":"10351_CR17","unstructured":"Daniel Christian, Neumann Gerhard, Peters Jan (2012) Hierarchical relative entropy policy search. In: Artificial Intelligence and Statistics, pages 273\u2013281"},{"issue":"4","key":"10351_CR18","doi-asserted-by":"crossref","first-page":"599","DOI":"10.1111\/cogs.12101","volume":"38","author":"V Edward","year":"2014","unstructured":"Edward V, Noah G, Griffiths TL, Tenenbaum JB (2014) One and done? Optimal decisions from very few samples. Cognit Sci 38(4):599\u2013637","journal-title":"Cognit Sci"},{"key":"10351_CR19","unstructured":"Finn C, Abbeel P, Levine S (2017) Model-agnostic meta-learning for fast adaptation of deep networks. In: Proceedings of the 34th international conference on machine learning, vol 70, pp 1126\u20131135. JMLR. org"},{"key":"10351_CR20","unstructured":"Fox R, Pakman A, Tishby N (2016) Taming the noise in reinforcement learning via soft updates. In: Proceedings of the thirty-second conference on uncertainty in artificial intelligence, pp 202\u2013211"},{"key":"10351_CR21","unstructured":"Galashov A, Jayakumar SM, Hasenclever L, Tirumala D, Schwarz J, Desjardins G, Czarnecki WM, Teh YW, Pascanu R, Heess N (2019) Information asymmetry in KL-regularized RL. 
In: Proceedings of the international conference on representation learning"},{"issue":"8","key":"10351_CR22","doi-asserted-by":"crossref","first-page":"e1004369","DOI":"10.1371\/journal.pcbi.1004369","volume":"11","author":"T Genewein","year":"2015","unstructured":"Genewein T, Hez E, Razzaghpanah Z, Braun DA (2015) Structure learning in bayesian sensorimotor integration. PLoS Comput Biol 11(8):e1004369","journal-title":"PLoS Comput Biol"},{"key":"10351_CR23","doi-asserted-by":"crossref","first-page":"27","DOI":"10.3389\/frobt.2015.00027","volume":"2","author":"T Genewein","year":"2015","unstructured":"Genewein T, Leibfried F, Grau-Moya J, Braun DA (2015) Bounded rationality, abstraction, and hierarchical decision-making: an information-theoretic optimality principle. Front Robot AI 2:27","journal-title":"Front Robot AI"},{"issue":"6245","key":"10351_CR24","doi-asserted-by":"crossref","first-page":"273","DOI":"10.1126\/science.aac6076","volume":"349","author":"SJ Gershman","year":"2015","unstructured":"Gershman SJ, Horvitz EJ, Tenenbaum JB (2015) Computational rationality: a converging paradigm for intelligence in brains, minds, and machines. Science 349(6245):273\u2013278","journal-title":"Science"},{"key":"10351_CR25","unstructured":"Ghosh D, Singh A, Rajeswaran A, Kumar V, Levine S (2018) Divide-and-conquer reinforcement learning. In: Proceedings of the international conference on representation learning"},{"issue":"1","key":"10351_CR26","doi-asserted-by":"crossref","first-page":"107","DOI":"10.1111\/j.1756-8765.2008.01006.x","volume":"1","author":"G Gigerenzer","year":"2009","unstructured":"Gigerenzer G, Brighton H (2009) Homo heuristicus: why biased minds make better inferences. Top Cognit Sci 1(1):107\u2013143","journal-title":"Top Cognit Sci"},{"key":"10351_CR27","unstructured":"Giraud-Carrier C (2008) Metalearning-a tutorial. 
In: Tutorial at the 7th international conference on machine learning and applications (ICMLA), San Diego, California, USA"},{"key":"10351_CR28","doi-asserted-by":"crossref","unstructured":"Gottwald S, Braun DA (2019) Bounded rational decision-making from elementary computations that reduce uncertainty. Entropy 21(4)","DOI":"10.3390\/e21040375"},{"issue":"2","key":"10351_CR29","doi-asserted-by":"crossref","first-page":"440","DOI":"10.1162\/neco_a_01153","volume":"31","author":"S Gottwald","year":"2019","unstructured":"Gottwald S, Braun DA (2019) Systems of bounded rational agents with information-theoretic constraints. Neural Comput 31(2):440\u2013476","journal-title":"Neural Comput"},{"issue":"1","key":"10351_CR30","doi-asserted-by":"crossref","first-page":"1","DOI":"10.3390\/e20010001","volume":"20","author":"J Grau-Moya","year":"2017","unstructured":"Grau-Moya J, Kr\u00fcger M, Braun DA (2017) Non-equilibrium relations for bounded rational decision-making in changing environments. Entropy 20(1):1","journal-title":"Entropy"},{"key":"10351_CR31","doi-asserted-by":"crossref","unstructured":"Grau-Moya Jordi, Leibfried Felix, Genewein Tim, Braun Daniel\u00a0A (2016) Planning with information-processing constraints and model uncertainty in markov decision processes. In: Proceedings of the Joint European Conference on Machine Learning and Knowledge Discovery in Databases, pages 475\u2013491. Springer","DOI":"10.1007\/978-3-319-46227-1_30"},{"key":"10351_CR32","unstructured":"Grau-Moya J, Leibfried F, Vrancx P (2019) Soft q-learning with mutual-information regularization. In: Proceedings of the international conference on learning representations"},{"key":"10351_CR33","unstructured":"Grover A, Ermon S (2019) Uncertainty autoencoders: Learning compressed representations via variational information maximization. 
In: Proceedings of the 22nd international conference on artificial intelligence and statistics, pp 2514\u20132524"},{"key":"10351_CR34","unstructured":"Haarnoja T, Tang H, Abbeel P, Levine S (2017) Reinforcement learning with deep energy-based policies. In: Proceedings of the 34th international conference on machine learning, vol 70, pp 1352\u20131361. JMLR. org"},{"key":"10351_CR35","doi-asserted-by":"crossref","unstructured":"Hihn H, Gottwald S, Braun DA (2018) Bounded rational decision-making with adaptive neural network priors. In: IAPR workshop on artificial neural networks in pattern recognition. Springer, pp 213\u2013225","DOI":"10.1007\/978-3-319-99978-4_17"},{"key":"10351_CR36","doi-asserted-by":"crossref","unstructured":"Hihn H, Gottwald S, Braun DA (2019) An information-theoretic on-line learning principle for specialization in hierarchical decision-making systems. In: Proceedings of the 2019 IEEE conference on decision-making and control (CDC)","DOI":"10.1109\/CDC40024.2019.9029255"},{"key":"10351_CR37","unstructured":"Hutter F, Kotthoff L, Vanschoren J, Automated machine learning. Springer, Berlin"},{"key":"10351_CR38","unstructured":"Ioffe S, Szegedy C (2015) Batch normalization: accelerating deep network training by reducing internal covariate shift. arXiv preprint arXiv:1502.03167"},{"issue":"1","key":"10351_CR39","doi-asserted-by":"crossref","first-page":"79","DOI":"10.1162\/neco.1991.3.1.79","volume":"3","author":"RA Jacobs","year":"1991","unstructured":"Jacobs RA, Jordan MI, Nowlan SJ, Hinton GE (1991) Adaptive mixtures of local experts. Neural Comput 3(1):79\u201387","journal-title":"Neural Comput"},{"key":"10351_CR40","doi-asserted-by":"crossref","unstructured":"Jankowski N, Duch W, Grkabczewski K (2011) Meta-learning in computational intelligence, vol 358. Springer, Berlin","DOI":"10.1007\/978-3-642-20980-2"},{"key":"10351_CR41","unstructured":"Jaynes ET (1996) Probability theory: the logic of science. Washington Universityn St. 
Louis, MO"},{"issue":"3","key":"10351_CR42","doi-asserted-by":"crossref","first-page":"307","DOI":"10.1111\/j.1467-7687.2007.00585.x","volume":"10","author":"C Kemp","year":"2007","unstructured":"Kemp C, Perfors A, Tenenbaum JB (2007) Learning overhypotheses with hierarchical bayesian models. Dev Sci 10(3):307\u2013321","journal-title":"Dev Sci"},{"key":"10351_CR43","unstructured":"Kingma Diederik\u00a0P, Ba Jimmy (2014) Adam: A method for stochastic optimization. In: Proceedings of the International Conference on Representation Learning"},{"key":"10351_CR44","unstructured":"Kingma DP, Welling M (2013) Auto-encoding variational bayes. In: Proceedings of the international conference on representation learning"},{"key":"10351_CR45","unstructured":"Koch G, Zemel R, Salakhutdinov R (2015) Siamese neural networks for one-shot image recognition. In: ICML deep learning workshop, vol\u00a02"},{"key":"10351_CR46","unstructured":"Kuka\u010dka J, Golkov V, Cremers D (2017) Regularization for deep learning: a taxonomy. arXiv preprint arXiv:1710.10686"},{"key":"10351_CR47","doi-asserted-by":"crossref","DOI":"10.1002\/0471660264","volume-title":"Combining pattern classifiers: methods and algorithms","author":"LI Kuncheva","year":"2004","unstructured":"Kuncheva LI (2004) Combining pattern classifiers: methods and algorithms. Wiley, London"},{"key":"10351_CR48","unstructured":"Lake B, Salakhutdinov R, Gross J, Tenenbaum J (2011) One shot learning of simple visual concepts. In: Proceedings of the annual meeting of the cognitive science society, vol\u00a033"},{"issue":"6266","key":"10351_CR49","doi-asserted-by":"crossref","first-page":"1332","DOI":"10.1126\/science.aab3050","volume":"350","author":"BM Lake","year":"2015","unstructured":"Lake BM, Salakhutdinov R, Tenenbaum JB (2015) Human-level concept learning through probabilistic program induction. 
Science 350(6266):1332\u20131338","journal-title":"Science"},{"key":"10351_CR50","doi-asserted-by":"crossref","unstructured":"Lan L, Li Z, Guan X, Wang P (2019) Meta reinforcement learning with task embedding and shared policy. In: Proceedings of the international joint conference on artificial intelligence","DOI":"10.24963\/ijcai.2019\/387"},{"issue":"8","key":"10351_CR51","doi-asserted-by":"crossref","first-page":"1686","DOI":"10.1162\/NECO_a_00758","volume":"27","author":"F Leibfried","year":"2015","unstructured":"Leibfried F, Braun DA (2015) A reward-maximizing spiking neuron as a bounded rational decision maker. Neural Comput 27(8):1686\u20131720","journal-title":"Neural Comput"},{"issue":"1","key":"10351_CR52","doi-asserted-by":"crossref","first-page":"117","DOI":"10.1007\/s10462-013-9406-y","volume":"44","author":"C Lemke","year":"2015","unstructured":"Lemke C, Budka M, Gabrys B (2015) Metalearning: a survey of trends and technologies. Artif Intell Rev 44(1):117\u2013130","journal-title":"Artif Intell Rev"},{"key":"10351_CR53","doi-asserted-by":"crossref","unstructured":"Li S, Li W, Cook C, Zhu C, Gao Y (2018) Independently recurrent neural network (INDRNN): building a longer and deeper RNN. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 5457\u20135466","DOI":"10.1109\/CVPR.2018.00572"},{"key":"10351_CR54","doi-asserted-by":"crossref","first-page":"1230","DOI":"10.3389\/fnins.2019.01230","volume":"13","author":"Cecilia Lindig-Leon","year":"2019","unstructured":"Lindig-Leon Cecilia, Gottwald Sebastian, Braun Daniel\u00a0Alexander (2019) Analyzing abstraction and hierarchical decision-making in absolute identification by information-theoretic bounded rationality. 
Front Neurosci 13:1230","journal-title":"Front Neurosci"},{"issue":"9","key":"10351_CR55","doi-asserted-by":"crossref","first-page":"991","DOI":"10.1080\/13658810600830566","volume":"20","author":"SM Manson","year":"2006","unstructured":"Manson SM (2006) Bounded rationality in agent-based models: experiments with evolutionary programs. Int J Geogr Inf Sci 20(9):991\u20131012","journal-title":"Int J Geogr Inf Sci"},{"issue":"5","key":"10351_CR56","doi-asserted-by":"crossref","first-page":"e63400","DOI":"10.1371\/journal.pone.0063400","volume":"8","author":"G Martius","year":"2013","unstructured":"Martius G, Der R, Ay N (2013) Information driven self-organization of complex robotic behaviors. PloS one 8(5):e63400","journal-title":"PloS one"},{"key":"10351_CR57","doi-asserted-by":"crossref","unstructured":"McAllester DA (1999) Pac-bayesian model averaging. In: Proceedings of the twelfth annual conference on Computational learning theory, pp 164\u2013170","DOI":"10.1145\/307400.307435"},{"issue":"1","key":"10351_CR58","doi-asserted-by":"crossref","first-page":"5","DOI":"10.1023\/A:1021840411064","volume":"51","author":"DA McAllester","year":"2003","unstructured":"McAllester DA (2003) Pac-bayesian stochastic model selection. Mach Learn 51(1):5\u201321","journal-title":"Mach Learn"},{"issue":"1","key":"10351_CR59","doi-asserted-by":"crossref","first-page":"6","DOI":"10.1006\/game.1995.1023","volume":"10","author":"RD McKelvey","year":"1995","unstructured":"McKelvey RD, Palfrey TR (1995) Quantal response equilibria for normal form games. Games Econ Behav 10(1):6\u201338","journal-title":"Games Econ Behav"},{"key":"10351_CR60","unstructured":"M\u00fcller R, Kornblith S, Hinton GE (2019) When does label smoothing help? 
In: Advances in neural information processing systems, pp 4694\u20134703"},{"key":"10351_CR61","unstructured":"Nagabandi A, Clavera I, Liu S, Fearing RS, Abbeel P, Levine S, Finn C (2018) Learning to adapt in dynamic, real-world environments through meta-reinforcement learning. In: International conference on learning representations"},{"key":"10351_CR62","unstructured":"Neumann G, Daniel C, Kupcsik A, Deisenroth M, Peters J (2013) Information-theoretic motor skill learning. In: Proceedings of the AAAI workshop on intelligent robotic systems"},{"key":"10351_CR63","first-page":"269","volume":"6830","author":"P Ortega","year":"2011","unstructured":"Ortega P, Braun D (2011) Information, utility and bounded rationality. Lect Notes Artif Intell 6830:269\u2013274","journal-title":"Lect Notes Artif Intell"},{"key":"10351_CR64","doi-asserted-by":"crossref","unstructured":"Ortega PA, Braun DA (2013) Thermodynamics as a theory of decision-making with information-processing costs. Proc R Soc Lond A: Math Phys Eng Sci 469(2153)","DOI":"10.1098\/rspa.2012.0683"},{"key":"10351_CR65","unstructured":"Ortega PA, Wang JX, Rowland M, Genewein T, Kurth-Nelson Z, Pascanu R, Heess N, Veness J, Pritzel A, Sprechmann P et al (2019) Meta-learning of sequential strategies. arXiv preprint arXiv:1905.03030"},{"key":"10351_CR66","doi-asserted-by":"crossref","DOI":"10.1017\/CBO9781139173933","volume-title":"The adaptive decision maker","author":"JW Payne","year":"1993","unstructured":"Payne JW, Payne JW, Bettman JR, Johnson EJ (1993) The adaptive decision maker. Cambridge University Press, Cambridge"},{"key":"10351_CR67","first-page":"2825","volume":"12","author":"F Pedregosa","year":"2011","unstructured":"Pedregosa F, Varoquaux G, Gramfort A, Michel V, Thirion B, Grisel O, Blondel M, Prettenhofer P, Weiss R, Dubourg V, Vanderplas J, Passos A, Cournapeau D, Brucher M, Perrot M, Duchesnay E (2011) Scikit-learn: machine learning in python. 
J Mach Learn Res 12:2825\u20132830","journal-title":"J Mach Learn Res"},{"key":"10351_CR68","doi-asserted-by":"crossref","unstructured":"Peng Z, Genewein T, Leibfried F, Braun DA (2017) An information-theoretic on-line update principle for perception-action coupling. In: Proceedings of the 2017 IEEE\/RSJ international conference on intelligent robots and systems (IROS). IEEE, pp 789\u2013796","DOI":"10.1109\/IROS.2017.8202240"},{"key":"10351_CR69","unstructured":"Pereyra G, Tucker G, Chorowski J, Kaiser \u0141, Hinton G (2017) Regularizing neural networks by penalizing confident output distributions. In: Proceedings of the international conference on learning representations (ICLR) 2017"},{"key":"10351_CR70","unstructured":"Randl\u00f8v J, Barto AG, Rosenstein MT (2000) Combining reinforcement learning with a local control algorithm. In: Proceedings of the international conference on machine learning"},{"key":"10351_CR71","unstructured":"Ravi S, Larochelle H (2017) Optimization as a model for few-shot learning. In: Proceedings of the international conference on learning representations"},{"key":"10351_CR72","unstructured":"Rothfuss J, Lee D, Clavera I, Asfour T, Abbeel P (2018) Promp: proximal meta-policy search. In: International conference on learning representations"},{"key":"10351_CR73","doi-asserted-by":"crossref","unstructured":"Schach S, Gottwald S, Braun DA (2018) Quantifying motor task performance by bounded rational decision theory. Front Neurosci, 12","DOI":"10.3389\/fnins.2018.00932"},{"issue":"1","key":"10351_CR74","doi-asserted-by":"crossref","first-page":"105","DOI":"10.1023\/A:1007383707642","volume":"28","author":"J Schmidhuber","year":"1997","unstructured":"Schmidhuber J, Zhao J, Wiering M (1997) Shifting inductive bias with success-story algorithm, adaptive levin search, and incremental self-improvement. 
Mach Learn 28(1):105\u2013130","journal-title":"Mach Learn"},{"key":"10351_CR75","unstructured":"Schulman J, Levine S, Abbeel P, Jordan M, Moritz P (2015) Trust region policy optimization. In: Proceedings of the international conference on machine learning, pp 1889\u20131897"},{"issue":"4\u20135","key":"10351_CR76","doi-asserted-by":"crossref","first-page":"439","DOI":"10.1016\/S0893-6080(01)00027-2","volume":"14","author":"F Schwenker","year":"2001","unstructured":"Schwenker F, Kestler HA, Palm G (2001) Three learning phases for radial-basis-function networks. Neural Netw 14(4\u20135):439\u2013458","journal-title":"Neural Netw"},{"key":"10351_CR77","doi-asserted-by":"crossref","DOI":"10.1201\/9781315140919","volume-title":"Density estimation for statistics and data analysis","author":"BW Silverman","year":"2018","unstructured":"Silverman BW (2018) Density estimation for statistics and data analysis. Routledge, London"},{"issue":"1","key":"10351_CR78","doi-asserted-by":"crossref","first-page":"99","DOI":"10.2307\/1884852","volume":"69","author":"HA Simon","year":"1955","unstructured":"Simon HA (1955) A behavioral model of rational choice. Q J Econ 69(1):99\u2013118","journal-title":"Q J Econ"},{"issue":"1","key":"10351_CR79","first-page":"1929","volume":"15","author":"N Srivastava","year":"2014","unstructured":"Srivastava N, Hinton G, Krizhevsky A, Sutskever I, Salakhutdinov R (2014) Dropout: a simple way to prevent neural networks from overfitting. J Mach Learn Res 15(1):1929\u20131958","journal-title":"J Mach Learn Res"},{"key":"10351_CR80","unstructured":"Sutton RS (1996) Generalization in reinforcement learning: successful examples using sparse coarse coding. In: Advances in neural information processing systems, pp 1038\u20131044"},{"key":"10351_CR81","volume-title":"Reinforcement learning: an introduction","author":"RS Sutton","year":"2018","unstructured":"Sutton RS, Barto AG (2018) Reinforcement learning: an introduction. 
MIT Press, Cambridge"},{"key":"10351_CR82","unstructured":"Sutton RS, McAllester DA, Singh SP, Mansour Y (2000) Policy gradient methods for reinforcement learning with function approximation. In: Advances in neural information processing systems, pp 1057\u20131063"},{"key":"10351_CR83","doi-asserted-by":"crossref","unstructured":"Szegedy C, Vanhoucke V, Ioffe S, Shlens J, Wojna Z (2016) Rethinking the inception architecture for computer vision. In: 2016 IEEE conference on computer vision and pattern recognition (CVPR), pp 2818\u20132826","DOI":"10.1109\/CVPR.2016.308"},{"key":"10351_CR84","volume-title":"Learning to learn","author":"S Thrun","year":"2012","unstructured":"Thrun S, Pratt L (2012) Learning to learn. Springer, Berlin"},{"key":"10351_CR85","doi-asserted-by":"crossref","unstructured":"Tishby N, Polani D (2011) Information theory of decisions and actions. In: Perception-action cycle: models architectures, and hardware. Springer, Berlin","DOI":"10.1007\/978-1-4419-1452-1_19"},{"key":"10351_CR86","unstructured":"Tschannen M, Djolonga J, Rubenstein PK, Gelly S, Lucic M (2020) On mutual information maximization for representation learning. In: Proceedings of the international conference on representation learning"},{"key":"10351_CR87","unstructured":"van Hasselt HP, Guez A, Hessel M, Mnih V, Silver D (2016) Learning values across many orders of magnitude. In: Advances in neural information processing systems, pp 4287\u20134295"},{"issue":"2","key":"10351_CR88","doi-asserted-by":"crossref","first-page":"77","DOI":"10.1023\/A:1019956318069","volume":"18","author":"R Vilalta","year":"2002","unstructured":"Vilalta R, Drissi Y (2002) A perspective view and survey of meta-learning. Artif Intell Rev 18(2):77\u201395","journal-title":"Artif Intell Rev"},{"key":"10351_CR89","doi-asserted-by":"crossref","unstructured":"Vincent P, Larochelle H, Bengio Y, Manzagol P-A (2008) Extracting and composing robust features with denoising autoencoders. 
In: Proceedings of the 25th international conference on machine learning, pp 1096\u20131103. ACM","DOI":"10.1145\/1390156.1390294"},{"key":"10351_CR90","unstructured":"Vinyals O, Blundell C, Lillicrap T, Wierstra D et\u00a0al (2016) Matching networks for one shot learning. In: Advances in neural information processing systems, pp 3630\u20133638"},{"key":"10351_CR91","volume-title":"Theory of games and economic behavior (commemorative edition)","author":"J Von Neumann","year":"2007","unstructured":"Von Neumann J, Morgenstern O (2007) Theory of games and economic behavior (commemorative edition). Princeton University Press, Princeton"},{"key":"10351_CR92","doi-asserted-by":"crossref","unstructured":"Wolpert DH (2006) Information theory\u2014the bridge connecting bounded rational game theory and statistical physics. In: Complex engineered systems. Springer, Berlin, pp 262\u2013290","DOI":"10.1007\/3-540-32834-3_12"},{"key":"10351_CR93","doi-asserted-by":"crossref","DOI":"10.1002\/9780470382776","volume-title":"Clustering","author":"R Xu","year":"2008","unstructured":"Xu R, Wunsch D (2008) Clustering, vol 10. Wiley, London"},{"key":"10351_CR94","unstructured":"Yao H, Wei Y, Huang J, Li Z (2019) Hierarchically structured meta-learning. In: Proceedings of the international conference on machine learning, pp 7045\u20137054"},{"issue":"2","key":"10351_CR95","doi-asserted-by":"crossref","first-page":"67","DOI":"10.1007\/s10015-004-0340-6","volume":"9","author":"J Yoshimoto","year":"2005","unstructured":"Yoshimoto J, Nishimura M, Tokita Y, Ishii S (2005) Acrobot control by learning the switching of multiple controllers. Artif Life Robot 9(2):67\u201371","journal-title":"Artif Life Robot"},{"issue":"8","key":"10351_CR96","doi-asserted-by":"crossref","first-page":"1177","DOI":"10.1109\/TNNLS.2012.2200299","volume":"23","author":"SE Yuksel","year":"2012","unstructured":"Yuksel SE, Wilson JN, Gader PD (2012) Twenty years of mixture of experts. 
IEEE Trans Neural Netw Learn Syst 23(8):1177\u20131193","journal-title":"IEEE Trans Neural Netw Learn Syst"}],"container-title":["Neural Processing Letters"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s11063-020-10351-3.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s11063-020-10351-3\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s11063-020-10351-3.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2021,9,30]],"date-time":"2021-09-30T02:46:17Z","timestamp":1632969977000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s11063-020-10351-3"}},"subtitle":["A Unified Information-theoretic Approach for Supervised, Unsupervised and Reinforcement Learning"],"short-title":[],"issued":{"date-parts":[[2020,9,30]]},"references-count":96,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2020,12]]}},"alternative-id":["10351"],"URL":"https:\/\/doi.org\/10.1007\/s11063-020-10351-3","relation":{},"ISSN":["1370-4621","1573-773X"],"issn-type":[{"value":"1370-4621","type":"print"},{"value":"1573-773X","type":"electronic"}],"subject":[],"published":{"date-parts":[[2020,9,30]]},"assertion":[{"value":"5 September 2020","order":1,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"30 September 2020","order":2,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Compliance with ethical standards"}},{"value":"The authors declare no conflicts of 
interest.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Conflict of interest"}}]}}