{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,9,19]],"date-time":"2025-09-19T09:38:47Z","timestamp":1758274727782,"version":"3.37.3"},"reference-count":42,"publisher":"Springer Science and Business Media LLC","issue":"12","license":[{"start":{"date-parts":[[2022,11,5]],"date-time":"2022-11-05T00:00:00Z","timestamp":1667606400000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/www.springernature.com\/gp\/researchers\/text-and-data-mining"},{"start":{"date-parts":[[2022,11,5]],"date-time":"2022-11-05T00:00:00Z","timestamp":1667606400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.springernature.com\/gp\/researchers\/text-and-data-mining"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Appl Intell"],"published-print":{"date-parts":[[2023,6]]},"DOI":"10.1007\/s10489-022-04227-3","type":"journal-article","created":{"date-parts":[[2022,11,5]],"date-time":"2022-11-05T07:04:51Z","timestamp":1667631891000},"page":"14903-14917","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":7,"title":["Learning more with the same effort: how randomization improves the robustness of a robotic deep reinforcement learning agent"],"prefix":"10.1007","volume":"53","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-2426-3748","authenticated-orcid":false,"given":"Luc\u00eda","family":"G\u00fcitta-L\u00f3pez","sequence":"first","affiliation":[]},{"given":"Jaime","family":"Boal","sequence":"additional","affiliation":[]},{"given":"\u00c1lvaro J.","family":"L\u00f3pez-L\u00f3pez","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2022,11,5]]},"reference":[{"key":"4227_CR1","unstructured":"Sutton RS, Barto AG (2018) Reinforcement learning: an introduction. 2nd edn The MIT Press"},{"key":"4227_CR2","unstructured":"Mahmood AR, Korenkevych D, Vasan G, Ma W, Bergstra J (2018) Benchmarking reinforcement learning algorithms on real-world robots. In: Proc 2nd conf Robot learning, vol 87, pp 561\u2013591"},{"issue":"3\u20134","key":"4227_CR3","doi-asserted-by":"publisher","first-page":"219","DOI":"10.1561\/2200000071","volume":"11","author":"V Fran\u00e7ois-Lavet","year":"2018","unstructured":"Fran\u00e7ois-Lavet V, Henderson P, Islam R, Bellemare MG, Pineau J (2018) An introduction to deep reinforcement learning. Found Trends Mach Learn 11(3\u20134):219\u2013354. https:\/\/doi.org\/10.1561\/2200000071https:\/\/doi.org\/10.1561\/2200000071","journal-title":"Found Trends Mach Learn"},{"key":"4227_CR4","unstructured":"Rusu AA, Ve\u010der\u00edk M, Roth\u00f6rl T, Heess N, Pascanu R, Hadsell R (2017) Sim-to-real robot learning from pixels with progressive nets. In: 1St conf. robot learning"},{"key":"4227_CR5","doi-asserted-by":"crossref","unstructured":"Tobin J, Fong R, Ray A, Schneider J, Zaremba W, Abbeel P (2017) Domain randomization for transferring deep neural networks from simulation to the real world. In: Proc. IEEE\/RSJ int conf intelligent robots and systems, pp 23\u201330","DOI":"10.1109\/IROS.2017.8202133"},{"key":"4227_CR6","doi-asserted-by":"crossref","unstructured":"Bellman R (1957) A Markovian decision process. 
Journal of Mathematics and Mechanics, pp 679\u2013684","DOI":"10.1512\/iumj.1957.6.56038"},{"issue":"1","key":"4227_CR7","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1287\/mnsc.28.1.1","volume":"28","author":"GE Monahan","year":"1982","unstructured":"Monahan GE (1982) Survey of partially observable Markov decision processes - Theory, models and algortihms. Manag Sci 28(1):1\u201316. https:\/\/doi.org\/10.1287\/mnsc.28.1.1","journal-title":"Manag Sci"},{"key":"4227_CR8","doi-asserted-by":"publisher","first-page":"176598","DOI":"10.1109\/ACCESS.2020.3027152","volume":"8","author":"MD Al-Masrur Khan","year":"2020","unstructured":"Al-Masrur Khan MD, Khan MRJ, Tooshil A, Sikder N, Parvez Mahmud MA, Kouzani AZ, Nahid AA (2020) A systematic review on reinforcement learning-based robotics within the last decade. IEEE Access 8:176598\u2013176623. https:\/\/doi.org\/10.1109\/ACCESS.2020.3027152","journal-title":"IEEE Access"},{"key":"4227_CR9","unstructured":"Mnih V, Puigdom\u00e8nech Badia A, Mirza M, Graves A, Harley T, Lillicrap TP, Silver D, Kavukcuoglu K (2016) Asynchronous methods for deep reinforcement learning. In: Proc 33rd int conf machine learning, vol 48, pp 1928\u20131937"},{"key":"4227_CR10","unstructured":"Gu Z, Jia Z, Choset H (2018) Adversary A3C for robust reinforcement learning. In: Int conf learning representations"},{"key":"4227_CR11","doi-asserted-by":"publisher","first-page":"1291","DOI":"10.1109\/TSMCC.2012.2218595","volume":"42","author":"I Grondman","year":"2012","unstructured":"Grondman I, Busoniu L, Lopes GAD, Babu\u0161ka R (2012) A survey of actor-critic reinforcement learning: standard and natural policy gradients. IEEE Trans Syst Man Cybern Part C: Appl Rev 42:1291\u20131307. https:\/\/doi.org\/10.1109\/TSMCC.2012.2218595","journal-title":"IEEE Trans Syst Man Cybern Part C: Appl Rev"},{"key":"4227_CR12","unstructured":"Babaeizadeh M, Frosio I, Tyree S, Clemons J, Kautz J (2017) Reinforcement learning through asynchronous advantage actor-critic on a GPU. In: Int conf learning representations"},{"key":"4227_CR13","doi-asserted-by":"publisher","first-page":"1421","DOI":"10.1613\/jair.1.12412","volume":"69","author":"A Lazaridis","year":"2020","unstructured":"Lazaridis A (2020) Deep reinforcement learning: a state-of-the-art walkthrough. J Artif Intell Res 69:1421\u20131471","journal-title":"J Artif Intell Res"},{"issue":"6","key":"4227_CR14","doi-asserted-by":"publisher","first-page":"26","DOI":"10.1109\/MSP.2017.2743240","volume":"34","author":"K Arulkumaran","year":"2017","unstructured":"Arulkumaran K, Deisenroth MP, Brundage M, Bharath AA (2017) Deep reinforcement learning: a brief survey. IEEE Signal Process Mag 34(6):26\u201338. https:\/\/doi.org\/10.1109\/MSP.2017.2743240","journal-title":"IEEE Signal Process Mag"},{"issue":"7553","key":"4227_CR15","doi-asserted-by":"publisher","first-page":"436","DOI":"10.1038\/nature14539","volume":"521","author":"Y Lecun","year":"2015","unstructured":"Lecun Y, Bengio Y, Hinton G (2015) Deep learning. Nature 521(7553):436\u2013444. https:\/\/doi.org\/10.1038\/nature14539","journal-title":"Nature"},{"key":"4227_CR16","doi-asserted-by":"publisher","unstructured":"Goodfellow I, Bengio Y, Courville A (2016) Deep learning MIT press. 
{"key":"4227_CR16","doi-asserted-by":"publisher","unstructured":"Goodfellow I, Bengio Y, Courville A (2016) Deep learning. MIT Press. https:\/\/doi.org\/10.5555\/3086952. https:\/\/www.deeplearningbook.org\/. ISBN (paper version): 978-0262035613","DOI":"10.5555\/3086952"},{"issue":"11","key":"4227_CR17","doi-asserted-by":"publisher","first-page":"1238","DOI":"10.1177\/0278364913495721","volume":"32","author":"J Kober","year":"2013","unstructured":"Kober J, Bagnell JA, Peters J (2013) Reinforcement learning in robotics: a survey. Int J Robot Res 32(11):1238\u20131274. https:\/\/doi.org\/10.1177\/0278364913495721","journal-title":"Int J Robot Res"},{"key":"4227_CR18","doi-asserted-by":"publisher","first-page":"99","DOI":"10.1016\/j.ins.2021.10.070","volume":"583","author":"IA Zamfirache","year":"2022","unstructured":"Zamfirache IA, Precup RE, Roman RC, Petriu EM (2022) Reinforcement learning-based control using Q-learning and gravitational search algorithm with experimental validation on a nonlinear servo system. Inf Sci 583:99\u2013120. https:\/\/doi.org\/10.1016\/j.ins.2021.10.070","journal-title":"Inf Sci"},{"key":"4227_CR19","doi-asserted-by":"publisher","first-page":"945","DOI":"10.1007\/s10462-021-09997-9","volume":"55","author":"B Singh","year":"2022","unstructured":"Singh B, Kumar R, Singh VP (2022) Reinforcement learning in robotic applications: a comprehensive survey. Artif Intell Rev 55:945\u2013990. https:\/\/doi.org\/10.1007\/s10462-021-09997-9","journal-title":"Artif Intell Rev"},{"key":"4227_CR20","doi-asserted-by":"publisher","unstructured":"Zhao W, Queralta JP, Westerlund T (2020) Sim-to-real transfer in deep reinforcement learning for robotics: a survey. In: IEEE symposium series on computational intelligence, pp 737\u2013744. https:\/\/doi.org\/10.1109\/SSCI47803.2020.9308468","DOI":"10.1109\/SSCI47803.2020.9308468"},{"key":"4227_CR21","unstructured":"Chen X, Hu J, Jin C, Li L, Wang L (2022) Understanding domain randomization for sim-to-real transfer. In: Int conf learning representations"},{"key":"4227_CR22","unstructured":"Rusu AA, Colmenarejo SG, Gulcehre C, Desjardins G, Kirkpatrick J, Pascanu R, Mnih V, Kavukcuoglu K, Hadsell R (2016) Policy distillation. In: Proc int conf learning representations"},{"key":"4227_CR23","unstructured":"Wang JX, Kurth-Nelson Z, Soyer H, Leibo JZ, Tirumala D, Munos R, Blundell C, Kumaran D, Botvinick MM (2017) Learning to reinforcement learn. In: CogSci. https:\/\/www.deepmind.com\/publications\/learning-to-reinforcement-learn"},{"issue":"2","key":"4227_CR24","doi-asserted-by":"publisher","first-page":"335","DOI":"10.1162\/0899766053011528","volume":"17","author":"J Morimoto","year":"2005","unstructured":"Morimoto J, Doya K (2005) Robust reinforcement learning. Neural Comput 17(2):335\u2013359. https:\/\/doi.org\/10.1162\/0899766053011528","journal-title":"Neural Comput"},{"key":"4227_CR25","doi-asserted-by":"publisher","unstructured":"Hussein A, Gaber MM, Elyan E, Jayne C (2017) Imitation learning: a survey of learning methods. ACM Computing Surveys 50(2). https:\/\/doi.org\/10.1145\/3054912","DOI":"10.1145\/3054912"},{"key":"4227_CR26","doi-asserted-by":"publisher","unstructured":"Zhu Y, Wang Z, Merel J, Rusu A, Erez T, Cabi S, Tunyasuvunakool S, Kram\u00e1r J, Hadsell R, de Freitas N, Heess N (2018) Reinforcement and imitation learning for diverse visuomotor skills. In: Proceedings of robotics: science and systems. https:\/\/doi.org\/10.15607\/RSS.2018.XIV.009","DOI":"10.15607\/RSS.2018.XIV.009"},{"key":"4227_CR27","unstructured":"Traor\u00e9 R, Caselles-Dupr\u00e9 H, Lesort T, Sun T, D\u00edaz-Rodr\u00edguez N, Filliat D (2019) Continual reinforcement learning deployed in real-life using policy distillation and sim2real transfer. In: Proc int conf machine learning"},{"key":"4227_CR28","doi-asserted-by":"crossref","unstructured":"Arndt K, Hazara M, Ghadirzadeh A, Kyrki V (2020) Meta reinforcement learning for sim-to-real domain adaptation. In: IEEE int conf robotics and automation","DOI":"10.1109\/ICRA40945.2020.9196540"},{"key":"4227_CR29","unstructured":"Higgins I, Pal A, Rusu A, Matthey L, Burgess C, Pritzel A, Botvinick M, Blundell C, Lerchner A (2017) DARLA: improving zero-shot transfer in reinforcement learning. In: Proc 34th int conf machine learning, vol 70, pp 1480\u20131490"},{"key":"4227_CR30","doi-asserted-by":"publisher","unstructured":"Shoeleh F, Asadpour M (2020) Skill based transfer learning with domain adaptation for continuous reinforcement learning domains. Applied Intelligence 50. https:\/\/doi.org\/10.1007\/s10489-019-01527-z","DOI":"10.1007\/s10489-019-01527-z"},{"key":"4227_CR31","doi-asserted-by":"publisher","unstructured":"Bousmalis K, Irpan A, Wohlhart P, Bai Y, Kelcey M, Kalakrishnan M, Downs L, Ibarz J, Pastor P, Konolige K, Levine S, Vanhoucke V (2018) Using simulation and domain adaptation to improve efficiency of deep robotic grasping. In: 2018 IEEE International Conference on Robotics and Automation (ICRA), pp 4243\u20134250. https:\/\/doi.org\/10.1109\/ICRA.2018.8460875","DOI":"10.1109\/ICRA.2018.8460875"},{"key":"4227_CR32","unstructured":"Goodfellow IJ, Pouget-Abadie J, Mirza M, Xu B, Warde-Farley D, Ozair S, Courville A, Bengio Y (2014) Generative adversarial nets. In: Proc 27th int conf neural information processing systems, vol 27"},{"key":"4227_CR33","doi-asserted-by":"publisher","unstructured":"Shrivastava A, Pfister T, Tuzel O, Susskind J, Wang W, Webb R (2017) Learning from simulated and unsupervised images through adversarial training. In: Proc IEEE conf computer vision and pattern recognition, pp 2242\u20132251. https:\/\/doi.org\/10.1109\/CVPR.2017.241","DOI":"10.1109\/CVPR.2017.241"},{"key":"4227_CR34","doi-asserted-by":"publisher","unstructured":"Zhu J-Y, Park T, Isola P, Efros AA (2017) Unpaired image-to-image translation using cycle-consistent adversarial networks. In: IEEE int conf computer vision, pp 2242\u20132251. https:\/\/doi.org\/10.1109\/ICCV.2017.244","DOI":"10.1109\/ICCV.2017.244"},{"key":"4227_CR35","doi-asserted-by":"publisher","unstructured":"Rusu AA, Rabinowitz NC, Desjardins G, Soyer H, Kirkpatrick J, Kavukcuoglu K, Pascanu R, Hadsell R (2016) Progressive neural networks. Computing Research Repository (CoRR). https:\/\/doi.org\/10.48550\/arXiv.1606.04671","DOI":"10.48550\/arXiv.1606.04671"},{"key":"4227_CR36","unstructured":"Simonyan K, Zisserman A (2015) Very deep convolutional networks for large-scale image recognition. In: 3rd int conf learning representations"},
{"key":"4227_CR37","doi-asserted-by":"publisher","unstructured":"James S, Wohlhart P, Kalakrishnan M, Kalashnikov D, Irpan A, Ibarz J, Levine S, Hadsell R, Bousmalis K (2019) Sim-to-real via sim-to-sim: data-efficient robotic grasping via randomized-to-canonical adaptation networks. In: Proc IEEE\/CVF conf computer vision and pattern recognition, pp 12619\u201312629. https:\/\/doi.org\/10.1109\/CVPR.2019.01291","DOI":"10.1109\/CVPR.2019.01291"},{"key":"4227_CR38","doi-asserted-by":"publisher","unstructured":"Mozifian M, Zhang A, Pineau J, Meger D (2020) Intervention design for effective sim2real transfer. Computing Research Repository (CoRR). https:\/\/doi.org\/10.48550\/arXiv.2012.02055","DOI":"10.48550\/arXiv.2012.02055"},{"key":"4227_CR39","unstructured":"Chan SCY, Fishman S, Canny J, Korattikara A, Guadarrama S (2020) Measuring the reliability of reinforcement learning algorithms. In: Int conf learning representations"},{"key":"4227_CR40","unstructured":"Jordan SM, Chandak Y, Cohen D, Zhang M, Thomas PS (2020) Evaluating the performance of reinforcement learning algorithms. In: Proc 37th int conf machine learning"},{"key":"4227_CR41","doi-asserted-by":"publisher","unstructured":"Todorov E, Erez T, Tassa Y (2012) MuJoCo: a physics engine for model-based control. In: IEEE int conf intelligent robots and systems, pp 5026\u20135033. https:\/\/doi.org\/10.1109\/IROS.2012.6386109","DOI":"10.1109\/IROS.2012.6386109"},{"key":"4227_CR42","unstructured":"Stevens E, Antiga L, Viehmann T (2020) Deep learning with PyTorch"}],"container-title":["Applied Intelligence"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s10489-022-04227-3.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s10489-022-04227-3\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s10489-022-04227-3.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,6,1]],"date-time":"2023-06-01T03:49:34Z","timestamp":1685591374000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s10489-022-04227-3"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,11,5]]},"references-count":42,"journal-issue":{"issue":"12","published-print":{"date-parts":[[2023,6]]}},"alternative-id":["4227"],"URL":"https:\/\/doi.org\/10.1007\/s10489-022-04227-3","relation":{},"ISSN":["0924-669X","1573-7497"],"issn-type":[{"type":"print","value":"0924-669X"},{"type":"electronic","value":"1573-7497"}],"subject":[],"published":{"date-parts":[[2022,11,5]]},"assertion":[{"value":"27 September 2022","order":1,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"5 November 2022","order":2,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"Not applicable","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Consent for Publication"}},{"value":"All authors certify that they have no affiliations with or involvement in any organization or entity with any financial interest or non-financial interest in the subject matter or materials discussed in this manuscript.","order":3,"name":"Ethics","group":{"name":"EthicsHeading","label":"Conflict of Interests"}}]}}
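
The record above is the standard envelope returned by Crossref's public REST API (https://api.crossref.org/works/{DOI}). For readers who want to retrieve or parse it programmatically, here is a minimal Python sketch using only the standard library; the field names follow the JSON above. Crossref asks polite clients to identify themselves via a mailto query parameter or User-Agent header, which is omitted here for brevity.

# Minimal sketch, not part of the record: fetch and inspect this Crossref work.
import json
import urllib.request

doi = "10.1007/s10489-022-04227-3"  # the DOI field of the record above
with urllib.request.urlopen("https://api.crossref.org/works/" + doi) as resp:
    envelope = json.load(resp)

# The envelope wraps the work record, as in the JSON above.
assert envelope["status"] == "ok" and envelope["message-type"] == "work"
work = envelope["message"]

# A few fields, named exactly as they appear in the record.
print(work["title"][0])
print("; ".join(a["given"] + " " + a["family"] for a in work.get("author", [])))
print(work["container-title"][0], work.get("volume"), work.get("page"))
print("cited by:", work.get("is-referenced-by-count"), "| references:", work.get("references-count"))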