{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,16]],"date-time":"2026-02-16T19:37:20Z","timestamp":1771270640087,"version":"3.50.1"},"publisher-location":"New York, NY, USA","reference-count":9,"publisher":"ACM","license":[{"start":{"date-parts":[[2021,7,21]],"date-time":"2021-07-21T00:00:00Z","timestamp":1626825600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2021,7,21]]},"DOI":"10.1145\/3461702.3462473","type":"proceedings-article","created":{"date-parts":[[2021,7,31]],"date-time":"2021-07-31T01:21:32Z","timestamp":1627694492000},"page":"275-276","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":3,"title":["Training for Implicit Norms in Deep Reinforcement Learning Agents through Adversarial Multi-Objective Reward Optimization"],"prefix":"10.1145","author":[{"given":"Markus","family":"Peschl","sequence":"first","affiliation":[{"name":"Delft University of Technology, Delft, Netherlands"}]}],"member":"320","published-online":{"date-parts":[[2021,7,30]]},"reference":[{"key":"e_1_3_2_1_1_1","volume-title":"AAAI Workshop: AI, Ethics, and Society.","author":"Abel David","year":"2016","unstructured":"David Abel, James MacGlashan, and Michael Littman. 2016. Reinforcement Learning as a Framework for Ethical Decision Making. In AAAI Workshop: AI, Ethics, and Society."},{"key":"e_1_3_2_1_2_1","volume-title":"Reinforcement Learning Under Moral Uncertainty. arxiv","author":"Ecoffet Adrien","year":"2020","unstructured":"Adrien Ecoffet and Joel Lehman. 2020. 
Reinforcement Learning Under Moral Uncertainty. arxiv: 2006.04734"},{"key":"e_1_3_2_1_3_1","unstructured":"Justin Fu, Katie Luo, and Sergey Levine. 2017. Learning Robust Rewards with Adversarial Inverse Reinforcement Learning. arxiv: 1710.11248"},{"key":"e_1_3_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1147\/JRD.2019.2940428"},{"key":"e_1_3_2_1_5_1","volume-title":"Proceedings of the International Joint Conference AAMAS","volume":"3","author":"Saunders William","year":"2018","unstructured":"William Saunders, Andreas Stuhlm\u00fcller, Girish Sastry, and Owain Evans. 2018. Trial without error: Towards safe reinforcement learning via human intervention. In Proceedings of the International Joint Conference AAMAS, Vol. 3. 2067--2069. arxiv: 1707.05173"},{"key":"e_1_3_2_1_6_1","unstructured":"John Schulman, Filip Wolski, Prafulla Dhariwal, Alec Radford, and Oleg Klimov. 2017. Proximal Policy Optimization Algorithms. 
arxiv: 1707.06347"},{"key":"e_1_3_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1613\/jair.1.12360"},{"key":"e_1_3_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v32i1.11498"},{"key":"e_1_3_2_1_9_1","volume-title":"Proceedings of the National Conference on Artificial Intelligence","volume":"3","author":"Ziebart Brian D","year":"2008","unstructured":"Brian D Ziebart, Andrew Maas, J Andrew Bagnell, and Anind K Dey. 2008. Maximum entropy inverse reinforcement learning. In Proceedings of the National Conference on Artificial Intelligence, Vol. 3. 1433--1438."}],"event":{"name":"AIES '21: AAAI\/ACM Conference on AI, Ethics, and Society","location":"Virtual Event USA","acronym":"AIES '21","sponsor":["SIGAI ACM Special Interest Group on Artificial Intelligence","AAAI"]},"container-title":["Proceedings of the 2021 AAAI\/ACM Conference on AI, Ethics, and 
Society"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3461702.3462473","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3461702.3462473","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T20:49:05Z","timestamp":1750193345000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3461702.3462473"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,7,21]]},"references-count":9,"alternative-id":["10.1145\/3461702.3462473","10.1145\/3461702"],"URL":"https:\/\/doi.org\/10.1145\/3461702.3462473","relation":{},"subject":[],"published":{"date-parts":[[2021,7,21]]},"assertion":[{"value":"2021-07-30","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}