{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,13]],"date-time":"2026-03-13T02:22:25Z","timestamp":1773368545215,"version":"3.50.1"},"reference-count":40,"publisher":"Springer Science and Business Media LLC","issue":"4","license":[{"start":{"date-parts":[[2022,9,29]],"date-time":"2022-09-29T00:00:00Z","timestamp":1664409600000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2022,9,29]],"date-time":"2022-09-29T00:00:00Z","timestamp":1664409600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"name":"TU Wien"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Ethics Inf Technol"],"published-print":{"date-parts":[[2022,12]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Recent years have yielded many discussions on how to endow autonomous agents with the ability to make ethical decisions, and the need for explicit ethical reasoning and transparency is a persistent theme in this literature. We present a modular and transparent approach to equip autonomous agents with the ability to comply with ethical prescriptions, while still enacting pre-learned optimal behaviour. Our approach relies on a normative supervisor module, that integrates a theorem prover for defeasible deontic logic within the control loop of a reinforcement learning agent. The supervisor operates as both an event recorder and an on-the-fly compliance checker w.r.t. an external norm base. We successfully evaluated our approach with several tests using variations of the game Pac-Man, subject to a variety of \u201cethical\u201d constraints.<\/jats:p>","DOI":"10.1007\/s10676-022-09665-8","type":"journal-article","created":{"date-parts":[[2022,9,29]],"date-time":"2022-09-29T20:02:56Z","timestamp":1664481776000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":9,"title":["Enforcing ethical goals over reinforcement-learning policies"],"prefix":"10.1007","volume":"24","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-5998-3273","authenticated-orcid":false,"given":"Emery A.","family":"Neufeld","sequence":"first","affiliation":[]},{"given":"Ezio","family":"Bartocci","sequence":"additional","affiliation":[]},{"given":"Agata","family":"Ciabattoni","sequence":"additional","affiliation":[]},{"given":"Guido","family":"Governatori","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2022,9,29]]},"reference":[{"key":"9665_CR1","unstructured":"Abel, D., MacGlashan, J., & Littman, M. L. (2016). Reinforcement learning as a framework for ethical decision making. In AAAI workshop: AI, ethics, and society (Vol. 16, p. 02). http:\/\/www.aaai.org\/ocs\/index.php\/WS\/AAAIW16\/paper\/view\/12582"},{"key":"9665_CR2","unstructured":"Aler Tubella, A., & Dignum, V. (2019). The glass box approach: Verifying contextual adherence to values. In Proceedings of the AISafety@IJCAI 2019, CEUR workshop proceedings (Vol. 2419). http:\/\/ceur-ws.org\/Vol-2419\/paper_18.pdf"},{"key":"9665_CR3","doi-asserted-by":"publisher","unstructured":"Aler Tubella, A., Theodorou, A., Dignum, F., & Dignum, V. (2019). Governance by glass-box: Implementing transparent moral bounds for AI behaviour. In Proc. of IJCAI: The twenty-eighth international joint conference on artificial intelligence (pp. 5787\u20135793). ijcai.org. https:\/\/doi.org\/10.24963\/ijcai.2019\/802","DOI":"10.24963\/ijcai.2019\/802"},{"key":"9665_CR4","doi-asserted-by":"crossref","unstructured":"Alshiekh, M., Bloem, R., Ehlers, R., K\u00f6nighofer, B., Niekum, S., & Topcu, U (2018) Safe reinforcement learning via shielding. In Proceedings of the thirty-second AAAI conference on artificial intelligence (pp. 2669\u20132678). https:\/\/www.aaai.org\/ocs\/index.php\/AAAI\/AAAI18\/paper\/view\/17211","DOI":"10.1609\/aaai.v32i1.11797"},{"key":"9665_CR5","unstructured":"Andrighetto, G., Governatori, G., Noriega, P., & van der Torre, L. W. N. (eds.) (2013). Normative multi-agent systems, Dagstuhl follow-ups (Vol. 4). Schloss Dagstuhl - Leibniz-Zentrum fuer Informatik. http:\/\/drops.dagstuhl.de\/opus\/portals\/dfu\/index.php?semnr=13003"},{"key":"9665_CR6","unstructured":"Berreby, F., Bourgne, G., & Ganascia, J. G. (2017). A declarative modular framework for representing and applying ethical principles. In Proc. of AAMAS 2017: The 16th conference on autonomous agents and multiagent systems (pp. 96\u2013104). ACM. http:\/\/dl.acm.org\/citation.cfm?id=3091125"},{"key":"9665_CR7","unstructured":"Boella, G., & van der Torre, L. (2004). Regulative and constitutive norms in normative multiagent systems. In Proc. of KR 2004: The 9th intern. conf. on principles of knowledge representation and reasoning (pp. 255\u2013266). AAAI Press. http:\/\/www.aaai.org\/Library\/KR\/2004\/kr04-028.php"},{"issue":"3","key":"9665_CR8","doi-asserted-by":"publisher","first-page":"541","DOI":"10.1109\/JPROC.2019.2898267","volume":"107","author":"P Bremner","year":"2019","unstructured":"Bremner, P., Dennis, L., Fisher, M., & Winfield, A. (2019). On proactive, transparent, and verifiable ethical reasoning for robots. Proceedings of the IEEE, 107(3), 541\u2013561. https:\/\/doi.org\/10.1109\/JPROC.2019.2898267","journal-title":"Proceedings of the IEEE"},{"key":"9665_CR9","doi-asserted-by":"publisher","unstructured":"Broersen, J., Dastani, M., Hulstijn, J., Huang, Z., & van der Torre, L. (2001). The boid architecture: Conflicts between beliefs, obligations, intentions and desires. In Proc. of AGENTS 2001: The fifth international conference on Autonomous agents (pp. 9\u201316). ACM. https:\/\/doi.org\/10.1145\/375735","DOI":"10.1145\/375735"},{"key":"9665_CR10","unstructured":"DeNero, J., & Klein, D. (2014). UC Berkeley CS188 intro to AI\u2014Course materials"},{"key":"9665_CR11","doi-asserted-by":"publisher","unstructured":"Dignum, V (2017) Responsible autonomy. In Proc. of IJCAI 2017: The twenty-sixth international joint conference on artificial intelligence (pp. 4698\u20134704). ijcai.org. https:\/\/doi.org\/10.24963\/ijcai.2017\/655","DOI":"10.24963\/ijcai.2017\/655"},{"issue":"4","key":"9665_CR12","doi-asserted-by":"publisher","first-page":"193","DOI":"10.2307\/2026120","volume":"81","author":"JW Forrester","year":"1984","unstructured":"Forrester, J. W. (1984). Gentle murder, or the adverbial samaritan. The Journal of Philosophy, 81(4), 193\u2013197. https:\/\/doi.org\/10.2307\/2026120","journal-title":"The Journal of Philosophy"},{"key":"9665_CR13","doi-asserted-by":"publisher","unstructured":"Governatori, G. (2015). Thou shalt is not you will. In K. Atkinson (Ed.), Proceedings of the fifteenth international conference on artificial intelligence and law (pp. 63\u201368). ACM https:\/\/doi.org\/10.1145\/2746090.2746105","DOI":"10.1145\/2746090.2746105"},{"key":"9665_CR14","doi-asserted-by":"publisher","unstructured":"Governatori, G. (2018). Practical normative reasoning with defeasible deontic logic. In Reasoning web international summer school, Lecture notes in computer science (Vol. 11078, pp. 1\u201325). Springer. https:\/\/doi.org\/10.1007\/978-3-030-00338-8_1","DOI":"10.1007\/978-3-030-00338-8_1"},{"issue":"6","key":"9665_CR15","doi-asserted-by":"publisher","first-page":"799","DOI":"10.1007\/s10992-013-9295-1","volume":"42","author":"G Governatori","year":"2013","unstructured":"Governatori, G., Olivieri, F., Rotolo, A., & Scannapieco, S. (2013). Computing strong and weak permissions in defeasible logic. Journal of Philosophical Logic, 42(6), 799\u2013829. https:\/\/doi.org\/10.1007\/s10992-013-9295-1","journal-title":"Journal of Philosophical Logic"},{"issue":"1","key":"9665_CR16","doi-asserted-by":"publisher","first-page":"36","DOI":"10.1007\/s10458-008-9030-4","volume":"17","author":"G Governatori","year":"2008","unstructured":"Governatori, G., & Rotolo, A. (2008). BIO logical agents: Norms, beliefs, intentions in defeasible logic. Journal of Autonomous Agents and Multi Agent Systems, 17(1), 36\u201369. https:\/\/doi.org\/10.1007\/s10458-008-9030-4","journal-title":"Journal of Autonomous Agents and Multi Agent Systems"},{"key":"9665_CR17","doi-asserted-by":"publisher","unstructured":"Hasanbeig, M., Kantaros, Y., Abate, A., Kroening, D., Pappas, G. J., & Lee, I. (2019). Reinforcement learning for temporal logic control synthesis with probabilistic satisfaction guarantees. In Proc. of CDC 2019: The IEEE 58th conference on decision and control (pp. 5338\u20135343). IEEE . https:\/\/doi.org\/10.1109\/CDC40024.2019.9028919","DOI":"10.1109\/CDC40024.2019.9028919"},{"key":"9665_CR18","doi-asserted-by":"publisher","DOI":"10.1017\/S0269888917000169","volume":"32","author":"C Haynes","year":"2017","unstructured":"Haynes, C., Luck, M., McBurney, P., Mahmoud, S., V\u00edtek, T., & Miles, S. (2017). Engineering the emergence of norms: A review. Knowledge Engineering Review, 32, e18. https:\/\/doi.org\/10.1017\/S0269888917000169","journal-title":"Knowledge Engineering Review"},{"key":"9665_CR19","doi-asserted-by":"publisher","unstructured":"Jansen, N., K\u00f6nighofer, B., Junges, S., Serban, A., & Bloem, R. (2020). Safe reinforcement learning using probabilistic shields (invited paper). In Proc. of CONCUR 2020: The 31st international conference on concurrency theory, Leibniz international proceedings in informatics (LIPIcs) (Vol. 171, pp. 3:1\u20133:16). https:\/\/doi.org\/10.4230\/LIPIcs.CONCUR.2020.3","DOI":"10.4230\/LIPIcs.CONCUR.2020.3"},{"key":"9665_CR20","doi-asserted-by":"publisher","unstructured":"Lam, H. P., & Governatori, G. (2009). The making of SPINdle. In Proc. of RuleML 2009: The international symposium of rule interchange and applications, LNCS (Vol. 5858, pp. 315\u2013322). Springer. https:\/\/doi.org\/10.1007\/978-3-642-04985-9","DOI":"10.1007\/978-3-642-04985-9"},{"issue":"2","key":"9665_CR21","doi-asserted-by":"publisher","first-page":"373","DOI":"10.1007\/978-3-642-04985-9_29","volume":"23","author":"HP Lam","year":"2013","unstructured":"Lam, H. P., & Governatori, G. (2013). Towards a model of UAVs navigation in urban canyon through defeasible logic. Journal of Logic and Computation, 23(2), 373\u2013395. https:\/\/doi.org\/10.1007\/978-3-642-04985-9_29","journal-title":"Journal of Logic and Computation"},{"key":"9665_CR22","unstructured":"Levine, S., Finn, C., Darrell, T., & Abbeel, P. (2016). End-to-end training of deep visuomotor policies. The Journal of Machine Learning Research, 17, 39:1\u201339:40"},{"key":"9665_CR23","unstructured":"Makinson, D., & Van Der Torre, L. (2007). What is input\/output logic? Input\/output logic, constraints, permissions. In Dagstuhl seminar proceedings (Vol. 07122). Schloss Dagstuhl-Leibniz-Zentrum f\u00fcr Informatik, Internationales Begegnungs- und Forschungszentrum f\u00fcr Informatik (IBFI), Schloss Dagstuhl, Germany. http:\/\/drops.dagstuhl.de\/opus\/volltexte\/2007\/928"},{"key":"9665_CR24","doi-asserted-by":"publisher","first-page":"18","DOI":"10.1109\/MIS.2006.80","volume":"21","author":"J Moor","year":"2006","unstructured":"Moor, J. (2006). The nature, importance, and difficulty of machine ethics. IEEE Intelligent Systems, 21, 18\u201321. https:\/\/doi.org\/10.1109\/MIS.2006.80","journal-title":"IEEE Intelligent Systems"},{"key":"9665_CR25","doi-asserted-by":"publisher","unstructured":"Neufeld, E., Bartocci, E., Ciabattoni, A., & Governatori, G. (2021). A normative supervisor for reinforcement learning agents. In A. Platzer, & G. Sutcliffe (Eds.), Proc. of CADE 28: The 28th international conference on automated deduction, LNCS (Vol. 12699, pp. 565\u2013576). Springer. https:\/\/doi.org\/10.1007\/978-3-030-72019-3_18","DOI":"10.1007\/978-3-030-72019-3_18"},{"key":"9665_CR26","doi-asserted-by":"publisher","unstructured":"Noothigattu, R., Bouneffouf, D., Mattei, N., Chandra, R., Madan, P., Varshney, K. R., Campbell, M., Singh, M., & Rossi, F. (2019). Teaching AI agents ethical values using reinforcement learning and policy orchestration. In Proc. of IJCAI 2019: The twenty-eighth international joint conference on artificial intelligence (pp. 6377\u20136381). https:\/\/doi.org\/10.24963\/ijcai.2019\/891","DOI":"10.24963\/ijcai.2019\/891"},{"issue":"275","key":"9665_CR27","doi-asserted-by":"publisher","first-page":"289","DOI":"10.1093\/mind\/LXIX.275.289","volume":"69","author":"PH Nowell-Smith","year":"1960","unstructured":"Nowell-Smith, P. H., & Lemmon, E. J. (1960). Escapism: The logical basis of ethics. Mind, 69(275), 289\u2013300.","journal-title":"Mind"},{"issue":"3\/4","key":"9665_CR28","doi-asserted-by":"publisher","first-page":"209","DOI":"10.1504\/IJRIS.2009.028020","volume":"1","author":"LM Pereira","year":"2009","unstructured":"Pereira, L. M., & Saptawijaya, A. (2009). Modelling morality with prospective logic. International Journal of Reasoning-based Intelligent Systems, 1(3\/4), 209\u2013221. https:\/\/doi.org\/10.1504\/IJRIS.2009.028020","journal-title":"International Journal of Reasoning-based Intelligent Systems"},{"key":"9665_CR29","doi-asserted-by":"publisher","unstructured":"Pnueli, A. (1977). The temporal logic of programs. In Proc. of the 18th annual symposium on foundations of computer science (pp. 46\u201357). IEEE Computer Society. https:\/\/doi.org\/10.1109\/SFCS.1977.32","DOI":"10.1109\/SFCS.1977.32"},{"key":"9665_CR30","doi-asserted-by":"publisher","first-page":"214","DOI":"10.1016\/j.artint.2015.06.005","volume":"227","author":"H Prakken","year":"2015","unstructured":"Prakken, H., & Sartor, G. (2015). Law and logic: A review from an argumentation perspective. Artificial Intelligence, 227, 214\u2013245. https:\/\/doi.org\/10.1016\/j.artint.2015.06.005","journal-title":"Artificial Intelligence"},{"key":"9665_CR31","doi-asserted-by":"publisher","unstructured":"Rodriguez-Soto, M., L\u00f3pez-S\u00e1nchez, M., & Rodr\u00edguez-Aguilar, J. A. (2021). Multi-objective reinforcement learning for designing ethical environments. In Proc. of IJCAI 2021: The thirtieth international joint conference on artificial intelligence (pp. 545\u2013551). https:\/\/doi.org\/10.24963\/ijcai.2021\/76","DOI":"10.24963\/ijcai.2021\/76"},{"issue":"2\u20133","key":"9665_CR32","doi-asserted-by":"publisher","first-page":"101","DOI":"10.1007\/s10588-006-9539-5","volume":"12","author":"F Sadri","year":"2006","unstructured":"Sadri, F., Stathis, K., & Toni, F. (2006). Normative KGP agents. Computational and Mathematical Organization Theory, 12(2\u20133), 101\u2013126. https:\/\/doi.org\/10.1007\/s10588-006-9539-5","journal-title":"Computational and Mathematical Organization Theory"},{"issue":"1","key":"9665_CR33","doi-asserted-by":"publisher","first-page":"21","DOI":"10.3233\/MGS-2011-0167","volume":"7","author":"BTR Savarimuthu","year":"2011","unstructured":"Savarimuthu, B. T. R., & Cranefield, S. (2011). Norm creation, spreading and emergence: A survey of simulation models of norms in multi-agent systems. Multiagent and Grid Systems, 7(1), 21\u201354. https:\/\/doi.org\/10.3233\/MGS-2011-0167","journal-title":"Multiagent and Grid Systems"},{"issue":"5","key":"9665_CR34","doi-asserted-by":"publisher","first-page":"370","DOI":"10.1145\/5689.5920","volume":"29","author":"MJ Sergot","year":"1986","unstructured":"Sergot, M. J., Sadri, F., Kowalski, R. A., Kriwaczek, F., Hammond, P., & Cory, H. T. (1986). The british nationality act as a logic program. Communications of the ACM, 29(5), 370\u2013386. https:\/\/doi.org\/10.1145\/5689.5920","journal-title":"Communications of the ACM"},{"issue":"7676","key":"9665_CR35","doi-asserted-by":"publisher","first-page":"354","DOI":"10.1038\/nature24270","volume":"550","author":"D Silver","year":"2017","unstructured":"Silver, D., Schrittwieser, J., Simonyan, K., Antonoglou, I., Huang, A., Guez, A., Hubert, T., Baker, L., Lai, M., Bolton, A., Chen, Y., Lillicrap, T. P., Hui, F., Sifre, L., van den Driessche, G., Graepel, T., & Hassabis, D. (2017). Mastering the game of go without human knowledge. Nature, 550(7676), 354\u2013359. https:\/\/doi.org\/10.1038\/nature24270","journal-title":"Nature"},{"key":"9665_CR36","unstructured":"The IEEE Global Initiative on Ethics of Autonomous and Intelligent Systems: IEEE standard review\u2014Ethically aligned design: A vision for prioritizing human wellbeing with artificial intelligence and autonomous systems (1st ed.). IEEE (2019)"},{"key":"9665_CR37","volume-title":"An essay in deontic logic and the general theory of action: With a bibliography of deontic and imperative logic","author":"GH von Wright","year":"1968","unstructured":"von Wright, G. H. (1968). An essay in deontic logic and the general theory of action: With a bibliography of deontic and imperative logic. Co: North-Holland Pub."},{"key":"9665_CR38","doi-asserted-by":"publisher","DOI":"10.1093\/acprof:oso\/9780195374049.001.0001","author":"W Wallach","year":"2008","unstructured":"Wallach, W., & Allen, C. (2008). Moral machines: Teaching robots right from wrong. Oxford University Press. https:\/\/doi.org\/10.1093\/acprof:oso\/9780195374049.001.0001","journal-title":"Oxford University Press"},{"key":"9665_CR39","unstructured":"Watkins, C. J. C. H.: Learning from delayed rewards. Ph.D. thesis, King\u2019s College, Cambridge, UK (1989). http:\/\/www.cs.rhul.ac.uk\/~chrisw\/new_thesis.pdf"},{"key":"9665_CR40","doi-asserted-by":"crossref","unstructured":"Wu, Y. H., & Lin, S. D. (2018). A low-cost ethics shaping approach for designing reinforcement learning agents. In Proc. AAAI 2018: The thirty-second AAAI conference on artificial intelligence (pp. 1687\u20131694). AAAI Press. https:\/\/www.aaai.org\/ocs\/index.php\/AAAI\/AAAI18\/paper\/view\/16195","DOI":"10.1609\/aaai.v32i1.11498"}],"container-title":["Ethics and Information Technology"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s10676-022-09665-8.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s10676-022-09665-8\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s10676-022-09665-8.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,1,4]],"date-time":"2023-01-04T05:06:56Z","timestamp":1672808816000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s10676-022-09665-8"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,9,29]]},"references-count":40,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2022,12]]}},"alternative-id":["9665"],"URL":"https:\/\/doi.org\/10.1007\/s10676-022-09665-8","relation":{},"ISSN":["1388-1957","1572-8439"],"issn-type":[{"value":"1388-1957","type":"print"},{"value":"1572-8439","type":"electronic"}],"subject":[],"published":{"date-parts":[[2022,9,29]]},"assertion":[{"value":"24 August 2022","order":1,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"29 September 2022","order":2,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"The authors declare that they have no conflict of interest.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Conflict of interest"}}],"article-number":"43"}}