{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,19]],"date-time":"2026-02-19T15:52:53Z","timestamp":1771516373941,"version":"3.50.1"},"publisher-location":"New York, NY, USA","reference-count":53,"publisher":"ACM","license":[{"start":{"date-parts":[[2021,7,21]],"date-time":"2021-07-21T00:00:00Z","timestamp":1626825600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"name":"Survival and Flourishing","award":["n\/a"],"award-info":[{"award-number":["n\/a"]}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2021,7,21]]},"DOI":"10.1145\/3461702.3462570","type":"proceedings-article","created":{"date-parts":[[2021,7,31]],"date-time":"2021-07-31T01:21:32Z","timestamp":1627694492000},"page":"437-445","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":14,"title":["AI Alignment and Human Reward"],"prefix":"10.1145","author":[{"given":"Patrick","family":"Butlin","sequence":"first","affiliation":[{"name":"King's College London, London, United Kingdom"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2021,7,30]]},"reference":[{"key":"e_1_3_2_1_1_1","volume-title":"Advances in Neural Information Processing Systems 31 (NeurIPS '18)","author":"Armstrong Stuart","unstructured":"Stuart Armstrong and S\u00f6ren Mindermann. 2018. Occam's razor is insufficient to infer the preferences of irrational agents. In Advances in Neural Information Processing Systems 31 (NeurIPS '18). 
Curran Associates, Red Hook, NY, 5603--5614."},{"key":"e_1_3_2_1_2_1","volume-title":"Intrinsically Motivated Learning in Natural and Artificial Systems","author":"Barto Andrew","unstructured":"Andrew Barto. 2013. Intrinsic motivation and reinforcement learning. In Intrinsically Motivated Learning in Natural and Artificial Systems, edited by G. Baldassarre and M. Mirolli. Springer, Berlin, 17--47. DOI: https:\/\/doi.org\/10.1007\/978-3-642-32375-1_2"},{"key":"e_1_3_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1521\/soco.2008.26.5.621"},{"key":"e_1_3_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.neuron.2015.02.018"},{"key":"e_1_3_2_1_5_1","volume-title":"Neuroeconomics: Decision-Making and the Brain","author":"Berridge Kent","year":"2013","unstructured":"Kent Berridge and John P. O'Doherty. 2013. From experienced utility to decision utility. In Neuroeconomics: Decision-Making and the Brain, edited by E. Fehr and P. W. Glimcher. Academic Press, London, 335--348."},{"key":"e_1_3_2_1_6_1","volume-title":"Superintelligence: Paths, Dangers, Strategies","author":"Bostrom Nick","year":"2014","unstructured":"Nick Bostrom. 2014. Superintelligence: Paths, Dangers, Strategies. Oxford University Press, Oxford."},{"key":"e_1_3_2_1_7_1","volume-title":"Review of Philosophy and Psychology","author":"Butlin Patrick","year":"2017","unstructured":"Patrick Butlin. 2017. 
Why hunger is not a desire. Review of Philosophy and Psychology 8 (Sep. 2017), 617--635. DOI: https:\/\/doi.org\/10.1007\/s13164-017-0332-9"},{"key":"e_1_3_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.jebo.2015.10.016"},{"key":"e_1_3_2_1_9_1","volume-title":"Stanford Encyclopedia of Philosophy","author":"Crisp Roger","unstructured":"Roger Crisp. 2017. Well-being. Stanford Encyclopedia of Philosophy, edited by E. N. Zalta. Retrieved from https:\/\/plato.stanford.edu\/entries\/well-being\/"},{"key":"e_1_3_2_1_10_1","unstructured":"Paul Christiano. 2015. The easy goal inference problem is still hard. Retrieved from https:\/\/ai-alignment.com\/the-easy-goal-inference-problem-is-still-hard-fad030e0a876"},{"key":"e_1_3_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1177\/1088868313495594"},{"key":"e_1_3_2_1_12_1","volume-title":"Neuroeconomics: Decision-Making and the Brain","author":"Daw Nathaniel","unstructured":"Nathaniel Daw. 2013. Advanced reinforcement learning. In Neuroeconomics: Decision-Making and the Brain, edited by E. Fehr and P. W. Glimcher. Academic Press, London, 299--317."},{"key":"e_1_3_2_1_13_1","volume-title":"Neuroeconomics: Decision-Making and the Brain","author":"Daw Nathaniel","year":"2013","unstructured":"Nathaniel Daw and John P. O'Doherty. 2013. 
Multiple systems for value learning. In Neuroeconomics: Decision-Making and the Brain, edited by E. Fehr and P. W. Glimcher. Academic Press, London, 393--410."},{"key":"e_1_3_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.1080\/09515080903532290"},{"key":"e_1_3_2_1_15_1","volume-title":"Intrinsic Motivation and Self-Determination in Human Behavior","author":"Deci Edward L.","year":"1985","unstructured":"Edward L. Deci and Richard M. Ryan. 1985. Intrinsic Motivation and Self-Determination in Human Behavior. Plenum Press, New York."},{"key":"e_1_3_2_1_16_1","volume-title":"Pleasures of the Brain","author":"Dickinson Anthony","year":"2009","unstructured":"Anthony Dickinson and Bernard Balleine. 2009. Hedonics: The cognitive-motivational interface. In Pleasures of the Brain, edited by M. Kringelbach and K. Berridge. Oxford University Press, Oxford, 74--84."},{"key":"e_1_3_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.neuron.2013.09.007"},{"key":"e_1_3_2_1_18_1","volume-title":"Natural Law and Natural Rights","author":"Finnis John","unstructured":"John Finnis. 1980. Natural Law and Natural Rights. Clarendon Press, Oxford."},{"key":"e_1_3_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1017\/S0953820812000453"},{"key":"e_1_3_2_1_20_1","volume-title":"The Routledge Handbook of Philosophy of Well-Being","author":"Fletcher Guy","unstructured":"Guy Fletcher. 2016. 
Objective list theories. In The Routledge Handbook of Philosophy of Well-Being, edited by G. Fletcher. Routledge, Abingdon, Oxon, 148--160."},{"key":"e_1_3_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.2307\/2024717"},{"key":"e_1_3_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.1007\/s11023-020-09539-2"},{"key":"e_1_3_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.tics.2010.12.004"},{"key":"e_1_3_2_1_24_1","volume-title":"Advances in Neural Information Processing Systems 29 (NeurIPS '16)","author":"Hadfield-Menell Dylan","year":"2016","unstructured":"Dylan Hadfield-Menell, Anca D. Dragan, Pieter Abbeel and Stuart Russell. 2016. Cooperative inverse reinforcement learning. In Advances in Neural Information Processing Systems 29 (NeurIPS '16). Curran Associates, Red Hook, NY, 3916--3924."},{"key":"e_1_3_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.24963\/ijcai.2017\/32"},{"key":"e_1_3_2_1_26_1","doi-asserted-by":"crossref","unstructured":"Daniel Hausman. 2012. Preference, Value, Choice, and Welfare. CUP, New York.","DOI":"10.1017\/CBO9781139058537"},{"key":"e_1_3_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.1007\/s11098-009-9440-4"},{"key":"e_1_3_2_1_28_1","volume-title":"Perfectionism","author":"Hurka Tom","unstructured":"Tom Hurka. 1993. Perfectionism. 
Oxford University Press, Oxford."},{"key":"e_1_3_2_1_29_1","first-page":"836","article-title":"Where does value come from?","volume":"23","author":"Juechems Keno","year":"2019","unstructured":"Keno Juechems and Christopher Summerfield. 2019. Where does value come from? Trends in Cognitive Sciences 23, 10 (Oct. 2019), 836--850. DOI: https:\/\/doi.org\/10.1016\/j.tics.2019.07.012","journal-title":"Trends in Cognitive Sciences"},{"key":"e_1_3_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.neuron.2015.08.037"},{"key":"e_1_3_2_1_31_1","volume-title":"Goal-Directed Decision-Making: Computations and Circuits","author":"Kool Wouter","unstructured":"Wouter Kool, Fiery Cushman, and Samuel Gershman. 2018. Competition and cooperation between multiple reinforcement learning systems. In Goal-Directed Decision-Making: Computations and Circuits, edited by R. Morris, A. Bornstein and A. Shenhav. Elsevier, Amsterdam, 153--178. DOI: https:\/\/doi.org\/10.1016\/B978-0-12-812098-9.00007-3"},{"key":"e_1_3_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.1093\/analys\/anx043"},{"key":"e_1_3_2_1_33_1","volume-title":"Scalable agent alignment via reward modeling: A research direction","author":"Leike Jan","year":"2018","unstructured":"Jan Leike, David Krueger, Tom Everitt, Miljan Martic, Vishal Maini and Shane Legg. 2018. Scalable agent alignment via reward modeling: A research direction. arXiv:1811.07871. 
Retrieved from https:\/\/arxiv.org\/abs\/1811.07871"},{"key":"e_1_3_2_1_34_1","volume-title":"Moral Uncertainty","author":"MacAskill William","year":"2020","unstructured":"William MacAskill, Krister Bykvist and Toby Ord. 2020. Moral Uncertainty. Oxford University Press, Oxford."},{"key":"e_1_3_2_1_35_1","volume-title":"Natural Law and Practical Rationality","author":"Murphy Mark","unstructured":"Mark Murphy. 2001. Natural Law and Practical Rationality. CUP, New York."},{"key":"e_1_3_2_1_36_1","doi-asserted-by":"publisher","DOI":"10.1177\/0146167211419863"},{"key":"e_1_3_2_1_37_1","volume-title":"Proceedings of the Seventeenth International Conference on Machine Learning (ICML '00)","author":"Ng Andrew","year":"2000","unstructured":"Andrew Ng and Stuart Russell. 2000. Algorithms for inverse reinforcement learning. In Proceedings of the Seventeenth International Conference on Machine Learning (ICML '00). Morgan Kaufmann Publishers, San Francisco, Calif., 663--670."},{"key":"e_1_3_2_1_38_1","volume-title":"Anarchy, State and Utopia","author":"Nozick Robert","unstructured":"Robert Nozick. 1974. Anarchy, State and Utopia. 
Basic Books, New York."},{"key":"e_1_3_2_1_39_1","doi-asserted-by":"publisher","DOI":"10.1109\/TEVC.2006.890271"},{"key":"e_1_3_2_1_40_1","doi-asserted-by":"publisher","DOI":"10.1016\/bs.pbr.2016.05.005"},{"key":"e_1_3_2_1_41_1","unstructured":"Derek Parfit. 1984. Reasons and Persons. Clarendon Press, Oxford."},{"key":"e_1_3_2_1_42_1","doi-asserted-by":"publisher","DOI":"10.1145\/3306618.3314259"},{"key":"e_1_3_2_1_43_1","volume-title":"Human Compatible: Artificial Intelligence and the Problem of Control","author":"Russell Stuart","year":"2019","unstructured":"Stuart Russell. 2019. Human Compatible: Artificial Intelligence and the Problem of Control. Viking Press, New York."},{"key":"e_1_3_2_1_44_1","doi-asserted-by":"publisher","DOI":"10.1111\/1468-0017.00197"},{"key":"e_1_3_2_1_45_1","first-page":"441","article-title":"Mammalian value systems","volume":"41","author":"Sarma Gopal","year":"2017","unstructured":"Gopal Sarma and Nick Hay. 2017. Mammalian value systems. Informatica 41, 3, 441--449. DOI: https:\/\/dx.doi.org\/10.2139\/ssrn.2975399","journal-title":"Informatica"},{"key":"e_1_3_2_1_46_1","doi-asserted-by":"publisher","DOI":"10.1109\/TAMD.2010.2056368"},{"key":"e_1_3_2_1_47_1","volume-title":"Three Faces of Desire","author":"Schroeder Timothy","unstructured":"Timothy Schroeder. 2004. Three Faces of Desire. 
Oxford University Press, New York."},{"key":"e_1_3_2_1_48_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.neubiorev.2013.02.002"},{"key":"e_1_3_2_1_49_1","volume-title":"Proceedings of the Thirty-First Annual Conference of the Cognitive Science Society","author":"Singh Satinder","year":"2009","unstructured":"Satinder Singh, Richard Lewis and Andrew Barto. 2009. Where do rewards come from? In Proceedings of the Thirty-First Annual Conference of the Cognitive Science Society. Curran Associates, Red Hook, NY, 2601--2606."},{"key":"e_1_3_2_1_50_1","volume-title":"Papers from the 2016 AAAI Workshop on AI, Ethics and Society","author":"Sotala Kaj","unstructured":"Kaj Sotala. 2016. Defining human values for value learners. In Papers from the 2016 AAAI Workshop on AI, Ethics and Society. AAAI Press, Palo Alto, Calif."},{"key":"e_1_3_2_1_51_1","volume-title":"Reinforcement Learning: An Introduction (2nd. Ed.)","author":"Sutton Richard","year":"2018","unstructured":"Richard Sutton and Andrew Barto. 2018. Reinforcement Learning: An Introduction (2nd. Ed.). 
MIT Press, Cambridge, MA."},{"key":"e_1_3_2_1_52_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.neuron.2016.08.018"},{"key":"e_1_3_2_1_53_1","doi-asserted-by":"publisher","DOI":"10.2307\/2024703"}],"event":{"name":"AIES '21: AAAI\/ACM Conference on AI, Ethics, and Society","location":"Virtual Event USA","acronym":"AIES '21","sponsor":["SIGAI ACM Special Interest Group on Artificial Intelligence","AAAI"]},"container-title":["Proceedings of the 2021 AAAI\/ACM Conference on AI, Ethics, and Society"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3461702.3462570","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3461702.3462570","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T20:49:06Z","timestamp":1750193346000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3461702.3462570"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,7,21]]},"references-count":53,"alternative-id":["10.1145\/3461702.3462570","10.1145\/3461702"],"URL":"https:\/\/doi.org\/10.1145\/3461702.3462570","relation":{},"subject":[],"published":{"date-parts":[[2021,7,21]]},"assertion":[{"value":"2021-07-30","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}