{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,15]],"date-time":"2026-03-15T12:35:08Z","timestamp":1773578108710,"version":"3.50.1"},"reference-count":30,"publisher":"Springer Science and Business Media LLC","issue":"3","license":[{"start":{"date-parts":[[2023,8,16]],"date-time":"2023-08-16T00:00:00Z","timestamp":1692144000000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/www.springernature.com\/gp\/researchers\/text-and-data-mining"},{"start":{"date-parts":[[2023,8,16]],"date-time":"2023-08-16T00:00:00Z","timestamp":1692144000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.springernature.com\/gp\/researchers\/text-and-data-mining"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Ethics Inf Technol"],"published-print":{"date-parts":[[2023,9]]},"DOI":"10.1007\/s10676-023-09716-8","type":"journal-article","created":{"date-parts":[[2023,8,16]],"date-time":"2023-08-16T10:02:36Z","timestamp":1692180156000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":11,"title":["Calibrating machine behavior: a challenge for AI alignment"],"prefix":"10.1007","volume":"25","author":[{"given":"Erez","family":"Firt","sequence":"first","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2023,8,16]]},"reference":[{"key":"9716_CR1","doi-asserted-by":"crossref","unstructured":"Abbeel, P. & Ng, A.Y. (2004). Apprenticeship learning via inverse reinforcement learning. In Proceedings of the twenty-first international conference on Machine learning (p. 1). ACM.","DOI":"10.1145\/1015330.1015430"},{"key":"9716_CR2","unstructured":"Bostrom, N. (2003). Ethical issues in advanced artificial intelligence. 
Retrieved Jan 31, 2023 from https:\/\/nickbostrom.com\/ethics\/ai."},{"key":"9716_CR3","doi-asserted-by":"publisher","unstructured":"Burda, Y., Edwards, H., Pathak, D., Storkey, A., Darrell, T., & Efros, A. A. (2018). Large-scale study of curiosity-driven learning. https:\/\/doi.org\/10.48550\/arXiv.1808.04355.","DOI":"10.48550\/arXiv.1808.04355"},{"key":"9716_CR4","unstructured":"Christian, B. (2020). The alignment problem: Machine learning and human values. WW Norton & Company."},{"key":"9716_CR5","unstructured":"Eckersley, P. (2018). Impossibility and uncertainty theorems in AI value alignment (or why your AGI should not have a utility function). arXiv:1901.00064."},{"issue":"7639","key":"9716_CR6","doi-asserted-by":"publisher","first-page":"115","DOI":"10.1038\/nature21056","volume":"542","author":"A Esteva","year":"2017","unstructured":"Esteva, A., Kuprel, B., Novoa, R. A., et al. (2017). Dermatologist-level classification of skin cancer with deep neural networks. Nature, 542(7639), 115\u2013118.","journal-title":"Nature"},{"key":"9716_CR7","doi-asserted-by":"publisher","first-page":"995","DOI":"10.1007\/s00146-020-00942-y","volume":"35","author":"E Firt","year":"2020","unstructured":"Firt, E. (2020). The missing G. AI & Society, 35, 995\u20131007.","journal-title":"AI & Society"},{"key":"9716_CR8","doi-asserted-by":"publisher","DOI":"10.1007\/s00146-023-01631-2","author":"E Firt","year":"2023","unstructured":"Firt, E. (2023). Artificial understanding: A step toward Robust AI. AI & Society. https:\/\/doi.org\/10.1007\/s00146-023-01631-2","journal-title":"AI & Society"},{"key":"9716_CR9","doi-asserted-by":"publisher","first-page":"411","DOI":"10.1007\/s11023-020-09539-2","volume":"30","author":"I Gabriel","year":"2020","unstructured":"Gabriel, I. (2020). Artificial intelligence, values, and alignment. 
Minds and Machines, 30, 411\u2013437.","journal-title":"Minds and Machines"},{"key":"9716_CR10","doi-asserted-by":"crossref","unstructured":"Hadfield-Menell, D., & Hadfield, G. (2018). Incomplete contracting and AI alignment. arXiv:180404268Cs.","DOI":"10.2139\/ssrn.3165793"},{"key":"9716_CR11","unstructured":"Marcus, G. (2020). The next decade in AI: Four steps towards robust artificial intelligence. https:\/\/arxiv.org\/abs\/2002.06177."},{"key":"9716_CR12","unstructured":"Marcus, G. (2022). Deep learning is hitting a wall. Retrieved Feb 5, 2023, from https:\/\/nautil.us\/deep-learning-is-hitting-a-wall-238440\/."},{"key":"9716_CR13","unstructured":"Marcus, G., & Davis, E. (2019). Rebooting AI: Building artificial intelligence we can trust. Vintage Books."},{"key":"9716_CR14","unstructured":"Marcus, G. and Davis, E. (2020). GPT-3, bloviator: OpenAI\u2019s language generator has no idea what it\u2019s talking about. MIT Technology Review. Retrieved Feb 7, 2023, from https:\/\/www.technologyreview.com\/2020\/08\/22\/1007539\/gpt3-openai-language-generatorartificial-intelligence-ai-opinion\/."},{"key":"9716_CR15","doi-asserted-by":"publisher","unstructured":"McIlroy-Young, R., Sen, S., Kleinberg, J., & Anderson, A. (2020). Aligning superhuman AI with human behavior: Chess as a model system. https:\/\/doi.org\/10.48550\/arXiv.2006.01855.","DOI":"10.48550\/arXiv.2006.01855"},{"key":"9716_CR16","doi-asserted-by":"publisher","first-page":"529","DOI":"10.1038\/nature14236","volume":"518","author":"V Mnih","year":"2015","unstructured":"Mnih, V., Kavukcuoglu, K., Silver, D., et al. (2015). Human-level control through deep reinforcement learning. Nature, 518, 529\u2013533. https:\/\/doi.org\/10.1038\/nature14236","journal-title":"Nature"},{"issue":"1","key":"9716_CR17","doi-asserted-by":"publisher","first-page":"61","DOI":"10.1023\/A:1010078828842","volume":"1","author":"JH Moor","year":"1999","unstructured":"Moor, J. H. (1999). Just consequentialism and computing. 
Ethics and Information Technology, 1(1), 61\u201365.","journal-title":"Ethics and Information Technology"},{"key":"9716_CR18","unstructured":"Ng, A. Y., & Russell, S. J. (2000). Algorithms for inverse reinforcement learning. In Proceedings of the seventeenth international conference on machine learning (ICML '00) (pp. 663\u2013670). Morgan Kaufmann Publishers Inc."},{"key":"9716_CR19","doi-asserted-by":"publisher","first-page":"211","DOI":"10.1007\/s11263-015-0816-y","volume":"115","author":"O Russakovsky","year":"2014","unstructured":"Russakovsky, O., Deng, J., Su, H., et al. (2014). ImageNet large scale visual recognition challenge. International Journal of Computer Vision, 115, 211\u2013252.","journal-title":"International Journal of Computer Vision"},{"key":"9716_CR20","unstructured":"Russell, S. (2017). 3 principles for creating safer AI. TED talk. Retrieved Jan 30, 2023, from https:\/\/www.ted.com\/talks\/stuart_russell_3_principles_for_creating_safer_ai."},{"key":"9716_CR21","volume-title":"Human compatible: AI and the problem of control","author":"S Russell","year":"2019","unstructured":"Russell, S. (2019). Human compatible: AI and the problem of control. Allen Lane."},{"key":"9716_CR22","unstructured":"Russell, S. (2020). The control problem of super-intelligent AI|AI podcast clips. Retrieved Feb 5, 2023, from https:\/\/www.youtube.com\/watch?v=bHPeGhbSVpw."},{"issue":"2","key":"9716_CR23","doi-asserted-by":"publisher","first-page":"43","DOI":"10.1162\/daed_a_01899","volume":"151","author":"S Russell","year":"2022","unstructured":"Russell, S. (2022). If we succeed. Daedalus, 151(2), 43\u201357. https:\/\/doi.org\/10.1162\/daed_a_01899","journal-title":"Daedalus"},{"key":"9716_CR24","unstructured":"Salimans, T., Ho, J., Chen, X., Sidor, S. & Sutskever, I. (2017). Evolution strategies as a scalable alternative to reinforcement learning. 
arXiv:1703.03864."},{"key":"9716_CR25","doi-asserted-by":"publisher","DOI":"10.1126\/science.aar6404","volume-title":"A general reinforcement learning algorithm that masters chess, shogi, and go through self-play","author":"D Silver","year":"2018","unstructured":"Silver, D., Hubert, T., Schrittwieser, J., et al. (2018). A general reinforcement learning algorithm that masters chess, shogi, and go through self-play. Science."},{"key":"9716_CR26","unstructured":"Soares, N., Fallenstein, B., Yudkowsky, E., & Armstrong, S. (2015). Corrigibility. In AAAI workshops: Workshops at the 29th AAAI conference on artificial intelligence, Austin, TX, January 25\u201326, 2015. AAAI Publications. Retrieved Feb 8, 2023 from https:\/\/intelligence.org\/files\/Corrigibility.pdf."},{"key":"9716_CR27","volume-title":"Reinforcement learning: An introduction","author":"RS Sutton","year":"2018","unstructured":"Sutton, R. S., & Barto, A. G. (2018). Reinforcement learning: An introduction (2nd ed.). MIT Press.","edition":"2"},{"key":"9716_CR28","doi-asserted-by":"publisher","DOI":"10.1093\/acprof:oso\/9780190498511.001.0001","volume-title":"Technology and the virtues: A philosophical guide to a future worth wanting","author":"S Vallor","year":"2016","unstructured":"Vallor, S. (2016). Technology and the virtues: A philosophical guide to a future worth wanting. Oxford University Press."},{"key":"9716_CR29","doi-asserted-by":"publisher","unstructured":"Vasquez, D., Okal, B., Arras, K.O. (2014). Inverse reinforcement learning algorithms and features for robot navigation in crowds: An experimental comparison. In 2014 IEEE\/RSJ international conference on intelligent robots and systems (pp. 1341\u20131346). https:\/\/doi.org\/10.1109\/IROS.2014.6942731","DOI":"10.1109\/IROS.2014.6942731"},{"key":"9716_CR30","unstructured":"Yudkowsky, E. (2016). The AI alignment problem: Why it is hard, and where to start. Symbolic Systems Distinguished Speaker. 
Retrieved Jan 29, 2023, from https:\/\/intelligence.org\/stanford-talk\/."}],"container-title":["Ethics and Information Technology"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s10676-023-09716-8.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s10676-023-09716-8\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s10676-023-09716-8.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,9,20]],"date-time":"2023-09-20T03:37:54Z","timestamp":1695181074000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s10676-023-09716-8"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,8,16]]},"references-count":30,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2023,9]]}},"alternative-id":["9716"],"URL":"https:\/\/doi.org\/10.1007\/s10676-023-09716-8","relation":{},"ISSN":["1388-1957","1572-8439"],"issn-type":[{"value":"1388-1957","type":"print"},{"value":"1572-8439","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,8,16]]},"assertion":[{"value":"7 August 2023","order":1,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"16 August 2023","order":2,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"The author declares no conflict of interest.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Conflict of interest"}}],"article-number":"42"}}