{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,10]],"date-time":"2026-04-10T18:07:34Z","timestamp":1775844454145,"version":"3.50.1"},"reference-count":63,"publisher":"Springer Science and Business Media LLC","issue":"8","license":[{"start":{"date-parts":[[2025,6,21]],"date-time":"2025-06-21T00:00:00Z","timestamp":1750464000000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2025,6,21]],"date-time":"2025-06-21T00:00:00Z","timestamp":1750464000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/501100011730","name":"Templeton World Charity Foundation","doi-asserted-by":"publisher","id":[{"id":"10.13039\/501100011730","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100000923","name":"Australian Research Council","doi-asserted-by":"publisher","id":[{"id":"10.13039\/501100000923","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100000995","name":"Australian National University","doi-asserted-by":"crossref","id":[{"id":"10.13039\/501100000995","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["AI & Soc"],"published-print":{"date-parts":[[2025,12]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:p>AI systems are increasingly in a position to have deep and systemic impacts on human wellbeing. Projects in value alignment, a critical area of AI safety research, must ultimately aim to ensure that all those who stand to be affected by such systems have good reason to accept their outputs. This is especially challenging where AI systems are involved in making morally controversial decisions. 
In this paper, we consider three current approaches to value alignment: crowdsourcing, reinforcement learning from human feedback, and constitutional AI. We argue that all three fail to accommodate reasonable moral disagreement, since they provide neither good epistemic reasons nor good political reasons for accepting AI systems\u2019 morally controversial outputs. Since these appear to be the most promising approaches to value alignment currently on offer, we conclude that accommodating reasonable moral disagreement remains an open problem for AI safety, and we offer guidance for future research.<\/jats:p>","DOI":"10.1007\/s00146-025-02427-2","type":"journal-article","created":{"date-parts":[[2025,6,21]],"date-time":"2025-06-21T12:33:50Z","timestamp":1750509230000},"page":"6073-6087","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":3,"title":["Moral disagreement and the limits of AI value alignment: a dual challenge of epistemic justification and political legitimacy"],"prefix":"10.1007","volume":"40","author":[{"given":"Nick","family":"Schuster","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Daniel","family":"Kilov","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"297","published-online":{"date-parts":[[2025,6,21]]},"reference":[{"issue":"1","key":"2427_CR1","doi-asserted-by":"publisher","first-page":"121","DOI":"10.1007\/s13164-010-0045-9","volume":"2","author":"M Alfano","year":"2011","unstructured":"Alfano M (2011) Explaining away intuitions about traits: why virtue ethics seems plausible (even if it isn\u2019t). Rev Philos Psychol 2(1):121\u2013136. https:\/\/doi.org\/10.1007\/s13164-010-0045-9","journal-title":"Rev Philos Psychol"},{"key":"2427_CR2","unstructured":"Anderson M, Anderson SL, Armen C (2006) MedEthEx: a prototype medical ethics advisor. 
In: Proceedings of the 18th conference on Innovative applications of artificial intelligence, 2, 1759\u20131765. https:\/\/aaai.org\/papers\/009-IAAI06-009-iaai06"},{"key":"2427_CR3","doi-asserted-by":"publisher","DOI":"10.1093\/acprof:oso\/9780199228782.001.0001","volume-title":"Intelligent virtue","author":"J Annas","year":"2011","unstructured":"Annas J (2011) Intelligent virtue. Oxford University Press"},{"key":"2427_CR4","unstructured":"Anthropic (2023) Collective constitutional AI: aligning a language model with public input"},{"key":"2427_CR5","doi-asserted-by":"publisher","first-page":"119","DOI":"10.1111\/j.1467-8519.2009.01748.x","volume":"25","author":"D Archard","year":"2011","unstructured":"Archard D (2011) Why moral philosophers are not and should not be moral experts. Bioethics 25:119\u2013127. https:\/\/doi.org\/10.1111\/j.1467-8519.2009.01748.x","journal-title":"Bioethics"},{"key":"2427_CR6","unstructured":"Askell A, Bai Y, Chen A, Drain D, Ganguli D, Henighan TJ, Jones A, Joseph N, Mann B, DasSarma N, Elhage N, Hatfield-Dodds Z, Hernandez D, Kernion J, Ndousse K, Olsson C, Amodei D, Brown TB, Clark J, McCandlish S, Olah C, Kaplan J (2021) A general language assistant as a laboratory for alignment. https:\/\/arxiv.org\/abs\/2112.00861"},{"issue":"7729","key":"2427_CR7","doi-asserted-by":"publisher","first-page":"59","DOI":"10.1038\/s41586-018-0637-6","volume":"563","author":"E Awad","year":"2018","unstructured":"Awad E, Dsouza S, Kim R, Schulz J, Henrich J, Shariff A, Bonnefon J, Rahwan I (2018) The moral machine experiment. Nature 563(7729):59\u201364. https:\/\/doi.org\/10.1038\/s41586-018-0637-6","journal-title":"Nature"},{"issue":"3","key":"2427_CR8","doi-asserted-by":"publisher","first-page":"48","DOI":"10.1145\/3339904","volume":"63","author":"E Awad","year":"2020","unstructured":"Awad E, Dsouza S, Bonnefon J, Shariff A, Rahwan I (2020) Crowdsourcing moral machines. Commun ACM 63(3):48\u201355. 
https:\/\/doi.org\/10.1145\/3339904","journal-title":"Commun ACM"},{"key":"2427_CR9","unstructured":"Bai Y, Kadavath S, Kundu S, Askell A, Kernion J, Jones A, Chen A, Goldie A, Mirhoseini A, McKinnon C, Chen C, Olsson C, Olah C, Hernandez D, Drain D, Ganguli D, Li D, Tran-Johnson E, Perez E, Kerr J, Mueller J, Ladish J, Landau J, Ndousse K, Luko\u0161i\u016bt\u0117 K, Lovitt L, Sellitto M, Elhage N, Schiefer N, Mercado N, DasSarma N, Lasenby R, Larson R, Ringer S, Johnston S, Kravec S, Showk SE, Fort S, Lanham T, Telleen-Lawton T, Conerly T, Henighan TJ, Hume T, Bowman S, Hatfield-Dodds Z, Mann B, Amodei D, Joseph N, McCandlish S, Brown TB, Kaplan J (2022) Constitutional AI: harmlessness from AI feedback. https:\/\/arxiv.org\/abs\/2212.08073"},{"key":"2427_CR10","volume-title":"Principles of biomedical ethics","author":"TL Beauchamp","year":"1979","unstructured":"Beauchamp TL, Childress JF (1979) Principles of biomedical ethics. Oxford University Press"},{"key":"2427_CR11","volume-title":"Public deliberation","author":"J Bohman","year":"1996","unstructured":"Bohman J (1996) Public deliberation. MIT Press"},{"key":"2427_CR12","doi-asserted-by":"publisher","unstructured":"Bourget D, Chalmers DJ (2023) Philosophers on philosophy: the 2020 PhilPapers survey. Philosophers' Imprint 23(11). https:\/\/doi.org\/10.3998\/phimp.2109","DOI":"10.3998\/phimp.2109"},{"key":"2427_CR13","unstructured":"Buranyi S (2017) Rise of the racist robots\u2014how AI is learning all our worst impulses. The Guardian"},{"issue":"22","key":"2427_CR14","first-page":"135","volume":"14","author":"B Cammaerts","year":"2020","unstructured":"Cammaerts B, Mansell R (2020) Digital platform policy and regulation: toward a radical democratic turn. 
Int J Commun 14(22):135\u2013154","journal-title":"Int J Commun"},{"key":"2427_CR15","unstructured":"Casper S, Davies X, Shi C, Gilbert TK, Scheurer J, Rando J, Freedman R, Korbak T, Lindner D, Freire P, Wang T, Marks S, S\u00e9gerie C, Carroll M, Peng A, Christoffersen PJ, Damani M, Slocum S, Anwar U, Siththaranjan A, Nadeau M, Michaud EJ, Pfau J, Krasheninnikov D, Chen X, Langosco LL, Hase P, Biyik E, Dragan AD, Krueger D, Sadigh D, Hadfield-Menell D (2023) Open problems and fundamental limitations of reinforcement learning from human feedback. https:\/\/arxiv.org\/abs\/2307.15217"},{"key":"2427_CR16","doi-asserted-by":"publisher","first-page":"323","DOI":"10.1007\/s10677-007-9071-9","volume":"10","author":"M Cholbi","year":"2007","unstructured":"Cholbi M (2007) Moral expertise and the credentials problem. Ethical Theory Moral Pract 10:323\u2013334. https:\/\/doi.org\/10.1007\/s10677-007-9071-9","journal-title":"Ethical Theory Moral Pract"},{"key":"2427_CR17","unstructured":"Christiano PF, Leike J, Brown TB, Martic M, Legg S, Amodei D (2017) Deep reinforcement learning from human preferences. https:\/\/arxiv.org\/abs\/1706.03741"},{"key":"2427_CR18","volume-title":"The rule of the many","author":"T Christiano","year":"1996","unstructured":"Christiano T (1996) The rule of the many. Westview Press"},{"issue":"3","key":"2427_CR19","doi-asserted-by":"publisher","first-page":"273","DOI":"10.1007\/s11019-005-1588-x","volume":"8","author":"C Cowley","year":"2005","unstructured":"Cowley C (2005) A new rejection of moral expertise. Med Health Care Philos 8(3):273\u2013279. 
https:\/\/doi.org\/10.1007\/s11019-005-1588-x","journal-title":"Med Health Care Philos"},{"key":"2427_CR20","doi-asserted-by":"publisher","DOI":"10.12987\/9780300252392","volume-title":"The atlas of AI: power, politics, and the planetary costs of artificial intelligence","author":"K Crawford","year":"2021","unstructured":"Crawford K (2021) The atlas of AI: power, politics, and the planetary costs of artificial intelligence. Yale University Press"},{"key":"2427_CR21","doi-asserted-by":"crossref","unstructured":"Cunningham P, Cord M, Delany SJ (2008) Supervised learning. In: Machine learning techniques for multimedia: case studies on organization and retrieval. Springer Berlin Heidelberg, pp 21\u201349","DOI":"10.1007\/978-3-540-75171-7_2"},{"key":"2427_CR22","doi-asserted-by":"crossref","unstructured":"Davis J (2023) Understanding constitutional AI. Medium","DOI":"10.4324\/9781003252832-1"},{"issue":"4","key":"2427_CR23","doi-asserted-by":"publisher","first-page":"504","DOI":"10.1111\/0029-4624.00136","volume":"32","author":"JM Doris","year":"1998","unstructured":"Doris JM (1998) Persons, situations, and virtue ethics. Nous 32(4):504\u2013530. https:\/\/doi.org\/10.1111\/0029-4624.00136","journal-title":"Nous"},{"key":"2427_CR24","doi-asserted-by":"publisher","unstructured":"Driver J (2013) Moral expertise: judgment, practice, and analysis. Social Philos Policy 30(1\u20132):280\u2013296. https:\/\/doi.org\/10.1017\/S0265052513000137","DOI":"10.1017\/S0265052513000137"},{"key":"2427_CR25","volume-title":"Automating inequality: how high-tech tools profile, police, and punish the poor","author":"V Eubanks","year":"2018","unstructured":"Eubanks V (2018) Automating inequality: how high-tech tools profile, police, and punish the poor. St. 
Martin\u2019s Press"},{"issue":"3","key":"2427_CR26","doi-asserted-by":"publisher","first-page":"411","DOI":"10.1007\/s11023-020-09539-2","volume":"30","author":"I Gabriel","year":"2020","unstructured":"Gabriel I (2020) Artificial intelligence, values, and alignment. Mind Mach 30(3):411\u2013437. https:\/\/doi.org\/10.1007\/s11023-020-09539-2","journal-title":"Mind Mach"},{"key":"2427_CR27","doi-asserted-by":"publisher","unstructured":"Gabriel I, Ghazavi V (2022) The challenge of value alignment: from fairer algorithms to AI safety. In: V\u00e9liz C (ed) Oxford handbook of digital ethics. Oxford University Press, pp 336\u2013355. https:\/\/doi.org\/10.1093\/oxfordhb\/9780198857815.013.18. Accessed 19 May 2025","DOI":"10.1093\/oxfordhb\/9780198857815.013.18"},{"key":"2427_CR28","unstructured":"Ganguli D, Lovitt L, Kernion J, Askell A, Bai Y, Kadavath S, Mann B, Perez E, Schiefer N, Ndousse K, Jones A, Bowman S, Chen A, Conerly T, DasSarma N, Drain D, Elhage N, El-Showk S, Fort S, Dodds Z, Henighan TJ, Hernandez D, Hume T, Jacobson J, Johnston S, Kravec S, Olsson C, Ringer S, Tran-Johnson E, Amodei D, Brown TB, Joseph N, McCandlish S, Olah C, Kaplan J, Clark J (2022) Red teaming language models to reduce harms: methods, scaling behaviors, and lessons learned. https:\/\/arxiv.org\/abs\/2209.07858"},{"key":"2427_CR29","volume-title":"The order of public reason","author":"GF Gaus","year":"2011","unstructured":"Gaus GF (2011) The order of public reason. Cambridge University Press"},{"key":"2427_CR30","first-page":"71","volume-title":"Oxford studies in political philosophy","author":"A Greene","year":"2016","unstructured":"Greene A (2016) Consent and political legitimacy. In: Sobel D, Vallentyne P, Wall S (eds) Oxford studies in political philosophy. Oxford University Press, pp 71\u201397"},{"key":"2427_CR31","doi-asserted-by":"publisher","unstructured":"Gould CC (2019) How democracy can inform consent: cases of the internet and bioethics. 
J Appl Philos 36(2):173\u2013191. https:\/\/doi.org\/10.1111\/japp.12360","DOI":"10.1111\/japp.12360"},{"key":"2427_CR32","doi-asserted-by":"publisher","unstructured":"Guan MY, Joglekar M, Wallace E, Jain S, Barak B, Helyar A, Dias R, Vallone A, Ren H, Wei J, Chung HW, Toyer S, Heidecke J, Beutel A, Glaese A (2025) Deliberative alignment: reasoning enables safer language models. https:\/\/doi.org\/10.48550\/arXiv.2412.16339","DOI":"10.48550\/arXiv.2412.16339"},{"key":"2427_CR33","doi-asserted-by":"crossref","unstructured":"Habermas J (1996) Between facts and norms. (W. Rehg, Trans.). MIT Press","DOI":"10.7551\/mitpress\/1564.001.0001"},{"issue":"1","key":"2427_CR34","doi-asserted-by":"publisher","first-page":"315","DOI":"10.1111\/1467-9264.00062","volume":"99","author":"G Harman","year":"1999","unstructured":"Harman G (1999) Moral philosophy meets social psychology: virtue ethics and the fundamental attribution error. Proc Aristot Soc 99(1):315\u2013331. https:\/\/doi.org\/10.1111\/1467-9264.00062","journal-title":"Proc Aristot Soc"},{"issue":"1","key":"2427_CR35","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1017\/epi.2013.46","volume":"11","author":"A Hazlett","year":"2014","unstructured":"Hazlett A (2014) Entitlement and mutually recognized reasonable disagreement. Episteme 11(1):1\u201325. https:\/\/doi.org\/10.1017\/epi.2013.46","journal-title":"Episteme"},{"key":"2427_CR36","doi-asserted-by":"publisher","unstructured":"Hendrycks D, Burns C, Basart S, Critch A, Li JZ, Song DX, Steinhardt J (2020) Aligning AI with shared human values. https:\/\/doi.org\/10.48550\/arXiv.2008.02275","DOI":"10.48550\/arXiv.2008.02275"},{"key":"2427_CR37","unstructured":"Hendrycks D, Mazeika M, Woodside T (2023) An overview of catastrophic AI risks. 
https:\/\/arxiv.org\/abs\/2306.12001"},{"issue":"4","key":"2427_CR38","doi-asserted-by":"publisher","first-page":"1333","DOI":"10.1007\/s00146-021-01357-z","volume":"38","author":"J Himmelreich","year":"2022","unstructured":"Himmelreich J (2022) Against \u201cdemocratizing AI.\u201d AI & Soc 38(4):1333\u20131346. https:\/\/doi.org\/10.1007\/s00146-021-01357-z","journal-title":"AI & Soc"},{"key":"2427_CR39","unstructured":"Jiang L, Hwang JD, Bhagavatula C, Le Bras R, Liang JT, Dodge J, Sakaguchi K, Forbes M, Borchardt J, Gabriel S, Tsvetkov Y, Etzioni O, Sap M, Rini RA, Choi Y (2022) Can machines learn morality? The Delphi experiment. https:\/\/arxiv.org\/abs\/2110.07574"},{"key":"2427_CR40","doi-asserted-by":"publisher","unstructured":"Kilov D (2023) Brittle virtue or bust: a new challenge to virtue-as-skill theories. Synthese 202. https:\/\/doi.org\/10.1007\/s11229-023-04306-z","DOI":"10.1007\/s11229-023-04306-z"},{"key":"2427_CR41","unstructured":"Lambert N, Castricato L, von Werra L, Havrilla A (2022) Illustrating reinforcement learning from human feedback (RLHF). Hugging Face Blog. https:\/\/huggingface.co\/blog\/rlhf"},{"key":"2427_CR42","doi-asserted-by":"publisher","first-page":"338","DOI":"10.1177\/0090591787015003005","volume":"15","author":"B Manin","year":"1987","unstructured":"Manin B (1987) On legitimacy and political deliberation. Political Theory 15:338\u2013368. https:\/\/doi.org\/10.1177\/0090591787015003005","journal-title":"Political Theory"},{"key":"2427_CR43","doi-asserted-by":"publisher","unstructured":"McGrath S (2008) Moral disagreement and moral expertise. In: Shafer-Landau R (ed) Oxford studies in metaethics, vol 3. Oxford University Press, pp 87\u2013108. https:\/\/doi.org\/10.1093\/oso\/9780199542062.003.0005","DOI":"10.1093\/oso\/9780199542062.003.0005"},{"key":"2427_CR44","unstructured":"McPherson R, Shokri R, Shmatikov V (2016) Defeating image obfuscation with deep learning. 
https:\/\/arxiv.org\/abs\/1609.00408"},{"issue":"4","key":"2427_CR45","doi-asserted-by":"publisher","first-page":"365","DOI":"10.1023\/A:1026136703565","volume":"7","author":"C Miller","year":"2003","unstructured":"Miller C (2003) Social psychology and virtue ethics. J Ethics 7(4):365\u2013392. https:\/\/doi.org\/10.1023\/A:1026136703565","journal-title":"J Ethics"},{"issue":"2","key":"2427_CR46","doi-asserted-by":"publisher","first-page":"282","DOI":"10.1111\/japp.12553","volume":"39","author":"Y Niv","year":"2022","unstructured":"Niv Y (2022) Beyond all-or-nothing approaches to moral expertise. J Appl Philos 39(2):282\u2013296. https:\/\/doi.org\/10.1111\/japp.12553","journal-title":"J Appl Philos"},{"key":"2427_CR47","volume-title":"Anarchy, state, and Utopia","author":"R Nozick","year":"1974","unstructured":"Nozick R (1974) Anarchy, state, and Utopia. Blackwell"},{"key":"2427_CR48","doi-asserted-by":"publisher","DOI":"10.1017\/CBO9780511720444","volume-title":"Participation and democratic theory","author":"C Pateman","year":"1970","unstructured":"Pateman C (1970) Participation and democratic theory. Cambridge University Press"},{"key":"2427_CR49","doi-asserted-by":"publisher","DOI":"10.1017\/CBO9781139017428","volume-title":"On the people\u2019s terms","author":"P Pettit","year":"2012","unstructured":"Pettit P (2012) On the people\u2019s terms. Cambridge University Press"},{"key":"2427_CR50","doi-asserted-by":"publisher","first-page":"5","DOI":"10.1007\/s10676-017-9430-8","volume":"20","author":"I Rahwan","year":"2018","unstructured":"Rahwan I (2018) Society-in-the-loop: programming the algorithmic social contract. Ethics Inf Technol 20:5\u201314. https:\/\/doi.org\/10.1007\/s10676-017-9430-8","journal-title":"Ethics Inf Technol"},{"key":"2427_CR51","doi-asserted-by":"publisher","DOI":"10.2307\/j.ctv31xf5v0","volume-title":"Justice as fairness: a restatement","author":"J Rawls","year":"2001","unstructured":"Rawls J (2001) Justice as fairness: a restatement. 
Harvard University Press"},{"issue":"3","key":"2427_CR52","doi-asserted-by":"publisher","first-page":"713","DOI":"10.1093\/pq\/pqab047","volume":"72","author":"J Shepherd","year":"2022","unstructured":"Shepherd J (2022) Practical structure and moral skill. Philos Q 72(3):713\u2013732. https:\/\/doi.org\/10.1093\/pq\/pqab047","journal-title":"Philos Q"},{"key":"2427_CR53","doi-asserted-by":"crossref","unstructured":"Simmons JA (2001) Justification and legitimacy: essays on rights and obligations. Cambridge University Press","DOI":"10.1017\/CBO9780511625152"},{"issue":"4","key":"2427_CR54","doi-asserted-by":"publisher","first-page":"115","DOI":"10.2307\/3327906","volume":"32","author":"P Singer","year":"1972","unstructured":"Singer P (1972) Moral experts. Analysis 32(4):115\u2013117. https:\/\/doi.org\/10.2307\/3327906","journal-title":"Analysis"},{"key":"2427_CR55","doi-asserted-by":"publisher","unstructured":"Sinnott-Armstrong W, Skorburg JA (2021) How AI can aid bioethics. J Practical Ethics 9(1). https:\/\/doi.org\/10.3998\/jpe.1175","DOI":"10.3998\/jpe.1175"},{"key":"2427_CR56","unstructured":"Susskind J (2018) Future politics: living together in a world transformed by tech. Oxford University Press"},{"key":"2427_CR57","unstructured":"Vallier K (2011) Convergence and consensus in public reason. Public Affairs Q 25(4):261\u2013279. https:\/\/www.jstor.org\/stable\/23057084"},{"issue":"2","key":"2427_CR58","doi-asserted-by":"publisher","first-page":"211","DOI":"10.1111\/jopp.12227","volume":"29","author":"K Vallier","year":"2021","unstructured":"Vallier K, Muldoon R (2021) In public reason, diversity trumps coherence. J Polit Philos 29(2):211\u2013230. 
https:\/\/doi.org\/10.1111\/jopp.12227","journal-title":"J Polit Philos"},{"issue":"1","key":"2427_CR59","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1111\/j.0029-4624.2005.00492.x","volume":"39","author":"PB Vranas","year":"2005","unstructured":"Vranas PB (2005) The indeterminacy paradox: character evaluations and human psychology. Nous 39(1):1\u201342. https:\/\/doi.org\/10.1111\/j.0029-4624.2005.00492.x","journal-title":"Nous"},{"key":"2427_CR60","doi-asserted-by":"crossref","unstructured":"Wellman CH (1996) Liberalism, samaritanism, and political legitimacy. Philos Publ Affairs 25(3):211\u2013237. http:\/\/www.jstor.org\/stable\/2961925","DOI":"10.1111\/j.1088-4963.1996.tb00040.x"},{"key":"2427_CR61","doi-asserted-by":"publisher","first-page":"225","DOI":"10.1007\/s13347-019-00355-w","volume":"33","author":"P-H Wong","year":"2020","unstructured":"Wong P-H (2020) Democratizing algorithmic fairness. Philos Technol 33:225\u2013244. https:\/\/doi.org\/10.1007\/s13347-019-00355-w","journal-title":"Philos Technol"},{"key":"2427_CR62","unstructured":"Zheng C, Sun K, Wu H, Xi C, Zhou X (2024) Balancing enhancement, harmlessness, and general capabilities: enhancing conversational LLMs with direct RLHF. https:\/\/arxiv.org\/abs\/2403.02513"},{"key":"2427_CR63","unstructured":"Zimmermann A, Di Rosa E, Kim H (2020) Technology can\u2019t fix algorithmic injustice. Boston Rev. 
https:\/\/www.bostonreview.net\/articles\/annette-zimmermann-algorithmic-political\/"}],"container-title":["AI & SOCIETY"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s00146-025-02427-2.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s00146-025-02427-2\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s00146-025-02427-2.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,11,17]],"date-time":"2025-11-17T06:16:57Z","timestamp":1763360217000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s00146-025-02427-2"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,6,21]]},"references-count":63,"journal-issue":{"issue":"8","published-print":{"date-parts":[[2025,12]]}},"alternative-id":["2427"],"URL":"https:\/\/doi.org\/10.1007\/s00146-025-02427-2","relation":{},"ISSN":["0951-5666","1435-5655"],"issn-type":[{"value":"0951-5666","type":"print"},{"value":"1435-5655","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,6,21]]},"assertion":[{"value":"15 January 2025","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"6 June 2025","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"21 June 2025","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"The authors declare no competing interests.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Competing interests"}}]}}