{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,1]],"date-time":"2026-04-01T05:12:23Z","timestamp":1775020343674,"version":"3.50.1"},"reference-count":0,"publisher":"AI Access Foundation","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["jair"],"abstract":"<jats:p>Multi-Objective Reinforcement Learning (MORL) aims to learn a set of policies that optimize trade-offs between multiple, often conflicting objectives. MORL is computationally more complex than single-objective RL, particularly as the number of objectives increases. Additionally, when objectives involve the preferences of agents or groups, incorporating fairness becomes both important and socially desirable. This paper introduces a principled algorithm that incorporates fairness into MORL while improving scalability to many-objective problems. We propose using Lorenz dominance to identify policies with equitable reward distributions and introduce \u03bb-Lorenz dominance to enable flexible fairness preferences. We release a new, large-scale real-world transport planning environment and demonstrate that our method encourages the discovery of fair policies, showing improved scalability in two large cities (Xi\u2019an and Amsterdam). Our methods outperform common multi-objective approaches, particularly in high-dimensional objective spaces.<\/jats:p>","DOI":"10.1613\/jair.1.19862","type":"journal-article","created":{"date-parts":[[2026,3,27]],"date-time":"2026-03-27T03:13:59Z","timestamp":1774581239000},"source":"Crossref","is-referenced-by-count":0,"title":["Scalable Multi-Objective Reinforcement Learning with Fairness Guarantees using Lorenz Dominance"],"prefix":"10.1613","volume":"85","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-0106-1126","authenticated-orcid":false,"given":"Dimitris","family":"Michailidis","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-5045-6127","authenticated-orcid":false,"given":"Willem","family":"R\u00f6pke","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-2825-2491","authenticated-orcid":false,"given":"Diederik M.","family":"Roijers","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0009-0007-5788-4635","authenticated-orcid":false,"given":"Sennay","family":"Ghebreab","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-2310-6444","authenticated-orcid":false,"given":"Fernando P.","family":"Santos","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"16860","published-online":{"date-parts":[[2026,3,25]]},"container-title":["Journal of Artificial Intelligence Research"],"original-title":[],"link":[{"URL":"https:\/\/www.jair.org\/index.php\/jair\/article\/download\/19862\/27286","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/www.jair.org\/index.php\/jair\/article\/download\/19862\/27286","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,4,1]],"date-time":"2026-04-01T03:15:01Z","timestamp":1775013301000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.jair.org\/index.php\/jair\/article\/view\/19862"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2026,3,25]]},"references-count":0,"URL":"https:\/\/doi.org\/10.1613\/jair.1.19862","relation":{},"ISSN":["1076-9757"],"issn-type":[{"value":"1076-9757","type":"electronic"}],"subject":[],"published":{"date-parts":[[2026,3,25]]}}}