{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,6]],"date-time":"2025-10-06T19:28:46Z","timestamp":1759778926703,"version":"3.41.0"},"publisher-location":"New York, NY, USA","reference-count":34,"publisher":"ACM","license":[{"start":{"date-parts":[[2022,10,17]],"date-time":"2022-10-17T00:00:00Z","timestamp":1665964800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"name":"National Research Foundation of Korea (NRF) Grant funded by the Ministry of Science and ICT","award":["2021R1A2C2094350"],"award-info":[{"award-number":["2021R1A2C2094350"]}]},{"name":"Technology Innovation Program funded by the Ministry of Trade, Industry and Energy (MOTIE, Korea)","award":["20018112"],"award-info":[{"award-number":["20018112"]}]},{"name":"Institute of Information & communications Technology Planning and Evaluation (IITP) grant funded by the Korea government (MSIT)","award":["2020-0-01373"],"award-info":[{"award-number":["2020-0-01373"]}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2022,10,17]]},"DOI":"10.1145\/3511808.3557389","type":"proceedings-article","created":{"date-parts":[[2022,10,16]],"date-time":"2022-10-16T01:29:57Z","timestamp":1665883797000},"page":"1064-1073","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":0,"title":["Maximum Norm Minimization"],"prefix":"10.1145","author":[{"given":"Seonjae","family":"Lee","sequence":"first","affiliation":[{"name":"Hanyang University, Seoul, Republic of Korea"}]},{"given":"Myoung Hoon","family":"Lee","sequence":"additional","affiliation":[{"name":"Hanyang University, Seoul, Republic of Korea"}]},{"given":"Jun","family":"Moon","sequence":"additional","affiliation":[{"name":"Hanyang University, Seoul, Republic of 
Korea"}]}],"member":"320","published-online":{"date-parts":[[2022,10,17]]},"reference":[{"key":"e_1_3_2_2_1_1","volume-title":"MOSIM'08: 7th Conference Internationale de Modelisation et Simulation. 698--707","author":"Aissani N","year":"2008","unstructured":"N Aissani , B Beldjilali , and D Trentesaux . 2008 . Efficient and effective reactive scheduling of manufacturing system using Sarsa-multi-objective agents . In MOSIM'08: 7th Conference Internationale de Modelisation et Simulation. 698--707 . N Aissani, B Beldjilali, and D Trentesaux. 2008. Efficient and effective reactive scheduling of manufacturing system using Sarsa-multi-objective agents. In MOSIM'08: 7th Conference Internationale de Modelisation et Simulation. 698--707."},{"volume-title":"The 2012 international joint conference on neural networks (IJCNN)","author":"Castelletti Andrea","key":"e_1_3_2_2_2_1","unstructured":"Andrea Castelletti , Francesca Pianosi , and Marcello Restelli . 2012. Tree-based fitted Q-iteration for multi-objective Markov decision problems . In The 2012 international joint conference on neural networks (IJCNN) . IEEE , 1--8. Andrea Castelletti, Francesca Pianosi, and Marcello Restelli. 2012. Tree-based fitted Q-iteration for multi-objective Markov decision problems. In The 2012 international joint conference on neural networks (IJCNN). IEEE, 1--8."},{"key":"e_1_3_2_2_3_1","doi-asserted-by":"publisher","DOI":"10.1049\/iet-its.2019.0273"},{"key":"e_1_3_2_2_4_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.artmed.2020.101964"},{"key":"e_1_3_2_2_5_1","volume-title":"International conference on machine learning. PMLR, 2829--2838","author":"Gu Shixiang","year":"2016","unstructured":"Shixiang Gu , Timothy Lillicrap , Ilya Sutskever , and Sergey Levine . 2016 . Continuous deep q-learning with model-based acceleration . In International conference on machine learning. PMLR, 2829--2838 . Shixiang Gu, Timothy Lillicrap, Ilya Sutskever, and Sergey Levine. 2016. 
Continuous deep q-learning with model-based acceleration. In International conference on machine learning. PMLR, 2829--2838."},{"key":"e_1_3_2_2_6_1","volume-title":"International conference on machine learning. PMLR","author":"Haarnoja Tuomas","year":"2018","unstructured":"Tuomas Haarnoja , Aurick Zhou , Pieter Abbeel , and Sergey Levine . 2018 . Soft actor-critic: Off-policy maximum entropy deep reinforcement learning with a stochastic actor . In International conference on machine learning. PMLR , 1861--1870. Tuomas Haarnoja, Aurick Zhou, Pieter Abbeel, and Sergey Levine. 2018. Soft actor-critic: Off-policy maximum entropy deep reinforcement learning with a stochastic actor. In International conference on machine learning. PMLR, 1861--1870."},{"key":"e_1_3_2_2_7_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICRA.2018.8461233"},{"key":"e_1_3_2_2_8_1","volume-title":"Charles Beattie, Neil C Rabinowitz, Ari S Morcos, Avraham Ruderman, et al.","author":"Jaderberg Max","year":"2019","unstructured":"Max Jaderberg , Wojciech M Czarnecki , Iain Dunning , Luke Marris , Guy Lever , Antonio Garcia Castaneda , Charles Beattie, Neil C Rabinowitz, Ari S Morcos, Avraham Ruderman, et al. 2019 . Human-level performance in 3D multiplayer games with population-based reinforcement learning. Science , Vol. 364 , 6443 (2019), 859--865. Max Jaderberg, Wojciech M Czarnecki, Iain Dunning, Luke Marris, Guy Lever, Antonio Garcia Castaneda, Charles Beattie, Neil C Rabinowitz, Ari S Morcos, Avraham Ruderman, et al. 2019. Human-level performance in 3D multiplayer games with population-based reinforcement learning. Science, Vol. 364, 6443 (2019), 859--865."},{"key":"e_1_3_2_2_9_1","volume-title":"International conference on machine learning. PMLR, 2284--2293","author":"Jiang Daniel","year":"2018","unstructured":"Daniel Jiang , Emmanuel Ekwedike , and Han Liu . 2018 . Feedback-based tree search for reinforcement learning . In International conference on machine learning. PMLR, 2284--2293 . 
Daniel Jiang, Emmanuel Ekwedike, and Han Liu. 2018. Feedback-based tree search for reinforcement learning. In International conference on machine learning. PMLR, 2284--2293."},{"key":"e_1_3_2_2_10_1","volume-title":"Continuous control with deep reinforcement learning. arXiv preprint arXiv:1509.02971","author":"Lillicrap Timothy P","year":"2015","unstructured":"Timothy P Lillicrap , Jonathan J Hunt , Alexander Pritzel , Nicolas Heess , Tom Erez , Yuval Tassa , David Silver , and Daan Wierstra . 2015. Continuous control with deep reinforcement learning. arXiv preprint arXiv:1509.02971 ( 2015 ). Timothy P Lillicrap, Jonathan J Hunt, Alexander Pritzel, Nicolas Heess, Tom Erez, Yuval Tassa, David Silver, and Daan Wierstra. 2015. Continuous control with deep reinforcement learning. arXiv preprint arXiv:1509.02971 (2015)."},{"key":"e_1_3_2_2_11_1","doi-asserted-by":"crossref","first-page":"385","DOI":"10.1109\/TSMC.2014.2358639","article-title":"Multiobjective reinforcement learning: A comprehensive overview","volume":"45","author":"Liu Chunming","year":"2014","unstructured":"Chunming Liu , Xin Xu , and Dewen Hu . 2014 . Multiobjective reinforcement learning: A comprehensive overview . IEEE Transactions on Systems, Man, and Cybernetics: Systems , Vol. 45 , 3 (2014), 385 -- 398 . Chunming Liu, Xin Xu, and Dewen Hu. 2014. Multiobjective reinforcement learning: A comprehensive overview. IEEE Transactions on Systems, Man, and Cybernetics: Systems, Vol. 45, 3 (2014), 385--398.","journal-title":"IEEE Transactions on Systems, Man, and Cybernetics: Systems"},{"key":"e_1_3_2_2_12_1","doi-asserted-by":"publisher","DOI":"10.1049\/iet-its.2019.0249"},{"key":"e_1_3_2_2_13_1","volume-title":"Nature","volume":"518","author":"Mnih Volodymyr","year":"2015","unstructured":"Volodymyr Mnih , Koray Kavukcuoglu , David Silver , Andrei A Rusu , Joel Veness , Marc G Bellemare , Alex Graves , Martin Riedmiller , Andreas K Fidjeland , Georg Ostrovski , 2015 . 
Human-level control through deep reinforcement learning . Nature , Vol. 518 , 7540 (2015), 529--533. Volodymyr Mnih, Koray Kavukcuoglu, David Silver, Andrei A Rusu, Joel Veness, Marc G Bellemare, Alex Graves, Martin Riedmiller, Andreas K Fidjeland, Georg Ostrovski, et al. 2015. Human-level control through deep reinforcement learning. Nature, Vol. 518, 7540 (2015), 529--533."},{"key":"e_1_3_2_2_14_1","volume-title":"Deep reinforcement learning for cyber security","author":"Nguyen Thanh Thi","year":"2019","unstructured":"Thanh Thi Nguyen and Vijay Janapa Reddi . 2019. Deep reinforcement learning for cyber security . IEEE Transactions on Neural Networks and Learning Systems ( 2019 ). Thanh Thi Nguyen and Vijay Janapa Reddi. 2019. Deep reinforcement learning for cyber security. IEEE Transactions on Neural Networks and Learning Systems (2019)."},{"key":"e_1_3_2_2_15_1","doi-asserted-by":"publisher","DOI":"10.1109\/IJCNN.2014.6889738"},{"key":"e_1_3_2_2_16_1","volume-title":"Proceedings of the adaptive and learning agents workshop (ALA-19)","author":"Reymond Mathieu","year":"2019","unstructured":"Mathieu Reymond and Ann Now\u00e9 . 2019 . Pareto-DQN: Approximating the Pareto front in complex multi-objective decision problems . In Proceedings of the adaptive and learning agents workshop (ALA-19) at AAMAS. Mathieu Reymond and Ann Now\u00e9. 2019. Pareto-DQN: Approximating the Pareto front in complex multi-objective decision problems. In Proceedings of the adaptive and learning agents workshop (ALA-19) at AAMAS."},{"key":"e_1_3_2_2_17_1","volume-title":"Proceedings of the Adaptive and Learning Agents workshop at FAIM","volume":"2018","author":"Roijers Diederik M","year":"2018","unstructured":"Diederik M Roijers , Denis Steckelmacher , and Ann Now\u00e9 . 2018 . Multi-objective reinforcement learning for the expected utility of the return . In Proceedings of the Adaptive and Learning Agents workshop at FAIM , Vol. 2018 . 
Diederik M Roijers, Denis Steckelmacher, and Ann Now\u00e9. 2018. Multi-objective reinforcement learning for the expected utility of the return. In Proceedings of the Adaptive and Learning Agents workshop at FAIM, Vol. 2018."},{"key":"e_1_3_2_2_18_1","volume-title":"Proximal policy optimization algorithms. arXiv preprint arXiv:1707.06347","author":"Schulman John","year":"2017","unstructured":"John Schulman , Filip Wolski , Prafulla Dhariwal , Alec Radford , and Oleg Klimov . 2017. Proximal policy optimization algorithms. arXiv preprint arXiv:1707.06347 ( 2017 ). John Schulman, Filip Wolski, Prafulla Dhariwal, Alec Radford, and Oleg Klimov. 2017. Proximal policy optimization algorithms. arXiv preprint arXiv:1707.06347 (2017)."},{"key":"e_1_3_2_2_19_1","volume-title":"Learning to repeat: Fine grained action repetition for deep reinforcement learning. arXiv preprint arXiv:1702.06054","author":"Sharma Sahil","year":"2017","unstructured":"Sahil Sharma , Aravind Srinivas , and Balaraman Ravindran . 2017. Learning to repeat: Fine grained action repetition for deep reinforcement learning. arXiv preprint arXiv:1702.06054 ( 2017 ). Sahil Sharma, Aravind Srinivas, and Balaraman Ravindran. 2017. Learning to repeat: Fine grained action repetition for deep reinforcement learning. arXiv preprint arXiv:1702.06054 (2017)."},{"key":"e_1_3_2_2_20_1","volume-title":"International Conference on Machine Learning. PMLR, 8905--8915","author":"Siddique Umer","year":"2020","unstructured":"Umer Siddique , Paul Weng , and Matthieu Zimmer . 2020 . Learning fair policies in multiobjective (deep) reinforcement learning with average and discounted rewards . In International Conference on Machine Learning. PMLR, 8905--8915 . Umer Siddique, Paul Weng, and Matthieu Zimmer. 2020. Learning fair policies in multiobjective (deep) reinforcement learning with average and discounted rewards. In International Conference on Machine Learning. 
PMLR, 8905--8915."},{"key":"e_1_3_2_2_21_1","doi-asserted-by":"publisher","DOI":"10.1002\/aic.16689"},{"key":"e_1_3_2_2_22_1","doi-asserted-by":"publisher","DOI":"10.1109\/IROS.2012.6386109"},{"key":"e_1_3_2_2_23_1","volume-title":"Randall K Ten Haken, and Issam El Naqa","author":"Tseng Huan-Hsin","year":"2017","unstructured":"Huan-Hsin Tseng , Yi Luo , Sunan Cui , Jen-Tzung Chien , Randall K Ten Haken, and Issam El Naqa . 2017 . Deep reinforcement learning for automated radiation adaptation in lung cancer. Medical physics, Vol. 44 , 12 (2017), 6690--6705. Huan-Hsin Tseng, Yi Luo, Sunan Cui, Jen-Tzung Chien, Randall K Ten Haken, and Issam El Naqa. 2017. Deep reinforcement learning for automated radiation adaptation in lung cancer. Medical physics, Vol. 44, 12 (2017), 6690--6705."},{"key":"e_1_3_2_2_24_1","doi-asserted-by":"publisher","DOI":"10.1007\/s00521-021-05859-1"},{"key":"e_1_3_2_2_25_1","doi-asserted-by":"publisher","DOI":"10.1109\/ADPRL.2013.6615007"},{"key":"e_1_3_2_2_26_1","doi-asserted-by":"publisher","DOI":"10.1145\/3357384.3357965"},{"volume-title":"2014 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL). IEEE, 1--6.","author":"Wiering Marco A","key":"e_1_3_2_2_27_1","unstructured":"Marco A Wiering , Maikel Withagen , and M\u0103d\u0103lina M Drugan. 2014. Model-based multi-objective reinforcement learning . In 2014 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL). IEEE, 1--6. Marco A Wiering, Maikel Withagen, and M\u0103d\u0103lina M Drugan. 2014. Model-based multi-objective reinforcement learning. In 2014 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL). IEEE, 1--6."},{"key":"e_1_3_2_2_28_1","volume-title":"5th International Conference on Learning Representations (ICLR).","author":"Wu Yuxin","year":"2017","unstructured":"Yuxin Wu and Yuandong Tian . 2017 . Training agent for first-person shooter game with actor-critic curriculum learning . 
In 5th International Conference on Learning Representations (ICLR). Yuxin Wu and Yuandong Tian. 2017. Training agent for first-person shooter game with actor-critic curriculum learning. In 5th International Conference on Learning Representations (ICLR)."},{"key":"e_1_3_2_2_29_1","volume-title":"International Conference on Machine Learning. PMLR, 10607--10616","author":"Xu Jie","year":"2020","unstructured":"Jie Xu , Yunsheng Tian , Pingchuan Ma , Daniela Rus , Shinjiro Sueda , and Wojciech Matusik . 2020 . Prediction-guided multi-objective reinforcement learning for continuous robot control . In International Conference on Machine Learning. PMLR, 10607--10616 . Jie Xu, Yunsheng Tian, Pingchuan Ma, Daniela Rus, Shinjiro Sueda, and Wojciech Matusik. 2020. Prediction-guided multi-objective reinforcement learning for continuous robot control. In International Conference on Machine Learning. PMLR, 10607--10616."},{"key":"e_1_3_2_2_30_1","doi-asserted-by":"publisher","DOI":"10.1002\/cav.1978"},{"key":"e_1_3_2_2_31_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.robot.2019.01.003"},{"key":"e_1_3_2_2_32_1","doi-asserted-by":"publisher","DOI":"10.1109\/TEVC.2007.892759"},{"key":"e_1_3_2_2_33_1","first-page":"213","article-title":"Deep reinforcement learning for power system applications: An overview","volume":"6","author":"Zhang Zidong","year":"2019","unstructured":"Zidong Zhang , Dongxia Zhang , and Robert C Qiu . 2019 . Deep reinforcement learning for power system applications: An overview . CSEE Journal of Power and Energy Systems , Vol. 6 , 1 (2019), 213 -- 225 . Zidong Zhang, Dongxia Zhang, and Robert C Qiu. 2019. Deep reinforcement learning for power system applications: An overview. CSEE Journal of Power and Energy Systems, Vol. 6, 1 (2019), 213--225.","journal-title":"CSEE Journal of Power and Energy Systems"},{"key":"e_1_3_2_2_34_1","volume-title":"Provable multi-objective reinforcement learning with generative models. 
arXiv preprint arXiv:2011.10134","author":"Zhou Dongruo","year":"2020","unstructured":"Dongruo Zhou , Jiahao Chen , and Quanquan Gu. 2020. Provable multi-objective reinforcement learning with generative models. arXiv preprint arXiv:2011.10134 ( 2020 ). Dongruo Zhou, Jiahao Chen, and Quanquan Gu. 2020. Provable multi-objective reinforcement learning with generative models. arXiv preprint arXiv:2011.10134 (2020)."}],"event":{"name":"CIKM '22: The 31st ACM International Conference on Information and Knowledge Management","sponsor":["SIGWEB ACM Special Interest Group on Hypertext, Hypermedia, and Web","SIGIR ACM Special Interest Group on Information Retrieval"],"location":"Atlanta GA USA","acronym":"CIKM '22"},"container-title":["Proceedings of the 31st ACM International Conference on Information & Knowledge Management"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3511808.3557389","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3511808.3557389","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T17:48:54Z","timestamp":1750182534000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3511808.3557389"}},"subtitle":["A Single-Policy Multi-Objective Reinforcement Learning to Expansion of the Pareto Front"],"short-title":[],"issued":{"date-parts":[[2022,10,17]]},"references-count":34,"alternative-id":["10.1145\/3511808.3557389","10.1145\/3511808"],"URL":"https:\/\/doi.org\/10.1145\/3511808.3557389","relation":{},"subject":[],"published":{"date-parts":[[2022,10,17]]},"assertion":[{"value":"2022-10-17","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}