{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,7,3]],"date-time":"2026-07-03T22:08:22Z","timestamp":1783116502856,"version":"3.54.6"},"publisher-location":"New York, NY, USA","reference-count":50,"publisher":"ACM","license":[{"start":{"date-parts":[[2022,10,17]],"date-time":"2022-10-17T00:00:00Z","timestamp":1665964800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"name":"XJTLU Research Development Fund","award":["RDF-21-01-053 TDF21\/22-R23-160"],"award-info":[{"award-number":["RDF-21-01-053 TDF21\/22-R23-160"]}]},{"name":"AI Empowerment Tech. Inc. Research Fund","award":["RDS10120220021"],"award-info":[{"award-number":["RDS10120220021"]}]},{"name":"Italian Ministry of Education University and Research","award":["Dipartimenti di eccellenza 2018-2022"],"award-info":[{"award-number":["Dipartimenti di eccellenza 2018-2022"]}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2022,10,17]]},"DOI":"10.1145\/3511808.3557429","type":"proceedings-article","created":{"date-parts":[[2022,10,16]],"date-time":"2022-10-16T01:22:22Z","timestamp":1665883342000},"page":"252-261","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":22,"title":["R\n            <scp>e<\/scp>\n            LAX: Reinforcement Learning Agent Explainer for Arbitrary Predictive Models"],"prefix":"10.1145","author":[{"given":"Ziheng","family":"Chen","sequence":"first","affiliation":[{"name":"Stony Brook University, Stony Brook, NY, USA"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Fabrizio","family":"Silvestri","sequence":"additional","affiliation":[{"name":"Sapienza University of Rome, Rome, 
Italy"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Jia","family":"Wang","sequence":"additional","affiliation":[{"name":"The Xi'an Jiaotong-Liverpool University, Su Zhou, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"He","family":"Zhu","sequence":"additional","affiliation":[{"name":"Rutgers University, New Brunswick, NJ, USA"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Hongshik","family":"Ahn","sequence":"additional","affiliation":[{"name":"Stony Brook University, Stony Brook, NY, USA"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Gabriele","family":"Tolomei","sequence":"additional","affiliation":[{"name":"Sapienza University of Rome, Rome, Italy"}],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"320","published-online":{"date-parts":[[2022,10,17]]},"reference":[{"key":"e_1_3_2_1_1_1","unstructured":"1978. Boston Housing Dataset. [Online]. Available from: https:\/\/www.kaggle.com\/vikrishnan\/boston-house-prices.  1978. Boston Housing Dataset. [Online]. Available from: https:\/\/www.kaggle.com\/vikrishnan\/boston-house-prices."},{"key":"e_1_3_2_1_2_1","unstructured":"1988. Diabetes Dataset. [Online]. Available from: https:\/\/www.kaggle.com\/uciml\/pima-indians-diabetes-database.  1988. Diabetes Dataset. [Online]. Available from: https:\/\/www.kaggle.com\/uciml\/pima-indians-diabetes-database."},{"key":"e_1_3_2_1_3_1","unstructured":"1988. Sonar Dataset. [Online]. Available from: https:\/\/archive.ics.uci.edu\/ml\/datasets\/connectionistbench(sonar minesvs.rocks).  1988. Sonar Dataset. [Online]. Available from: https:\/\/archive.ics.uci.edu\/ml\/datasets\/connectionistbench(sonar minesvs.rocks)."},{"key":"e_1_3_2_1_4_1","unstructured":"1988. Wave Dataset. [Online]. Available from: https:\/\/archive.ics.uci.edu\/ml\/datasets\/waveformdatabasegenerator(version1).  1988. Wave Dataset. [Online]. 
Available from: https:\/\/archive.ics.uci.edu\/ml\/datasets\/waveformdatabasegenerator(version1)."},{"key":"e_1_3_2_1_5_1","unstructured":"1995. Breast Cancer Dataset. [Online]. Available from: https:\/\/archive.ics.uci.edu\/ml\/datasets\/breastcancerwisconsin(diagnostic).  1995. Breast Cancer Dataset. [Online]. Available from: https:\/\/archive.ics.uci.edu\/ml\/datasets\/breastcancerwisconsin(diagnostic)."},{"key":"e_1_3_2_1_6_1","volume-title":"Faria","author":"Bird Jordan J.","year":"2020","unstructured":"Jordan J. Bird , Chloe M. Barnes , Cristiano Premebida , Anik\u00f3 Ek\u00e1rt , and Diego R . Faria . 2020 . Country-level Pandemic Risk and Preparedness Classification based on COVID-19 Data : A Machine Learning Approach. PLoS One 15, 10 (2020). Jordan J. Bird, Chloe M. Barnes, Cristiano Premebida, Anik\u00f3 Ek\u00e1rt, and Diego R. Faria. 2020. Country-level Pandemic Risk and Preparedness Classification based on COVID-19 Data: A Machine Learning Approach. PLoS One 15, 10 (2020)."},{"key":"e_1_3_2_1_7_1","volume-title":"Exploration by Random Network Distillation. arXiv preprint arXiv:1810.12894","author":"Burda Yuri","year":"2018","unstructured":"Yuri Burda , Harrison Edwards , Amos Storkey , and Oleg Klimov . 2018. Exploration by Random Network Distillation. arXiv preprint arXiv:1810.12894 ( 2018 ). Yuri Burda, Harrison Edwards, Amos Storkey, and Oleg Klimov. 2018. Exploration by Random Network Distillation. arXiv preprint arXiv:1810.12894 (2018)."},{"key":"e_1_3_2_1_8_1","unstructured":"Central Intelligence Agency. 2020. The CIA World Factbook 2020.  Central Intelligence Agency. 2020. The CIA World Factbook 2020."},{"key":"e_1_3_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-58112-1_31"},{"key":"e_1_3_2_1_10_1","volume-title":"Official Journal of the European Union L119","author":"European EU.","year":"2016","unstructured":"EU. 2016. Regulation (EU) 2016\/679 of the European Parliament (GDPR). 
Official Journal of the European Union L119 ( 2016 ), 1--88. EU. 2016. Regulation (EU) 2016\/679 of the European Parliament (GDPR). Official Journal of the European Union L119 (2016), 1--88."},{"key":"e_1_3_2_1_11_1","volume-title":"Theoretical and Empirical Analysis of Reward Shaping in Reinforcement Learning. In 2009 International Conference on Machine Learning and Applications. IEEE, 337--344","author":"Grzes Marek","year":"2009","unstructured":"Marek Grzes and Daniel Kudenko . 2009 . Theoretical and Empirical Analysis of Reward Shaping in Reinforcement Learning. In 2009 International Conference on Machine Learning and Applications. IEEE, 337--344 . Marek Grzes and Daniel Kudenko. 2009. Theoretical and Empirical Analysis of Reward Shaping in Reinforcement Learning. In 2009 International Conference on Machine Learning and Applications. IEEE, 337--344."},{"key":"e_1_3_2_1_12_1","volume-title":"Counterfactual Explanations and How to Find Them: Literature Review and Benchmarking. Data Mining and Knowledge Discovery","author":"Guidotti Riccardo","year":"2022","unstructured":"Riccardo Guidotti . 2022. Counterfactual Explanations and How to Find Them: Literature Review and Benchmarking. Data Mining and Knowledge Discovery ( 2022 ), 1--55. Riccardo Guidotti. 2022. Counterfactual Explanations and How to Find Them: Literature Review and Benchmarking. Data Mining and Knowledge Discovery (2022), 1--55."},{"key":"e_1_3_2_1_13_1","volume-title":"Local Rule-Based Explanations of Black Box Decision Systems. arXiv preprint arXiv:1805.10820","author":"Guidotti Riccardo","year":"2018","unstructured":"Riccardo Guidotti , Anna Monreale , Salvatore Ruggieri , Dino Pedreschi , Franco Turini , and Fosca Giannotti . 2018. Local Rule-Based Explanations of Black Box Decision Systems. arXiv preprint arXiv:1805.10820 ( 2018 ). Riccardo Guidotti, Anna Monreale, Salvatore Ruggieri, Dino Pedreschi, Franco Turini, and Fosca Giannotti. 2018. 
Local Rule-Based Explanations of Black Box Decision Systems. arXiv preprint arXiv:1805.10820 (2018)."},{"key":"e_1_3_2_1_14_1","volume-title":"Proc. of ALA Workshop.","author":"Hausknecht Matthew","year":"2016","unstructured":"Matthew Hausknecht , Prannoy Mupparaju , Sandeep Subramanian , Shivaram Kalyanakrishnan , and Peter Stone . 2016 . Half Field Offense: An Environment for Multiagent Learning and Ad Hoc Teamwork . In Proc. of ALA Workshop. Matthew Hausknecht, Prannoy Mupparaju, Sandeep Subramanian, Shivaram Kalyanakrishnan, and Peter Stone. 2016. Half Field Offense: An Environment for Multiagent Learning and Ad Hoc Teamwork. In Proc. of ALA Workshop."},{"key":"e_1_3_2_1_15_1","first-page":"15931","article-title":"Learning to Utilize Shaping Rewards: A New Approach of Reward Shaping","volume":"33","author":"Hu Yujing","year":"2020","unstructured":"Yujing Hu , Weixun Wang , Hangtian Jia , Yixiang Wang , Yingfeng Chen , Jianye Hao , Feng Wu , and Changjie Fan . 2020 . Learning to Utilize Shaping Rewards: A New Approach of Reward Shaping . Advances in Neural Information Processing Systems 33 (2020), 15931 -- 15941 . Yujing Hu, Weixun Wang, Hangtian Jia, Yixiang Wang, Yingfeng Chen, Jianye Hao, Feng Wu, and Changjie Fan. 2020. Learning to Utilize Shaping Rewards: A New Approach of Reward Shaping. Advances in Neural Information Processing Systems 33 (2020), 15931--15941.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.24963\/ijcai.2020\/395"},{"key":"e_1_3_2_1_17_1","volume-title":"Proc. of AISTATS, Silvia Chiappa and Roberto Calandra (Eds.)","volume":"108","author":"Karimi Amir-Hossein","year":"2020","unstructured":"Amir-Hossein Karimi , Gilles Barthe , Borja Balle , and Isabel Valera . 2020 . Model-Agnostic Counterfactual Explanations for Consequential Decisions . In Proc. of AISTATS, Silvia Chiappa and Roberto Calandra (Eds.) , Vol. 108 . PMLR, 895--905. 
http:\/\/proceedings.mlr.press\/v108\/karimi20a.html Amir-Hossein Karimi, Gilles Barthe, Borja Balle, and Isabel Valera. 2020. Model-Agnostic Counterfactual Explanations for Consequential Decisions. In Proc. of AISTATS, Silvia Chiappa and Roberto Calandra (Eds.), Vol. 108. PMLR, 895--905. http:\/\/proceedings.mlr.press\/v108\/karimi20a.html"},{"key":"e_1_3_2_1_18_1","volume-title":"Proc. of NeurIPS (NeurIPS'20)","author":"Karimi Amir-Hossein","year":"2020","unstructured":"Amir-Hossein Karimi , Bodo Julius von K\u00fcgelgen , Bernhard Sch\u00f6lkopf , and Isabel Valera . 2020 . Algorithmic Recourse Under Imperfect Causal Knowledge: A Probabilistic Approach . In Proc. of NeurIPS (NeurIPS'20) , Hugo Larochelle, Marc'Aurelio Ranzato, Raia Hadsell, Maria-Florina Balcan, and Hsuan-Tien Lin (Eds.). https:\/\/proceedings.neurips.cc\/paper\/ 2020\/hash\/02a3c7fb3f489288ae6942498498db20-Abstract.html Amir-Hossein Karimi, Bodo Julius von K\u00fcgelgen, Bernhard Sch\u00f6lkopf, and Isabel Valera. 2020. Algorithmic Recourse Under Imperfect Causal Knowledge: A Probabilistic Approach. In Proc. of NeurIPS (NeurIPS'20), Hugo Larochelle, Marc'Aurelio Ranzato, Raia Hadsell, Maria-Florina Balcan, and Hsuan-Tien Lin (Eds.). https:\/\/proceedings.neurips.cc\/paper\/2020\/hash\/02a3c7fb3f489288ae6942498498db20-Abstract.html"},{"key":"e_1_3_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1109\/IRC.2017.33"},{"key":"e_1_3_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.1145\/3394486.3403066"},{"key":"e_1_3_2_1_21_1","volume-title":"Proc. of ICLR (ICLR'16)","author":"Lillicrap Timothy P.","year":"2016","unstructured":"Timothy P. Lillicrap , Jonathan J. Hunt , Alexander Pritzel , Nicolas Heess , Tom Erez , Yuval Tassa , David Silver , and Daan Wierstra . 2016 . Continuous Control with Deep Reinforcement Learning . In Proc. of ICLR (ICLR'16) , Yoshua Bengio and Yann LeCun (Eds.). http:\/\/arxiv.org\/abs\/1509.02971 Timothy P. Lillicrap, Jonathan J. 
Hunt, Alexander Pritzel, Nicolas Heess, Tom Erez, Yuval Tassa, David Silver, and Daan Wierstra. 2016. Continuous Control with Deep Reinforcement Learning. In Proc. of ICLR (ICLR'16), Yoshua Bengio and Yann LeCun (Eds.). http:\/\/arxiv.org\/abs\/1509.02971"},{"key":"e_1_3_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v36i5.20468"},{"key":"e_1_3_2_1_23_1","volume-title":"Maarten De Rijke, and Fabrizio Silvestri","author":"Lucic Ana","year":"2022","unstructured":"Ana Lucic , Maartje A. Ter Hoeve , Gabriele Tolomei , Maarten De Rijke, and Fabrizio Silvestri . 2022 . CF-GNNExplainer: Counterfactual Explanations for Graph Neural Networks. In Proc. of AISTATS (AISTATS '22, Vol. 151), Gustau Camps-Valls, Francisco J. R. Ruiz, and Isabel Valera (Eds.). PMLR, 4499-- 4511 . https:\/\/proceedings.mlr.press\/v151\/lucic22a.html Ana Lucic, Maartje A. Ter Hoeve, Gabriele Tolomei, Maarten De Rijke, and Fabrizio Silvestri. 2022. CF-GNNExplainer: Counterfactual Explanations for Graph Neural Networks. In Proc. of AISTATS (AISTATS'22, Vol. 151), Gustau Camps-Valls, Francisco J. R. Ruiz, and Isabel Valera (Eds.). PMLR, 4499--4511. https:\/\/proceedings.mlr.press\/v151\/lucic22a.html"},{"key":"e_1_3_2_1_24_1","first-page":"I","article-title":"A Unified Approach to Interpreting Model Predictions","volume":"30","author":"Lundberg Scott M","year":"2017","unstructured":"Scott M Lundberg and Su-In Lee . 2017 . A Unified Approach to Interpreting Model Predictions . In Advances in Neural Information Processing Systems 30 , I . Guyon, U. V. Luxburg, S. Bengio, H. Wallach, R. Fergus, S. Vishwanathan, and R. Garnett (Eds.). Curran Associates, Inc., 4765--4774. http:\/\/papers.nips.cc\/paper\/7062-aunified-approach-to-interpreting-model-predictions.pdf Scott M Lundberg and Su-In Lee. 2017. A Unified Approach to Interpreting Model Predictions. In Advances in Neural Information Processing Systems 30, I. Guyon, U. V. Luxburg, S. Bengio, H. Wallach, R. Fergus, S. Vishwanathan, and R. 
Garnett (Eds.). Curran Associates, Inc., 4765--4774. http:\/\/papers.nips.cc\/paper\/7062-aunified-approach-to-interpreting-model-predictions.pdf","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_2_1_25_1","volume-title":"Proc. of AAAI (AAAI'16)","author":"Masson Warwick","year":"1934","unstructured":"Warwick Masson , Pravesh Ranchod , and George Konidaris . 2016. Reinforcement Learning with Parameterized Actions . In Proc. of AAAI (AAAI'16) . AAAI Press , 1934 --1940. Warwick Masson, Pravesh Ranchod, and George Konidaris. 2016. Reinforcement Learning with Parameterized Actions. In Proc. of AAAI (AAAI'16). AAAI Press, 1934--1940."},{"key":"e_1_3_2_1_26_1","volume-title":"Playing Atari with Deep Reinforcement Learning. In NIPS Deep Learning Workshop.","author":"Mnih Volodymyr","year":"2013","unstructured":"Volodymyr Mnih , Koray Kavukcuoglu , David Silver , Alex Graves , Ioannis Antonoglou , Daan Wierstra , and Martin Riedmiller . 2013 . Playing Atari with Deep Reinforcement Learning. In NIPS Deep Learning Workshop. Volodymyr Mnih, Koray Kavukcuoglu, David Silver, Alex Graves, Ioannis Antonoglou, Daan Wierstra, and Martin Riedmiller. 2013. Playing Atari with Deep Reinforcement Learning. In NIPS Deep Learning Workshop."},{"key":"e_1_3_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2016.282"},{"key":"e_1_3_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.1145\/3351095.3372850"},{"key":"e_1_3_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.1145\/3366423.3380087"},{"key":"e_1_3_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.1145\/3375627.3375850"},{"key":"e_1_3_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.1145\/2939672.2939778"},{"key":"e_1_3_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.1145\/3287560.3287569"},{"key":"e_1_3_2_1_33_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-28954-6"},{"key":"e_1_3_2_1_34_1","volume-title":"Prioritized Experience Replay. 
arXiv preprint arXiv:1511.05952","author":"Schaul Tom","year":"2015","unstructured":"Tom Schaul , John Quan , Ioannis Antonoglou , and David Silver . 2015. Prioritized Experience Replay. arXiv preprint arXiv:1511.05952 ( 2015 ). Tom Schaul, John Quan, Ioannis Antonoglou, and David Silver. 2015. Prioritized Experience Replay. arXiv preprint arXiv:1511.05952 (2015)."},{"key":"e_1_3_2_1_35_1","doi-asserted-by":"publisher","DOI":"10.1007\/11527862_14"},{"key":"e_1_3_2_1_36_1","volume-title":"Counterfactual Explanations for Arbitrary Regression Models. CoRR abs\/2106.15212","author":"Spooner Thomas","year":"2021","unstructured":"Thomas Spooner , Danial Dervovic , Jason Long , Jon Shepard , Jiahao Chen , and Daniele Magazzeni . 2021. Counterfactual Explanations for Arbitrary Regression Models. CoRR abs\/2106.15212 ( 2021 ). https:\/\/arxiv.org\/abs\/2106.15212 Thomas Spooner, Danial Dervovic, Jason Long, Jon Shepard, Jiahao Chen, and Daniele Magazzeni. 2021. Counterfactual Explanations for Arbitrary Regression Models. CoRR abs\/2106.15212 (2021). https:\/\/arxiv.org\/abs\/2106.15212"},{"key":"e_1_3_2_1_37_1","doi-asserted-by":"publisher","DOI":"10.1109\/ACCESS.2021.3051315"},{"key":"e_1_3_2_1_38_1","volume-title":"Barto","author":"Sutton Richard S.","year":"1998","unstructured":"Richard S. Sutton and Andrew G . Barto . 1998 . Reinforcement Learning : An Introduction. MIT Press . http:\/\/www.cs.ualberta.ca\/~sutton\/book\/the-book.html Richard S. Sutton and Andrew G. Barto. 1998. Reinforcement Learning: An Introduction. MIT Press. http:\/\/www.cs.ualberta.ca\/~sutton\/book\/the-book.html"},{"key":"e_1_3_2_1_39_1","doi-asserted-by":"publisher","DOI":"10.1109\/TKDE.2019.2945326"},{"key":"e_1_3_2_1_40_1","doi-asserted-by":"publisher","DOI":"10.1145\/3097983.3098039"},{"key":"e_1_3_2_1_41_1","doi-asserted-by":"publisher","DOI":"10.1145\/3287560.3287566"},{"key":"e_1_3_2_1_42_1","volume-title":"Counterfactual Explanations for Machine Learning: A Review. 
arXiv preprint arXiv:2010.10596","author":"Verma Sahil","year":"2020","unstructured":"Sahil Verma , John Dickerson , and Keegan Hines . 2020. Counterfactual Explanations for Machine Learning: A Review. arXiv preprint arXiv:2010.10596 ( 2020 ). Sahil Verma, John Dickerson, and Keegan Hines. 2020. Counterfactual Explanations for Machine Learning: A Review. arXiv preprint arXiv:2010.10596 (2020)."},{"key":"e_1_3_2_1_43_1","first-page":"841","article-title":"Counterfactual Explanations without Opening the Black Box: Automated Decisions and the GDPR","volume":"31","author":"Wachter Sandra","year":"2017","unstructured":"Sandra Wachter , Brent Mittelstadt , and Chris Russell . 2017 . Counterfactual Explanations without Opening the Black Box: Automated Decisions and the GDPR . Harvard Journal of Law & Tech. 31 (2017), 841 . Sandra Wachter, Brent Mittelstadt, and Chris Russell. 2017. Counterfactual Explanations without Opening the Black Box: Automated Decisions and the GDPR. Harvard Journal of Law & Tech. 31 (2017), 841.","journal-title":"Harvard Journal of Law & Tech."},{"key":"e_1_3_2_1_44_1","doi-asserted-by":"publisher","DOI":"10.1145\/3459637.3482397"},{"key":"e_1_3_2_1_45_1","volume-title":"Hierarchical Approaches for Reinforcement Learning in Parameterized Action Space. In 2018 AAAI Spring Symposium Series.","author":"Wei Ermo","year":"2018","unstructured":"Ermo Wei , Drew Wicke , and Sean Luke . 2018 . Hierarchical Approaches for Reinforcement Learning in Parameterized Action Space. In 2018 AAAI Spring Symposium Series. Ermo Wei, Drew Wicke, and Sean Luke. 2018. Hierarchical Approaches for Reinforcement Learning in Parameterized Action Space. In 2018 AAAI Spring Symposium Series."},{"key":"e_1_3_2_1_46_1","unstructured":"World Health Organization. 2017. Prevalence of Obesity among Adults.  World Health Organization. 2017. Prevalence of Obesity among Adults."},{"key":"e_1_3_2_1_47_1","unstructured":"World Health Organization. 2020. Global Health Workforce Statistics. 
 World Health Organization. 2020. Global Health Workforce Statistics."},{"key":"e_1_3_2_1_48_1","unstructured":"Worldometers. 2013. Current World Population.  Worldometers. 2013. Current World Population."},{"key":"e_1_3_2_1_49_1","volume-title":"Parametrized Deep Q-networks Learning: Reinforcement Learning with Discrete-Continuous Hybrid Action Space. arXiv preprint arXiv:1810.06394","author":"Xiong Jiechao","year":"2018","unstructured":"Jiechao Xiong , Qing Wang, Zhuoran Yang , Peng Sun , Lei Han , Yang Zheng , Haobo Fu , Tong Zhang , Ji Liu , and Han Liu . 2018. Parametrized Deep Q-networks Learning: Reinforcement Learning with Discrete-Continuous Hybrid Action Space. arXiv preprint arXiv:1810.06394 ( 2018 ). Jiechao Xiong, Qing Wang, Zhuoran Yang, Peng Sun, Lei Han, Yang Zheng, Haobo Fu, Tong Zhang, Ji Liu, and Han Liu. 2018. Parametrized Deep Q-networks Learning: Reinforcement Learning with Discrete-Continuous Hybrid Action Space. arXiv preprint arXiv:1810.06394 (2018)."},{"key":"e_1_3_2_1_50_1","doi-asserted-by":"publisher","DOI":"10.1109\/JPROC.2020.3004555"}],"event":{"name":"CIKM '22: The 31st ACM International Conference on Information and Knowledge Management","location":"Atlanta GA USA","acronym":"CIKM '22","sponsor":["SIGWEB ACM Special Interest Group on Hypertext, Hypermedia, and Web","SIGIR ACM Special Interest Group on Information Retrieval"]},"container-title":["Proceedings of the 31st ACM International Conference on Information & Knowledge 
Management"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3511808.3557429","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3511808.3557429","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T17:48:55Z","timestamp":1750182535000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3511808.3557429"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,10,17]]},"references-count":50,"alternative-id":["10.1145\/3511808.3557429","10.1145\/3511808"],"URL":"https:\/\/doi.org\/10.1145\/3511808.3557429","relation":{},"subject":[],"published":{"date-parts":[[2022,10,17]]},"assertion":[{"value":"2022-10-17","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}