{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T04:18:56Z","timestamp":1750220336993,"version":"3.41.0"},"publisher-location":"New York, NY, USA","reference-count":39,"publisher":"ACM","license":[{"start":{"date-parts":[[2021,10,26]],"date-time":"2021-10-26T00:00:00Z","timestamp":1635206400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2021,10,26]]},"DOI":"10.1145\/3459637.3482357","type":"proceedings-article","created":{"date-parts":[[2021,10,30]],"date-time":"2021-10-30T18:33:11Z","timestamp":1635618791000},"page":"1447-1456","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":5,"title":["Jointly-Learned State-Action Embedding for Efficient Reinforcement Learning"],"prefix":"10.1145","author":[{"given":"Paul J.","family":"Pritz","sequence":"first","affiliation":[{"name":"Imperial College London, London, United Kingdom"}]},{"given":"Liang","family":"Ma","sequence":"additional","affiliation":[{"name":"Dataminr, New York, NY, USA"}]},{"given":"Kin K.","family":"Leung","sequence":"additional","affiliation":[{"name":"Imperial College London, London, United Kingdom"}]}],"member":"320","published-online":{"date-parts":[[2021,10,30]]},"reference":[{"key":"e_1_3_2_2_1_1","unstructured":"Joshua Achiam. 2018. Spinning Up in Deep Reinforcement Learning. (2018)."},{"key":"e_1_3_2_2_2_1","volume-title":"Playing text-adventure games with graph-based deep reinforcement learning. arXiv preprint arXiv:1812.01628","author":"Ammanabrolu Prithviraj","year":"2018","unstructured":"Prithviraj Ammanabrolu and Mark O Riedl. 2018. 
Playing text-adventure games with graph-based deep reinforcement learning. arXiv preprint arXiv:1812.01628 (2018)."},{"key":"e_1_3_2_2_3_1","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v34i06.6564"},{"key":"e_1_3_2_2_4_1","volume-title":"Learning Action Representations for Reinforcement Learning. 36th International Conference on Machine Learning (ICML) (2019","author":"Chandak Yash","year":"2019","unstructured":"Yash Chandak, Georgios Theocharous, James Kostas, Scott Jordan, and Philip S. Thomas. 2019. Learning Action Representations for Reinforcement Learning. 36th International Conference on Machine Learning (ICML) (2019), 1565--1582. arXiv:1902.00183"},{"key":"e_1_3_2_2_5_1","doi-asserted-by":"publisher","DOI":"10.1109\/LRA.2018.2800101"},{"key":"e_1_3_2_2_6_1","volume-title":"Deep reinforcement learning in large discrete action spaces. arXiv preprint arXiv:1512.07679","author":"Dulac-Arnold Gabriel","year":"2015","unstructured":"Gabriel Dulac-Arnold, Richard Evans, Hado van Hasselt, Peter Sunehag, Timothy Lillicrap, Jonathan Hunt, Timothy Mann, Theophane Weber, Thomas Degris, and Ben Coppin. 2015. Deep reinforcement learning in large discrete action spaces. 
arXiv preprint arXiv:1512.07679 (2015)."},{"key":"e_1_3_2_2_7_1","volume-title":"Latent World Models For Intrinsically Motivated Exploration. Advances in Neural Information Processing Systems 33","author":"Ermolov Aleksandr","year":"2020","unstructured":"Aleksandr Ermolov and Nicu Sebe. 2020. Latent World Models For Intrinsically Motivated Exploration. Advances in Neural Information Processing Systems 33 (2020)."},{"key":"e_1_3_2_2_8_1","volume-title":"Deepmind AI reduces Google data centre cooling bill by 40%. DeepMind blog","author":"Evans Richard","year":"2016","unstructured":"Richard Evans and Jim Gao. 2016. Deepmind AI reduces Google data centre cooling bill by 40%. DeepMind blog (2016). https:\/\/deepmind.com\/blog\/article\/deepmind-aireduces-google-data-centre-cooling-bill-40"},{"key":"e_1_3_2_2_9_1","doi-asserted-by":"publisher","DOI":"10.5555\/1036843.1036863"},{"key":"e_1_3_2_2_10_1","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v33i01.33013582"},{"key":"e_1_3_2_2_11_1","doi-asserted-by":"publisher","DOI":"10.1016\/S0004-3702(02)00376-4"},{"key":"e_1_3_2_2_12_1","doi-asserted-by":"publisher","DOI":"10.5555\/3327144.3327171"},{"key":"e_1_3_2_2_13_1","volume-title":"World models. arXiv preprint arXiv:1803.10122","author":"Ha David","year":"2018","unstructured":"David Ha and J\u00fcrgen Schmidhuber. 2018. World models. 
arXiv preprint arXiv:1803.10122 (2018)."},{"key":"e_1_3_2_2_14_1","volume-title":"Soft actor-critic: Off-policy maximum entropy deep reinforcement learning with a stochastic actor. arXiv preprint arXiv:1801.01290","author":"Haarnoja Tuomas","year":"2018","unstructured":"Tuomas Haarnoja, Aurick Zhou, Pieter Abbeel, and Sergey Levine. 2018. Soft actor-critic: Off-policy maximum entropy deep reinforcement learning with a stochastic actor. arXiv preprint arXiv:1801.01290 (2018)."},{"key":"e_1_3_2_2_15_1","volume-title":"Mastering Atari with Discrete World Models. arXiv preprint arXiv:2010.02193","author":"Hafner Danijar","year":"2020","unstructured":"Danijar Hafner, Timothy Lillicrap, Mohammad Norouzi, and Jimmy Ba. 2020. Mastering Atari with Discrete World Models. arXiv preprint arXiv:2010.02193 (2020)."},{"key":"e_1_3_2_2_16_1","volume-title":"A deep reinforcement learning framework for the financial portfolio management problem. arXiv preprint arXiv:1706.10059","author":"Jiang Zhengyao","year":"2017","unstructured":"Zhengyao Jiang, Dixing Xu, and Jinjun Liang. 2017. A deep reinforcement learning framework for the financial portfolio management problem. arXiv preprint arXiv:1706.10059 (2017)."},{"key":"e_1_3_2_2_17_1","unstructured":"Lukasz Kaiser, Mohammad Babaeizadeh, Piotr Milos, Blazej Osinski, Roy H Campbell, Konrad Czechowski, Dumitru Erhan, Chelsea Finn, Piotr Kozakowski, Sergey Levine, et al. 2019. 
Model-based reinforcement learning for atari. arXiv preprint arXiv:1903.00374 (2019)."},{"key":"e_1_3_2_2_18_1","unstructured":"Michael Kechinov. 2019. eCommerce behavior data from multi category store. https:\/\/www.kaggle.com\/mkechinov\/ecommerce-behavior-data-from-multi-category-store"},{"key":"e_1_3_2_2_19_1","doi-asserted-by":"publisher","DOI":"10.1145\/3359554"},{"key":"e_1_3_2_2_20_1","unstructured":"Lihong Li, Thomas J Walsh, and Michael L Littman. 2006. Towards a Unified Theory of State Abstraction for MDPs. In ISAIM."},{"key":"e_1_3_2_2_21_1","volume-title":"Deep reinforcement learning based recommendation with explicit user-item interactions modeling. arXiv preprint arXiv:1810.12027","author":"Liu Feng","year":"2018","unstructured":"Feng Liu, Ruiming Tang, Xutao Li, Weinan Zhang, Yunming Ye, Haokun Chen, Huifeng Guo, and Yuzhou Zhang. 2018. Deep reinforcement learning based recommendation with explicit user-item interactions modeling. arXiv preprint arXiv:1810.12027 (2018)."},{"key":"e_1_3_2_2_22_1","doi-asserted-by":"publisher","DOI":"10.1145\/3005745.3005750"},{"key":"e_1_3_2_2_23_1","volume-title":"Efficient estimation of word representations in vector space. 
arXiv preprint arXiv:1301.3781","author":"Mikolov Tomas","year":"2013","unstructured":"Tomas Mikolov, Kai Chen, Greg Corrado, and Jeffrey Dean. 2013. Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013)."},{"key":"e_1_3_2_2_24_1","volume-title":"Playing atari with deep reinforcement learning. arXiv preprint arXiv:1312.5602","author":"Mnih Volodymyr","year":"2013","unstructured":"Volodymyr Mnih, Koray Kavukcuoglu, David Silver, Alex Graves, Ioannis Antonoglou, Daan Wierstra, and Martin Riedmiller. 2013. Playing atari with deep reinforcement learning. arXiv preprint arXiv:1312.5602 (2013)."},{"key":"e_1_3_2_2_25_1","doi-asserted-by":"publisher","DOI":"10.1038\/nature14236"},{"key":"e_1_3_2_2_26_1","doi-asserted-by":"publisher","DOI":"10.1109\/CDC.2016.7798980"},{"key":"e_1_3_2_2_27_1","volume-title":"Language understanding for text-based games using deep reinforcement learning. arXiv preprint arXiv:1506.08941","author":"Narasimhan Karthik","year":"2015","unstructured":"Karthik Narasimhan, Tejas Kulkarni, and Regina Barzilay. 2015. Language understanding for text-based games using deep reinforcement learning. 
arXiv preprint arXiv:1506.08941 (2015)."},{"key":"e_1_3_2_2_28_1","doi-asserted-by":"publisher","DOI":"10.5555\/3104482.3104631"},{"key":"e_1_3_2_2_29_1","doi-asserted-by":"publisher","DOI":"10.5555\/1005332.1016794"},{"key":"e_1_3_2_2_30_1","volume-title":"On learning to think: Algorithmic information theory for novel combinations of reinforcement learning controllers and recurrent neural world models. arXiv preprint arXiv:1511.09249","author":"Schmidhuber J\u00fcrgen","year":"2015","unstructured":"J\u00fcrgen Schmidhuber. 2015. On learning to think: Algorithmic information theory for novel combinations of reinforcement learning controllers and recurrent neural world models. arXiv preprint arXiv:1511.09249 (2015)."},{"key":"e_1_3_2_2_31_1","volume-title":"Proximal policy optimization algorithms. arXiv preprint arXiv:1707.06347","author":"Schulman John","year":"2017","unstructured":"John Schulman, Filip Wolski, Prafulla Dhariwal, Alec Radford, and Oleg Klimov. 2017. Proximal policy optimization algorithms. arXiv preprint arXiv:1707.06347 (2017)."},{"key":"e_1_3_2_2_32_1","doi-asserted-by":"publisher","DOI":"10.5555\/1046920.1088715"},{"key":"e_1_3_2_2_33_1","volume-title":"Learning to factor policies and action-value functions: Factored action space representations for deep reinforcement learning. arXiv preprint arXiv:1705.07269","author":"Sharma Sahil","year":"2017","unstructured":"Sahil Sharma, Aravind Suresh, Rahul Ramesh, and Balaraman Ravindran. 2017. 
Learning to factor policies and action-value functions: Factored action space representations for deep reinforcement learning. arXiv preprint arXiv:1705.07269 (2017)."},{"key":"e_1_3_2_2_34_1","volume-title":"Novelty Search in Representational Space for Sample Efficient Exploration. Advances in Neural Information Processing Systems 33","author":"Tao Ruo Yu","year":"2020","unstructured":"Ruo Yu Tao, Vincent Fran\u00e7ois-Lavet, and Joelle Pineau. 2020. Novelty Search in Representational Space for Sample Efficient Exploration. Advances in Neural Information Processing Systems 33 (2020)."},{"key":"e_1_3_2_2_35_1","volume-title":"The Natural Language of Actions. 36th International Conference on Machine Learning, ICML 2019 97","author":"Tennenholtz Guy","year":"2019","unstructured":"Guy Tennenholtz and Shie Mannor. 2019. The Natural Language of Actions. 36th International Conference on Machine Learning, ICML 2019 97, 6196--6205."},{"key":"e_1_3_2_2_36_1","doi-asserted-by":"publisher","DOI":"10.1109\/IROS.2012.6386109"},{"key":"e_1_3_2_2_37_1","doi-asserted-by":"publisher","DOI":"10.5555\/1704175.1704343"},{"key":"e_1_3_2_2_38_1","volume-title":"Dynamics-Aware Embeddings. In International Conference on Learning Representations.","author":"Whitney William","year":"2019","unstructured":"William Whitney, Rajat Agarwal, Kyunghyun Cho, and Abhinav Gupta. 2019. 
Dynamics-Aware Embeddings. In International Conference on Learning Representations."},{"key":"e_1_3_2_2_39_1","unstructured":"Amy Zhang, Rowan McAllister, Roberto Calandra, Yarin Gal, and Sergey Levine. 2020. Learning invariant representations for reinforcement learning without reconstruction. arXiv preprint arXiv:2006.10742 (2020)."}],"event":{"name":"CIKM '21: The 30th ACM International Conference on Information and Knowledge Management","sponsor":["SIGWEB ACM Special Interest Group on Hypertext, Hypermedia, and Web","SIGIR ACM Special Interest Group on Information Retrieval"],"location":"Virtual Event Queensland Australia","acronym":"CIKM '21"},"container-title":["Proceedings of the 30th ACM International Conference on Information &amp; Knowledge 
Management"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3459637.3482357","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3459637.3482357","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T20:12:22Z","timestamp":1750191142000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3459637.3482357"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,10,26]]},"references-count":39,"alternative-id":["10.1145\/3459637.3482357","10.1145\/3459637"],"URL":"https:\/\/doi.org\/10.1145\/3459637.3482357","relation":{},"subject":[],"published":{"date-parts":[[2021,10,26]]},"assertion":[{"value":"2021-10-30","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}