{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,20]],"date-time":"2026-04-20T11:32:30Z","timestamp":1776684750127,"version":"3.51.2"},"reference-count":57,"publisher":"Association for Computing Machinery (ACM)","issue":"6","license":[{"start":{"date-parts":[[2020,11,27]],"date-time":"2020-11-27T00:00:00Z","timestamp":1606435200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"crossref","award":["61972038"],"award-info":[{"award-number":["61972038"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Graph."],"published-print":{"date-parts":[[2020,12,31]]},"abstract":"<jats:p>\n            We propose a novel approach for automatically generating a move plan for scene arrangement.\n            <jats:sup>1<\/jats:sup>\n            Given a scene like an apartment with many furniture objects, to transform its layout into another layout, one would need to determine a collision-free move plan. It could be challenging to design this plan manually because the furniture objects may block the way of each other if not moved properly; and there is a large complex search space of move action sequences that grow exponentially with the number of objects. To tackle this challenge, we propose a learning-based approach to generate a move plan automatically. At the core of our approach is a Monte Carlo tree that encodes possible states of the layout, based on which a search is performed to move a furniture object appropriately in the current layout. 
We trained a policy neural network embedded with an LSTM module for estimating the best actions to take in the expansion step and simulation step of the Monte Carlo tree search process. Leveraging the power of deep reinforcement learning, the network learned how to make such estimations through millions of trials of moving objects. We demonstrated our approach for moving objects under different scenarios and constraints. We also evaluated our approach on synthetic and real-world layouts, comparing its performance with that of humans and other baseline approaches.\n          <\/jats:p>","DOI":"10.1145\/3414685.3417788","type":"journal-article","created":{"date-parts":[[2020,11,27]],"date-time":"2020-11-27T21:51:05Z","timestamp":1606513865000},"page":"1-15","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":21,"title":["Scene mover"],"prefix":"10.1145","volume":"39","author":[{"given":"Hanqing","family":"Wang","sequence":"first","affiliation":[{"name":"Beijing Institute of Technology"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Wei","family":"Liang","sequence":"additional","affiliation":[{"name":"Beijing Institute of Technology"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Lap-Fai","family":"Yu","sequence":"additional","affiliation":[{"name":"George Mason University"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2020,11,27]]},"reference":[{"key":"e_1_2_2_1_1","volume-title":"International Conf. on Automated Planning and Scheduling.","author":"Aberdeen Douglas","year":"2004","unstructured":"Douglas Aberdeen, Sylvie Thi\u00e9baux, and Lin Zhang. 2004. Decision-theoretic military operations planning. In International Conf. 
on Automated Planning and Scheduling."},{"key":"e_1_2_2_2_1","doi-asserted-by":"publisher","DOI":"10.1109\/21.148404"},{"key":"e_1_2_2_3_1","unstructured":"Marcin Andrychowicz Misha Denil Sergio Gomez Matthew W Hoffman David Pfau Tom Schaul Brendan Shillingford and Nando De Freitas. 2016. Learning to learn by gradient descent by gradient descent. In NIPS."},{"key":"e_1_2_2_4_1","unstructured":"Thomas Anthony Zheng Tian and David Barber. 2017. Thinking fast and slow with deep learning and tree search. In Advances in Neural Information Processing Systems."},{"key":"e_1_2_2_5_1","volume-title":"Finite-time analysis of the multiarmed bandit problem. Machine learning 47, 2--3","author":"Auer Peter","year":"2002","unstructured":"Peter Auer, Nicolo Cesa-Bianchi, and Paul Fischer. 2002. Finite-time analysis of the multiarmed bandit problem. Machine learning 47, 2--3 (2002), 235--256."},{"key":"e_1_2_2_6_1","doi-asserted-by":"publisher","DOI":"10.1109\/IROS.2017.8206087"},{"key":"e_1_2_2_7_1","volume-title":"Intelligent and Evolutionary Systems","author":"Biswas Sumana","unstructured":"Sumana Biswas, Sreenatha G Anavatti, and Matthew A Garratt. 2017. 
Obstacle avoidance for multi-agent path planning based on vectorized particle swarm optimization. In Intelligent and Evolutionary Systems. Springer, 61--74."},{"key":"e_1_2_2_8_1","doi-asserted-by":"publisher","DOI":"10.1109\/TSMC.1985.6313352"},{"key":"e_1_2_2_9_1","doi-asserted-by":"publisher","DOI":"10.1016\/S0004-3702(01)00129-1"},{"key":"e_1_2_2_10_1","doi-asserted-by":"crossref","unstructured":"Guillaume Chaslot Sander Bakkes Istvan Szita and Pieter Spronck. 2008. Monte-Carlo Tree Search: A New Framework for Game AI. In AIIDE.","DOI":"10.3233\/ICG-2008-31303"},{"key":"e_1_2_2_11_1","volume-title":"ScanNet: Richly-annotated 3D Reconstructions of Indoor Scenes","author":"Dai Angela","unstructured":"Angela Dai, Angel X. Chang, Manolis Savva, Maciej Halber, Thomas Funkhouser, and Matthias Nie\u00dfner. 2017. ScanNet: Richly-annotated 3D Reconstructions of Indoor Scenes. In IEEE CVPR."},{"key":"e_1_2_2_12_1","doi-asserted-by":"publisher","DOI":"10.1145\/2816795.2818057"},{"key":"e_1_2_2_13_1","volume-title":"The Cambridge handbook of artificial intelligence","author":"Frankish Keith","unstructured":"Keith Frankish and William M Ramsey. 2014. The Cambridge handbook of artificial intelligence. Cambridge University Press."},{"key":"e_1_2_2_14_1","doi-asserted-by":"publisher","DOI":"10.1145\/3130800.3130805"},{"key":"e_1_2_2_15_1","volume-title":"Robot formation motion planning using fast marching. 
Robotics and Autonomous Systems 59, 9","author":"Garrido Santiago","year":"2011","unstructured":"Santiago Garrido, Luis Moreno, and Pedro U Lima. 2011. Robot formation motion planning using fast marching. Robotics and Autonomous Systems 59, 9 (2011)."},{"key":"e_1_2_2_16_1","volume-title":"Motion and Operation Planning of Robotic Systems","author":"Gasparetto Alessandro","unstructured":"Alessandro Gasparetto, Paolo Boscariol, Albano Lanzutti, and Renato Vidoni. 2015. Path planning and trajectory planning algorithms: A general overview. In Motion and Operation Planning of Robotic Systems. Springer, 3--27."},{"key":"e_1_2_2_17_1","doi-asserted-by":"crossref","unstructured":"Russell Gayle Paul Segars Ming C. Lin and Dinesh Manocha. 2005. Path Planning for Deformable Robots in Complex Environments. In Robotics: Science and Systems.","DOI":"10.15607\/RSS.2005.I.030"},{"key":"e_1_2_2_18_1","volume-title":"Soft actor-critic: Off-policy maximum entropy deep reinforcement learning with a stochastic actor. arXiv preprint arXiv:1801.01290","author":"Haarnoja Tuomas","year":"2018","unstructured":"Tuomas Haarnoja, Aurick Zhou, Pieter Abbeel, and Sergey Levine. 2018. 
Soft actor-critic: Off-policy maximum entropy deep reinforcement learning with a stochastic actor. arXiv preprint arXiv:1801.01290 (2018)."},{"key":"e_1_2_2_19_1","doi-asserted-by":"publisher","DOI":"10.1109\/TSSC.1968.300136"},{"key":"e_1_2_2_20_1","volume-title":"Deep recurrent q-learning for partially observable mdps. CoRR, abs\/1507.06527 7, 1","author":"Hausknecht Matthew","year":"2015","unstructured":"Matthew Hausknecht and Peter Stone. 2015. Deep recurrent q-learning for partially observable mdps. CoRR, abs\/1507.06527 7, 1 (2015)."},{"key":"e_1_2_2_21_1","volume-title":"Learning Manipulation States and Actions for Efficient Non-prehensile Rearrangement Planning. arXiv preprint arXiv:1901.03557","author":"Haustein Joshua A","year":"2019","unstructured":"Joshua A Haustein, Isac Arnekvist, Johannes Stork, Kaiyu Hang, and Danica Kragic. 2019. Learning Manipulation States and Actions for Efficient Non-prehensile Rearrangement Planning. arXiv preprint arXiv:1901.03557 (2019)."},{"key":"e_1_2_2_22_1","volume-title":"Kinodynamic randomized rearrangement planning via dynamic transitions between statically stable states","author":"Haustein Joshua A","unstructured":"Joshua A Haustein, Jennifer King, Siddhartha S Srinivasa, and Tamim Asfour. 2015. Kinodynamic randomized rearrangement planning via dynamic transitions between statically stable states. 
In IEEE ICRA."},{"key":"e_1_2_2_23_1","doi-asserted-by":"publisher","DOI":"10.1109\/TMECH.2013.2252076"},{"key":"e_1_2_2_24_1","volume-title":"Rearrangement planning using object-centric and robot-centric action spaces","author":"King Jennifer E","unstructured":"Jennifer E King, Marco Cognetti, and Siddhartha S Srinivasa. 2016. Rearrangement planning using object-centric and robot-centric action spaces. In IEEE ICRA."},{"key":"e_1_2_2_25_1","volume-title":"Unobservable monte carlo planning for nonprehensile rearrangement tasks","author":"King Jennifer E","unstructured":"Jennifer E King, Vinitha Ranganeni, and Siddhartha S Srinivasa. 2017. Unobservable monte carlo planning for nonprehensile rearrangement tasks. In IEEE ICRA."},{"key":"e_1_2_2_26_1","doi-asserted-by":"publisher","DOI":"10.1177\/0278364913495721"},{"key":"e_1_2_2_27_1","doi-asserted-by":"publisher","DOI":"10.1109\/IROS.2015.7353743"},{"key":"e_1_2_2_28_1","doi-asserted-by":"publisher","DOI":"10.1109\/LRA.2020.2980984"},{"key":"e_1_2_2_29_1","doi-asserted-by":"publisher","DOI":"10.1177\/0278364917710318"},{"key":"e_1_2_2_30_1","volume-title":"Kai Xu, Siddhartha Chaudhuri, Owais Khan, Ariel Shamir, Changhe Tu, Baoquan Chen, Daniel Cohen Or, and Hao Zhang.","author":"Li Manyi","year":"2019","unstructured":"
Manyi Li, Akshay Gadi Patil, Kai Xu, Siddhartha Chaudhuri, Owais Khan, Ariel Shamir, Changhe Tu, Baoquan Chen, Daniel Cohen Or, and Hao Zhang. 2019. GRAINS: Generative Recursive Autoencoders for INdoor Scenes. ACM Trans. on Graphics 38 (2019)."},{"key":"e_1_2_2_31_1","doi-asserted-by":"publisher","DOI":"10.1145\/2980179.2980223"},{"key":"e_1_2_2_32_1","doi-asserted-by":"publisher","DOI":"10.1145\/1882262.1866203"},{"key":"e_1_2_2_33_1","volume-title":"Mehdi Mirza, Alex Graves, Timothy Lillicrap, Tim Harley, David Silver, and Koray Kavukcuoglu.","author":"Mnih Volodymyr","year":"2016","unstructured":"Volodymyr Mnih, Adria Puigdomenech Badia, Mehdi Mirza, Alex Graves, Timothy Lillicrap, Tim Harley, David Silver, and Koray Kavukcuoglu. 2016. Asynchronous methods for deep reinforcement learning. In ICML."},{"key":"e_1_2_2_34_1","doi-asserted-by":"crossref","unstructured":"Volodymyr Mnih Koray Kavukcuoglu David Silver Andrei A Rusu Joel Veness Marc G Bellemare Alex Graves Martin Riedmiller Andreas K Fidjeland Georg Ostrovski et al. 2015. Human-level control through deep reinforcement learning. Nature 518 7540 (2015) 529.","DOI":"10.1038\/nature14236"},{"key":"e_1_2_2_35_1","volume-title":"PGQ: Combing Policy Gradient and Q. arXiv preprint arXiv:1611.01626","author":"O'Donoghue Brendan","year":"2016","unstructured":"
Brendan O'Donoghue, R\u00e9mi Munos, Koray Kavukcuoglu, and Volodymyr Mnih. 2016. PGQ: Combing Policy Gradient and Q. arXiv preprint arXiv:1611.01626 (2016)."},{"key":"e_1_2_2_36_1","doi-asserted-by":"publisher","DOI":"10.1145\/2601097.2601164"},{"key":"e_1_2_2_37_1","volume-title":"Human-centric Indoor Scene Synthesis Using Stochastic Grammar","author":"Qi Siyuan","unstructured":"Siyuan Qi, Yixin Zhu, Siyuan Huang, Chenfanfu Jiang, and Song Chun Zhu. 2018. Human-centric Indoor Scene Synthesis Using Stochastic Grammar. In IEEE CVPR."},{"key":"e_1_2_2_38_1","volume-title":"Fast and Flexible Indoor Scene Synthesis via Deep Convolutional Generative Models","author":"Ritchie Daniel","unstructured":"Daniel Ritchie, Kai Wang, and Yu-an Lin. 2019. Fast and Flexible Indoor Scene Synthesis via Deep Convolutional Generative Models. In IEEE CVPR."},{"key":"e_1_2_2_39_1","doi-asserted-by":"publisher","DOI":"10.1145\/2897824.2925867"},{"key":"e_1_2_2_40_1","volume-title":"ReinforceWalk: Learning to Walk in Graph with Monte Carlo Tree Search. arXiv: Artificial Intelligence","author":"Shen Yelong","year":"2018","unstructured":"Yelong Shen, Jianshu Chen, Posen Huang, Yuqing Guo, and Jianfeng Gao. 2018. ReinforceWalk: Learning to Walk in Graph with Monte Carlo Tree Search. 
arXiv: Artificial Intelligence (2018)."},{"key":"e_1_2_2_41_1","volume-title":"Julian Schrittwieser, Ioannis Antonoglou, Veda Panneershelvam, Marc Lanctot, et al.","author":"Silver David","year":"2016","unstructured":"David Silver, Aja Huang, Chris J Maddison, Arthur Guez, Laurent Sifre, George Van Den Driessche, Julian Schrittwieser, Ioannis Antonoglou, Veda Panneershelvam, Marc Lanctot, et al. 2016. Mastering the game of Go with deep neural networks and tree search. Nature 529, 7587 (2016), 484."},{"key":"e_1_2_2_42_1","doi-asserted-by":"crossref","unstructured":"David Silver Julian Schrittwieser Karen Simonyan Ioannis Antonoglou Aja Huang Arthur Guez Thomas Hubert Lucas Baker Matthew Lai Adrian Bolton et al. 2017. Mastering the game of Go without human knowledge. Nature 550 7676 (2017) 354.","DOI":"10.1038\/nature24270"},{"key":"e_1_2_2_43_1","volume-title":"Object Rearrangement with Nested Nonprehensile Manipulation Actions. arXiv preprint arXiv:1905.07505","author":"Song Changkyu","year":"2019","unstructured":"Changkyu Song and Abdeslam Boularias. 2019. Object Rearrangement with Nested Nonprehensile Manipulation Actions. 
arXiv preprint arXiv:1905.07505 (2019)."},{"key":"e_1_2_2_44_1","volume-title":"Danica Kragic, and Johannes A Stork.","author":"Song Haoran","year":"2019","unstructured":"Haoran Song, Joshua A Haustein, Weihao Yuan, Kaiyu Hang, Michael Yu Wang, Danica Kragic, and Johannes A Stork. 2019. Multi-Object Rearrangement with Monte Carlo Tree Search: A Case Study on Planar Nonprehensile Sorting. arXiv preprint arXiv:1912.07024 (2019)."},{"key":"e_1_2_2_45_1","doi-asserted-by":"publisher","DOI":"10.1109\/TVCG.2008.27"},{"key":"e_1_2_2_46_1","unstructured":"Richard S Sutton and Andrew G Barto. 2011. Reinforcement learning: An introduction. (2011)."},{"key":"e_1_2_2_47_1","doi-asserted-by":"crossref","unstructured":"Jur van den Berg Jack Snoeyink Ming Lin and Dinesh Manocha. 2009. Centralized path planning for multiple robots: Optimal decoupling into sequential plans. In Robotics: Science and Systems.","DOI":"10.15607\/RSS.2009.V.018"},{"key":"e_1_2_2_48_1","doi-asserted-by":"crossref","unstructured":"Hanqing Wang Wenguan Wang Tianmin Shu Wei Liang and Jianbing Shen. 2020. Active Visual Information Gathering for Vision-Language Navigation. 
In ECCV.","DOI":"10.1007\/978-3-030-58542-6_19"},{"key":"e_1_2_2_49_1","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v33i01.33018941"},{"key":"e_1_2_2_50_1","doi-asserted-by":"publisher","DOI":"10.1145\/3306346.3322941"},{"key":"e_1_2_2_51_1","doi-asserted-by":"publisher","DOI":"10.1145\/3197517.3201362"},{"key":"e_1_2_2_52_1","first-page":"51","article-title":"Reinforcement learning","volume":"12","author":"Wiering Marco","year":"2012","unstructured":"Marco Wiering and Martijn Van Otterlo. 2012. Reinforcement learning. Adaptation, Learning, and Optimization 12 (2012), 51.","journal-title":"Adaptation, Learning, and Optimization"},{"key":"e_1_2_2_53_1","doi-asserted-by":"publisher","DOI":"10.1111\/cgf.13380"},{"key":"e_1_2_2_54_1","doi-asserted-by":"publisher","DOI":"10.1145\/2185520.2185552"},{"key":"e_1_2_2_55_1","doi-asserted-by":"publisher","DOI":"10.1145\/2010324.1964981"},{"key":"e_1_2_2_56_1","volume-title":"End-to-end nonprehensile rearrangement with deep reinforcement learning and simulation-to-reality transfer. Robotics and Autonomous Systems 119","author":"Yuan Weihao","year":"2019","unstructured":"Weihao Yuan, Kaiyu Hang, Danica Kragic, Michael Y Wang, and Johannes A Stork. 2019. End-to-end nonprehensile rearrangement with deep reinforcement learning and simulation-to-reality transfer. 
Robotics and Autonomous Systems 119 (2019)."},{"key":"e_1_2_2_57_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICRA.2018.8462863"}],"container-title":["ACM Transactions on Graphics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3414685.3417788","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3414685.3417788","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T22:03:14Z","timestamp":1750197794000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3414685.3417788"}},"subtitle":["automatic move planning for scene arrangement by deep reinforcement learning"],"short-title":[],"issued":{"date-parts":[[2020,11,27]]},"references-count":57,"journal-issue":{"issue":"6","published-print":{"date-parts":[[2020,12,31]]}},"alternative-id":["10.1145\/3414685.3417788"],"URL":"https:\/\/doi.org\/10.1145\/3414685.3417788","relation":{},"ISSN":["0730-0301","1557-7368"],"issn-type":[{"value":"0730-0301","type":"print"},{"value":"1557-7368","type":"electronic"}],"subject":[],"published":{"date-parts":[[2020,11,27]]},"assertion":[{"value":"2020-11-27","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}