{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T04:28:20Z","timestamp":1750220900257,"version":"3.41.0"},"publisher-location":"New York, NY, USA","reference-count":25,"publisher":"ACM","license":[{"start":{"date-parts":[[2019,12,6]],"date-time":"2019-12-06T00:00:00Z","timestamp":1575590400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2019,12,6]]},"DOI":"10.1145\/3374587.3374595","type":"proceedings-article","created":{"date-parts":[[2020,3,4]],"date-time":"2020-03-04T18:16:31Z","timestamp":1583345791000},"page":"71-76","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":0,"title":["Reinforcement Learning Based on Multi-subnet Clusters"],"prefix":"10.1145","author":[{"given":"Xiaobing","family":"Wang","sequence":"first","affiliation":[{"name":"School of Computer Science, Hubei University of Technology, Wuhan, China"}]},{"given":"Gang","family":"Liu","sequence":"additional","affiliation":[{"name":"School of Computer Science, Hubei University of Technology, Wuhan, China"}]}],"member":"320","published-online":{"date-parts":[[2020,3,4]]},"reference":[{"key":"e_1_3_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.1038\/nature14236"},{"key":"e_1_3_2_1_3_1","volume-title":"LISA Lab. [31","author":"Convolutional Neural","year":"2013","unstructured":"Convolutional Neural Networks (LeNet) - DeepLearning 0.1 documentation. DeepLearning 0.1. LISA Lab. [31 August 2013 ]. Convolutional Neural Networks (LeNet) - DeepLearning 0.1 documentation. DeepLearning 0.1. LISA Lab. [31 August 2013]."},{"key":"e_1_3_2_1_4_1","unstructured":"van Hasselt H. 2010. Double Q-learning. In Advances in Neural Information Processing Systems 23 2613--2621.  van Hasselt H. 2010. Double Q-learning. In Advances in Neural Information Processing Systems 23 2613--2621."},{"key":"e_1_3_2_1_5_1","first-page":"2100","article-title":"2016. Deep reinforcement learning with double Q-learning","year":"2094","unstructured":"van Hasselt, H.; Guez, A.; and Silver, D . 2016. Deep reinforcement learning with double Q-learning . In Proc. of AAAI , 2094 -- 2100 . van Hasselt, H.; Guez, A.; and Silver, D. 2016. Deep reinforcement learning with double Q-learning. In Proc. of AAAI, 2094--2100.","journal-title":"Proc. of AAAI"},{"volume-title":"Proc. of ICLR.","key":"e_1_3_2_1_6_1","unstructured":"Schaul, T.; Quan, J.; Antonoglou, I.; and Silver, D . 2015. Prioritized experience replay . In Proc. of ICLR. Schaul, T.; Quan, J.; Antonoglou, I.; and Silver, D. 2015. Prioritized experience replay. In Proc. of ICLR."},{"volume-title":"Dueling network architectures for deep reinforcement learning.In Proceedings of The 33rd International Conferenceon Machine Learning,1995--2003","author":"N.","key":"e_1_3_2_1_7_1","unstructured":"Wang, Z.; Schaul, T.; Hessel, M.; van Hasselt, H.; Lanctot, M.; and de Freitas, N. 2016. Dueling network architectures for deep reinforcement learning.In Proceedings of The 33rd International Conferenceon Machine Learning,1995--2003 . Wang, Z.; Schaul, T.; Hessel, M.; van Hasselt, H.; Lanctot, M.; and de Freitas, N. 2016. Dueling network architectures for deep reinforcement learning.In Proceedings of The 33rd International Conferenceon Machine Learning,1995--2003."},{"volume-title":"International Conference on Machine Learning.","author":"Mirza A.","key":"e_1_3_2_1_8_1","unstructured":"Mnih, V.; Badia, A. P.; Mirza , M.; Graves, A.; Lillicrap, T.; Harley, T.; Silver, D.; and Kavukcuoglu, K . 2016. Asynchronous methods for deep reinforcement learning . In International Conference on Machine Learning. Mnih, V.; Badia, A. P.; Mirza, M.; Graves, A.; Lillicrap, T.; Harley, T.; Silver, D.; and Kavukcuoglu, K. 2016. Asynchronous methods for deep reinforcement learning. In International Conference on Machine Learning."},{"key":"e_1_3_2_1_9_1","volume-title":"Reinforcement Learning: An Introduction","author":"R.","year":"1998","unstructured":"Sutton, R. S., and Barto, A. G . 1998 . Reinforcement Learning: An Introduction . The MIT press , Cambridge MA . Sutton, R. S., and Barto, A. G. 1998. Reinforcement Learning: An Introduction. The MIT press, Cambridge MA."},{"key":"e_1_3_2_1_10_1","unstructured":"Bellemare M. G.; Dabney W.; and Munos R. 2017. A distributional perspective on reinforcement learning. In ICML.  Bellemare M. G.; Dabney W.; and Munos R. 2017. A distributional perspective on reinforcement learning. In ICML."},{"key":"e_1_3_2_1_11_1","unstructured":"Fortunato M.; Azar M. G.; Piot B.; Menick J.; Osband I.; Graves A.; Mnih V.; Munos R.; Hassabis D.; Pietquin O.; Blundell C.; and Legg S. 2017. Noisy networks for exploration. CoRR abs\/1706.10295.  Fortunato M.; Azar M. G.; Piot B.; Menick J.; Osband I.; Graves A.; Mnih V.; Munos R.; Hassabis D.; Pietquin O.; Blundell C.; and Legg S. 2017. Noisy networks for exploration. CoRR abs\/1706.10295."},{"key":"e_1_3_2_1_12_1","doi-asserted-by":"crossref","unstructured":"Matteo Hessel;Joseph Modayil;Hado van Hasselt;Tom Schaul;Georg Ostrovski;Will Dabney;Dan Horgan;Bilal Piot;Mohammad Azar;David Silver. 2017. DeepMind Rainbow: Combining Improvements in Deep Reinforcement Learning.  Matteo Hessel;Joseph Modayil;Hado van Hasselt;Tom Schaul;Georg Ostrovski;Will Dabney;Dan Horgan;Bilal Piot;Mohammad Azar;David Silver. 2017. DeepMind Rainbow: Combining Improvements in Deep Reinforcement Learning.","DOI":"10.1609\/aaai.v32i1.11796"},{"key":"e_1_3_2_1_13_1","first-page":"1223","volume-title":"Proceedings of the 25th International Conference on Neural Information Processing Systems, NIPS'12","author":"Dean Jeffrey","year":"2012","unstructured":"Jeffrey Dean , Greg S. Corrado , Rajat Monga , Kai Chen , Matthieu Devin , Quoc V. Le , Mark Z. Mao , Marc'Aurelio Ranzato , Andrew Senior , Paul Tucker , Ke Yang , and Andrew Y. Ng . Large scale distributed deep networks . In Proceedings of the 25th International Conference on Neural Information Processing Systems, NIPS'12 , pp. 1223 -- 1231 , USA, 2012 . Curran Associates Inc. Jeffrey Dean, Greg S. Corrado, Rajat Monga, Kai Chen, Matthieu Devin, Quoc V. Le, Mark Z. Mao, Marc'Aurelio Ranzato, Andrew Senior, Paul Tucker, Ke Yang, and Andrew Y. Ng. Large scale distributed deep networks. In Proceedings of the 25th International Conference on Neural Information Processing Systems, NIPS'12, pp. 1223--1231, USA, 2012. Curran Associates Inc."},{"key":"e_1_3_2_1_14_1","volume-title":"International Conference on Learning Representations","author":"Schaul Tom","year":"2016","unstructured":"Tom Schaul , John Quan , Ioannis Antonoglou , and David Silver . Prioritized experience replay . In International Conference on Learning Representations , 2016 . Tom Schaul, John Quan, Ioannis Antonoglou, and David Silver. Prioritized experience replay. In International Conference on Learning Representations, 2016."},{"key":"e_1_3_2_1_15_1","volume-title":"International Conference on Learning Representations","author":"Babaeizadeh Mohammad","year":"2017","unstructured":"Mohammad Babaeizadeh , IuriFrosio, Stephen Tyree , Jason Clemons , and JanKautz. Reinforcement learning through asynchronous advantage actor-critic on a gpu . In International Conference on Learning Representations , 2017 . Mohammad Babaeizadeh, IuriFrosio, Stephen Tyree, Jason Clemons, and JanKautz. Reinforcement learning through asynchronous advantage actor-critic on a gpu. In International Conference on Learning Representations, 2017."},{"key":"e_1_3_2_1_16_1","volume-title":"Emergence of locomotion behaviours in rich environments. arXiv preprint arXiv:1707.02286","author":"Heess Nicolas","year":"2017","unstructured":"Nicolas Heess , Dhruva TB , Srinivasan Sr iram, Jay Lemmon , Josh Merel , Greg Wayne , Yuval Tassa , Tom Erez , Ziyu Wang , S. M. Ali Eslami , Martin A. Riedmiller , and David Silver . Emergence of locomotion behaviours in rich environments. arXiv preprint arXiv:1707.02286 , 2017 . Nicolas Heess, Dhruva TB, Srinivasan Sriram, Jay Lemmon, Josh Merel, Greg Wayne, Yuval Tassa, Tom Erez, Ziyu Wang, S. M. Ali Eslami, Martin A. Riedmiller, and David Silver. Emergence of locomotion behaviours in rich environments. arXiv preprint arXiv:1707.02286, 2017."},{"key":"e_1_3_2_1_17_1","unstructured":"Dan Horgan John Quan David Budden Gabriel Barth-Maron Matteo Hessel Hadovan Hasselt and David Silver Distributed prioritized experience replay 2018.  Dan Horgan John Quan David Budden Gabriel Barth-Maron Matteo Hessel Hadovan Hasselt and David Silver Distributed prioritized experience replay 2018."},{"key":"e_1_3_2_1_18_1","volume-title":"One weird trick for parallelizing convolutional neural networks. arXiv preprint arXiv:1404.5997","author":"Krizhevsky Alex","year":"2014","unstructured":"Alex Krizhevsky . One weird trick for parallelizing convolutional neural networks. arXiv preprint arXiv:1404.5997 , 2014 . Alex Krizhevsky. One weird trick for parallelizing convolutional neural networks. arXiv preprint arXiv:1404.5997, 2014."},{"key":"e_1_3_2_1_19_1","volume-title":"Humberto Nicol\u00e1s Castej\u00f3n Mart\u00ednez, and Arjun Chandra. Efficient parallel methods for deep reinforcement learning. arXiv preprint arXiv:1705.04862","author":"Clemente Alfredo V.","year":"2017","unstructured":"Alfredo V. Clemente , Humberto Nicol\u00e1s Castej\u00f3n Mart\u00ednez, and Arjun Chandra. Efficient parallel methods for deep reinforcement learning. arXiv preprint arXiv:1705.04862 , 2017 . Alfredo V. Clemente, Humberto Nicol\u00e1s Castej\u00f3n Mart\u00ednez, and Arjun Chandra. Efficient parallel methods for deep reinforcement learning. arXiv preprint arXiv:1705.04862, 2017."},{"key":"e_1_3_2_1_20_1","volume-title":"Monte Carlo Sampling methods using markov chains and their applications. 57(1):97--109","author":"Hastings W. Keith","year":"1970","unstructured":"W. Keith Hastings . Biometrika , Monte Carlo Sampling methods using markov chains and their applications. 57(1):97--109 , 1970 . W. Keith Hastings. Biometrika, Monte Carlo Sampling methods using markov chains and their applications. 57(1):97--109, 1970."},{"key":"e_1_3_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1016\/S0079-6123(06)65034-6"},{"key":"e_1_3_2_1_22_1","volume-title":"andYoshua Bengio. Variance reduction in sgd by distributed importance sampling. arXiv preprint arXiv:1511.06481","author":"Alain Guillaume","year":"2015","unstructured":"Guillaume Alain , Alex Lamb , Chinnadhurai Sankar , Aaron Courville , andYoshua Bengio. Variance reduction in sgd by distributed importance sampling. arXiv preprint arXiv:1511.06481 , 2015 . Guillaume Alain, Alex Lamb, Chinnadhurai Sankar, Aaron Courville, andYoshua Bengio. Variance reduction in sgd by distributed importance sampling. arXiv preprint arXiv:1511.06481, 2015."},{"key":"e_1_3_2_1_23_1","volume-title":"Online batch selection for faster training of neural networks. arXiv preprint arXiv:1511.06343","author":"Loshchilov Ilya","year":"2015","unstructured":"Ilya Loshchilov and Frank Hutter . Online batch selection for faster training of neural networks. arXiv preprint arXiv:1511.06343 , 2015 . Ilya Loshchilov and Frank Hutter. Online batch selection for faster training of neural networks. arXiv preprint arXiv:1511.06343, 2015."},{"volume-title":"Machine Learning","year":"1992","key":"e_1_3_2_1_24_1","unstructured":"Long-HLin. Self-improving reactive agents based on reinforcement learning, planning and teaching . Machine Learning , 1992 . Long-HLin. Self-improving reactive agents based on reinforcement learning, planning and teaching. Machine Learning, 1992."},{"key":"e_1_3_2_1_25_1","first-page":"317","volume-title":"Neural Fitted Q Iteration - First Experiences with a Data Efficient Neural Reinforcement Learning Method","author":"Riedmiller Martin","year":"2005","unstructured":"Martin Riedmiller . Neural Fitted Q Iteration - First Experiences with a Data Efficient Neural Reinforcement Learning Method , pp. 317 -- 328 . SpringerBerlin Heidelberg , Berlin, Heidelberg , 2005 . ISBN 978-3-540-31692-3. doi: 10.1007\/11564096-32. 10.1007\/11564096-32 Martin Riedmiller. Neural Fitted Q Iteration - First Experiences with a Data Efficient Neural Reinforcement Learning Method, pp. 317--328. SpringerBerlin Heidelberg, Berlin, Heidelberg, 2005. ISBN 978-3-540-31692-3. doi: 10.1007\/11564096-32."},{"key":"e_1_3_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.1023\/A:1022635613229"}],"event":{"name":"CSAI2019: 2019 3rd International Conference on Computer Science and Artificial Intelligence","sponsor":["Shenzhen University Shenzhen University"],"location":"Normal IL USA","acronym":"CSAI2019"},"container-title":["Proceedings of the 2019 3rd International Conference on Computer Science and Artificial Intelligence"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3374587.3374595","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3374587.3374595","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T23:44:44Z","timestamp":1750203884000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3374587.3374595"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2019,12,6]]},"references-count":25,"alternative-id":["10.1145\/3374587.3374595","10.1145\/3374587"],"URL":"https:\/\/doi.org\/10.1145\/3374587.3374595","relation":{},"subject":[],"published":{"date-parts":[[2019,12,6]]},"assertion":[{"value":"2020-03-04","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}