{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,7,30]],"date-time":"2025-07-30T15:01:32Z","timestamp":1753887692660,"version":"3.41.2"},"reference-count":22,"publisher":"Wiley","issue":"1","license":[{"start":{"date-parts":[[2021,3,9]],"date-time":"2021-03-09T00:00:00Z","timestamp":1615248000000},"content-version":"vor","delay-in-days":67,"URL":"http:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":["onlinelibrary.wiley.com"],"crossmark-restriction":true},"short-container-title":["Wireless Communications and Mobile Computing"],"published-print":{"date-parts":[[2021,1]]},"abstract":"<jats:p>Deep reinforcement learning is one kind of machine learning algorithms which uses the maximum cumulative reward to learn the optimal strategy. The difficulty is how to ensure the fast convergence of the model and generate a large number of sample data to promote the model optimization. Using the deep reinforcement learning framework of the AlphaZero algorithm, the deployment problem of wireless nodes in wireless ad hoc networks is equivalent to the game of Go. A deployment model of mobile nodes in wireless ad hoc networks based on the AlphaZero algorithm is designed. Because the application scenario of wireless ad hoc network does not have the characteristics of chessboard symmetry and invariability, it cannot expand the data sample set by rotating and changing the chessboard orientation. The strategy of dynamic updating learning rate and the method of selecting the latest model to generate sample data are used to solve the problem of fast model convergence.<\/jats:p>","DOI":"10.1155\/2021\/4361650","type":"journal-article","created":{"date-parts":[[2021,3,9]],"date-time":"2021-03-09T19:05:07Z","timestamp":1615316707000},"update-policy":"https:\/\/doi.org\/10.1002\/crossmark_policy","source":"Crossref","is-referenced-by-count":1,"title":["Research on the Difficulty of Mobile Node Deployment\u2019s Self\u2010Play in Wireless Ad Hoc Networks Based on Deep Reinforcement Learning"],"prefix":"10.1155","volume":"2021","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-3625-1675","authenticated-orcid":false,"given":"Huitao","family":"Wang","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Ruopeng","family":"Yang","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Changsheng","family":"Yin","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Xiaofei","family":"Zou","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Xuefeng","family":"Wang","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"311","published-online":{"date-parts":[[2021,3,9]]},"reference":[{"key":"e_1_2_8_1_2","unstructured":"SilverD. HubertT. SchrittwieserJ. AntonoglouI. LaiM. GuezA. LanctotM. SifreL. KumaranD. GraepelT. andLillicrapT. Mastering chess and shogi by self-play with a general reinforcement learning algorithm 2017 https:\/\/arxiv.org\/abs\/1712.01815."},{"key":"e_1_2_8_2_2","doi-asserted-by":"publisher","DOI":"10.1145\/1089733.1089736"},{"key":"e_1_2_8_3_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.comnet.2008.02.020"},{"key":"e_1_2_8_4_2","doi-asserted-by":"publisher","DOI":"10.4218\/etrij.08.0207.0249"},{"key":"e_1_2_8_5_2","doi-asserted-by":"crossref","unstructured":"NoackA. BokP. B. andKruckS. Evaluating the impact of transmission power on QoS in wireless mesh networks 2011 Proceedings of 20th International Conference on Computer Communications and Networks (ICCCN) 2011 Lahaina HI USA 1\u20136.","DOI":"10.1109\/ICCCN.2011.6006026"},{"key":"e_1_2_8_6_2","doi-asserted-by":"publisher","DOI":"10.1103\/PhysRevE.64.026118"},{"key":"e_1_2_8_7_2","first-page":"101","article-title":"A comparison study of simulated annealing and genetic algorithm for node placement problem in wireless mesh networks","volume":"9","author":"Sakamoto S.","year":"2013","journal-title":"Journal of Mobile Multimedia"},{"key":"e_1_2_8_8_2","first-page":"45","article-title":"Optimizing gateway placement in wireless mesh networks based on ACO algorithm","volume":"2","author":"Le HD N. N. G.","year":"2013","journal-title":"International Journal of Computer & Communication Engineering"},{"key":"e_1_2_8_9_2","doi-asserted-by":"crossref","unstructured":"DavidO. E.andNetanyahN. S. End-to-end deep neural network for automatic learning in chess International Conference on Artificial Neural Networks 2016 Cham 88\u201396.","DOI":"10.1007\/978-3-319-44781-0_11"},{"key":"e_1_2_8_10_2","unstructured":"ClarkC.andStorkeyA. J. Training deep convolutional neural networks to play Go 37 International conference on machine learning 2015 1766\u20131774."},{"key":"e_1_2_8_11_2","article-title":"Deploying tactical communication node vehicles with AlphaZero algorithm","volume":"14","author":"Zou X.","year":"2019","journal-title":"IET Communications"},{"key":"e_1_2_8_12_2","doi-asserted-by":"publisher","DOI":"10.1038\/nature24270"},{"key":"e_1_2_8_13_2","doi-asserted-by":"publisher","DOI":"10.1038\/nature16961"},{"key":"e_1_2_8_14_2","doi-asserted-by":"crossref","unstructured":"CoulomR. Efficient selectivity and backup operators in Monte-Carlo tree search International conference on computers and games 2006 Berlin Heidelberg 72\u201383.","DOI":"10.1007\/978-3-540-75538-8_7"},{"key":"e_1_2_8_15_2","doi-asserted-by":"publisher","DOI":"10.1109\/TCAD.2012.2185050"},{"key":"e_1_2_8_16_2","doi-asserted-by":"publisher","DOI":"10.1038\/nature14236"},{"key":"e_1_2_8_17_2","unstructured":"SilverD. NewnhamL. BarkerD. WellerS. andMcFallJ. Concurrent reinforcement learning from customer interactions 28 International conference on machine learning 2013 Atlanta GA USA 924\u2013932."},{"key":"e_1_2_8_18_2","unstructured":"FinnC. ChristianoP. AbbeelP. andLevineS. A connection between generative adversarial networks inverse reinforcement learning and energy-based models [EB\/OL] 2016 https:\/\/arxiv.org\/abs\/1611.03852."},{"key":"e_1_2_8_19_2","unstructured":"MnihV. BadiaA. P. MirzaM. GravesA. LillicrapT. HarleyT. SilverD. andKavukcuogluK. Asynchronous methods for deep reinforcement learning 48 International conference on machine learning 2016 New York NY USA 1928\u20131937."},{"key":"e_1_2_8_20_2","unstructured":"LoffeS.andSzegedyC. Batch normalization: accelerating deep network training by reducing internal covariate shift 37 International conference on machine learning 2015 Lille France 448\u2013456."},{"key":"e_1_2_8_21_2","doi-asserted-by":"crossref","unstructured":"PerezD. RohlfshagenP. andLucasS. M. Monte-Carlo tree search for the physical travelling salesman problem European Conference on the Applications of Evolutionary Computation 2012 Berlin Heidelberg 255\u2013264.","DOI":"10.1007\/978-3-642-29178-4_26"},{"key":"e_1_2_8_22_2","first-page":"2602","article-title":"Direct loss minimizaton inverse optimal control","volume":"23","author":"Doerr A.","year":"2015","journal-title":"Molecular Ecology"}],"container-title":["Wireless Communications and Mobile Computing"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/downloads.hindawi.com\/journals\/wcmc\/2021\/4361650.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/downloads.hindawi.com\/journals\/wcmc\/2021\/4361650.xml","content-type":"application\/xml","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/onlinelibrary.wiley.com\/doi\/pdf\/10.1155\/2021\/4361650","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,8,7]],"date-time":"2024-08-07T12:15:33Z","timestamp":1723032933000},"score":1,"resource":{"primary":{"URL":"https:\/\/onlinelibrary.wiley.com\/doi\/10.1155\/2021\/4361650"}},"subtitle":[],"editor":[{"given":"KI-IL","family":"Kim","sequence":"additional","affiliation":[],"role":[{"role":"editor","vocabulary":"crossref"}]}],"short-title":[],"issued":{"date-parts":[[2021,1]]},"references-count":22,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2021,1]]}},"alternative-id":["10.1155\/2021\/4361650"],"URL":"https:\/\/doi.org\/10.1155\/2021\/4361650","archive":["Portico"],"relation":{},"ISSN":["1530-8669","1530-8677"],"issn-type":[{"type":"print","value":"1530-8669"},{"type":"electronic","value":"1530-8677"}],"subject":[],"published":{"date-parts":[[2021,1]]},"assertion":[{"value":"2020-01-16","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2021-02-24","order":2,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2021-03-09","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}],"article-number":"4361650"}}