{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2023,1,16]],"date-time":"2023-01-16T22:30:53Z","timestamp":1673908253104},"reference-count":52,"publisher":"IGI Global","issue":"4","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2018,10]]},"abstract":"<jats:p>Heterogeneous cellular networks can balance mobile video loads and reduce cell arrangement costs, which is an important technology of future mobile video communication networks. Because of the characteristics of non-convexity of the mobile offloading problem, the design of the optimal strategy is an essential issue. For the sake of ensuring users' quality of service and the long-term overall network utility, this article proposes the distributive optimal method by means of multiple agent reinforcement learning in the downlink heterogeneous cellular networks. In addition, to solve the computational load issue generated by the large action space, deep reinforcement learning is introduced to gain the optimal policy. The learning policy can provide a near-optimal solution efficiently with a fast convergence speed. 
Simulation results show that the proposed approach is more efficient at improving the performance than the Q-learning method.<\/jats:p>","DOI":"10.4018\/ijmcmc.2018100103","type":"journal-article","created":{"date-parts":[[2018,9,11]],"date-time":"2018-09-11T15:44:16Z","timestamp":1536680656000},"page":"34-57","source":"Crossref","is-referenced-by-count":5,"title":["Deep Reinforcement Learning for Mobile Video Offloading in Heterogeneous Cellular Networks"],"prefix":"10.4018","volume":"9","author":[{"given":"Nan","family":"Zhao","sequence":"first","affiliation":[{"name":"Hubei Collaborative Innovation Center for High-efficiency Utilization of Solar Energy, Hubei University of Technology, Wuhan, China"}]},{"given":"Chao","family":"Tian","sequence":"additional","affiliation":[{"name":"Hubei University of Technology, Wuhan, China"}]},{"given":"Menglin","family":"Fan","sequence":"additional","affiliation":[{"name":"Hubei University of Technology, Wuhan, China"}]},{"given":"Minghu","family":"Wu","sequence":"additional","affiliation":[{"name":"Hubei Collaborative Innovation Center for High-efficiency Utilization of Solar Energy, Hubei University of Technology, Wuhan, China"}]},{"given":"Xiao","family":"He","sequence":"additional","affiliation":[{"name":"Hubei University of Technology, Wuhan, China"}]},{"given":"Pengfei","family":"Fan","sequence":"additional","affiliation":[{"name":"Hubei University of Technology, Wuhan, 
China"}]}],"member":"2432","reference":[{"key":"IJMCMC.2018100103-0","doi-asserted-by":"publisher","DOI":"10.1504\/IJSSC.2015.070953"},{"key":"IJMCMC.2018100103-1","doi-asserted-by":"publisher","DOI":"10.1504\/IJSSC.2017.088991"},{"key":"IJMCMC.2018100103-2","doi-asserted-by":"publisher","DOI":"10.1007\/s10462-015-9447-5"},{"key":"IJMCMC.2018100103-3","doi-asserted-by":"publisher","DOI":"10.1504\/IJSSC.2015.069199"},{"key":"IJMCMC.2018100103-4","doi-asserted-by":"publisher","DOI":"10.1109\/TCOMM.2014.2339313"},{"key":"IJMCMC.2018100103-5","doi-asserted-by":"publisher","DOI":"10.1109\/GLOCOMW.2010.5700414"},{"key":"IJMCMC.2018100103-6","unstructured":"Bojarski, M., Yeres, P., Choromanska, A., Choromanski, K., Firner, B., Jackel, L., & Muller, U. (2017). Explaining how a deep neural network trained with end-to-end learning steers a car."},{"key":"IJMCMC.2018100103-7","doi-asserted-by":"publisher","DOI":"10.1504\/IJSSC.2017.089003"},{"key":"IJMCMC.2018100103-8","doi-asserted-by":"publisher","DOI":"10.1504\/IJGUC.2017.082140"},{"key":"IJMCMC.2018100103-9","doi-asserted-by":"publisher","DOI":"10.1109\/TIT.2013.2268923"},{"key":"IJMCMC.2018100103-10","doi-asserted-by":"publisher","DOI":"10.1109\/JSAC.2015.2393496"},{"key":"IJMCMC.2018100103-11","first-page":"1840","article-title":"Estimating the maximum expected value in continuous reinforcement learning problems.","author":"C.D\u2019Eramo","year":"2017","journal-title":"31st AAAI Conference on Artificial Intelligence"},{"key":"IJMCMC.2018100103-12","doi-asserted-by":"publisher","DOI":"10.1145\/1143844.1143877"},{"key":"IJMCMC.2018100103-13","article-title":"Deep reinforcement learning in large discrete action spaces.","author":"G.Dulac-Arnold","year":"2016","journal-title":"International Conference on Machine Learning (ICML)"},{"key":"IJMCMC.2018100103-14","first-page":"7","article-title":"Mixed reinforcement learning for partially observable Markov decision 
process.","author":"L.Dung","year":"2006","journal-title":"Proceedings of the 2007 IEEE International Symposium on Computational Intelligence in Robotics and Automation (CIRA)"},{"key":"IJMCMC.2018100103-15","doi-asserted-by":"publisher","DOI":"10.1109\/JSAC.2015.2416990"},{"key":"IJMCMC.2018100103-16","doi-asserted-by":"publisher","DOI":"10.1109\/TITS.2013.2255286"},{"key":"IJMCMC.2018100103-17","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-68321-8_8"},{"key":"IJMCMC.2018100103-18","doi-asserted-by":"publisher","DOI":"10.1007\/978-1-84882-983-1_21"},{"key":"IJMCMC.2018100103-19","doi-asserted-by":"publisher","DOI":"10.1109\/ICASSP.2017.7952524"},{"key":"IJMCMC.2018100103-20","doi-asserted-by":"publisher","DOI":"10.1109\/ICC.2017.7996332"},{"key":"IJMCMC.2018100103-21","doi-asserted-by":"publisher","DOI":"10.1109\/MSP.2012.2205597"},{"key":"IJMCMC.2018100103-22","doi-asserted-by":"publisher","DOI":"10.1109\/ICICIP.2013.6568104"},{"key":"IJMCMC.2018100103-23","doi-asserted-by":"publisher","DOI":"10.1007\/s41650-017-0002-1"},{"key":"IJMCMC.2018100103-24","doi-asserted-by":"crossref","unstructured":"Kar, S., Moura, Jose M.F., & Poor, H. V. (2013). Distributed reinforcement learning in multi-agent networks. In 2013 5th IEEE International Workshop on Computational Advances in Multi-Sensor Adaptive Processing (CAMSAP) (pp. 296-299).","DOI":"10.1109\/CAMSAP.2013.6714066"},{"key":"IJMCMC.2018100103-25","doi-asserted-by":"crossref","unstructured":"Katayama, S. (2016). Ideas for a reinforcement learning algorithm that learns programs. In Artificial General Intelligence, LNCS (Vol. 9782, pp. 354-362). 
Cham: Springer.","DOI":"10.1007\/978-3-319-41649-6_36"},{"key":"IJMCMC.2018100103-26","doi-asserted-by":"publisher","DOI":"10.1504\/IJSSC.2018.091194"},{"key":"IJMCMC.2018100103-27","first-page":"1097","article-title":"ImageNet Classification with Deep Convolutional Neural Networks.","author":"A.Krizhevsky","year":"2012","journal-title":"25th International Conference on Neural Information Processing Systems (NIPS)"},{"key":"IJMCMC.2018100103-28","doi-asserted-by":"publisher","DOI":"10.1504\/IJSSC.2017.086811"},{"key":"IJMCMC.2018100103-29","doi-asserted-by":"publisher","DOI":"10.1038\/nature14539"},{"issue":"12","key":"IJMCMC.2018100103-30","first-page":"3136","article-title":"Shallow updates for deep reinforcement learning.","author":"N.Levine","year":"2017","journal-title":"Advances in Neural Information Processing Systems"},{"key":"IJMCMC.2018100103-31","doi-asserted-by":"publisher","DOI":"10.1109\/MWC.2015.7143323"},{"key":"IJMCMC.2018100103-32","doi-asserted-by":"publisher","DOI":"10.1504\/IJGUC.2018.090225"},{"key":"IJMCMC.2018100103-33","doi-asserted-by":"publisher","DOI":"10.1504\/IJSSC.2016.077965"},{"issue":"9","key":"IJMCMC.2018100103-34","first-page":"75","article-title":"Optimised cost-231 hata models for WiMAX path loss prediction in suburban and open urban environments.","volume":"4","author":"R.Mardeni","year":"2010","journal-title":"Modern Applied Science"},{"key":"IJMCMC.2018100103-35","doi-asserted-by":"publisher","DOI":"10.1038\/nature14236"},{"key":"IJMCMC.2018100103-36","doi-asserted-by":"publisher","DOI":"10.1504\/IJGUC.2018.091720"},{"key":"IJMCMC.2018100103-37","doi-asserted-by":"publisher","DOI":"10.1109\/JSAC.2014.2328143"},{"key":"IJMCMC.2018100103-38","doi-asserted-by":"publisher","DOI":"10.1016\/j.procs.2015.12.013"},{"key":"IJMCMC.2018100103-39","doi-asserted-by":"publisher","DOI":"10.1145\/1553374.1553504"},{"key":"IJMCMC.2018100103-40","unstructured":"Tieleman, T., & Hinton, G. (2012). 
Lecture 6.5 RmsProp: Divide the gradient by a running average of its recent magnitude. COURSERA: Neural Networks for Machine Learning."},{"key":"IJMCMC.2018100103-41","doi-asserted-by":"publisher","DOI":"10.1504\/IJGUC.2016.073780"},{"key":"IJMCMC.2018100103-42","article-title":"Deep reinforcement learning for dynamic multichannel access.","author":"S.Wang","year":"2017","journal-title":"Proc. Int. Conf. Computing, Networking and Communication (ICNC)"},{"key":"IJMCMC.2018100103-43","doi-asserted-by":"publisher","DOI":"10.1007\/BF00992698"},{"key":"IJMCMC.2018100103-44","doi-asserted-by":"publisher","DOI":"10.1109\/GLOCOM.2017.8254214"},{"key":"IJMCMC.2018100103-45","doi-asserted-by":"publisher","DOI":"10.1109\/WI-IAT.2009.147"},{"key":"IJMCMC.2018100103-46","doi-asserted-by":"publisher","DOI":"10.1504\/IJGUC.2015.070682"},{"key":"IJMCMC.2018100103-47","doi-asserted-by":"publisher","DOI":"10.1109\/TWC.2013.040413.120676"},{"key":"IJMCMC.2018100103-48","doi-asserted-by":"publisher","DOI":"10.1109\/JSAC.2016.2520244"},{"key":"IJMCMC.2018100103-49","doi-asserted-by":"crossref","unstructured":"Zhao, N., He, X., Wu, M., Fan, P., Fan, M., & Tian, C. (2018). Deep Q-network for user association in heterogeneous cellular networks. In The 12th International Conference on Complex, Intelligent, and Software Intensive Systems (CISIS) (pp. 
1-10).","DOI":"10.1007\/978-3-319-93659-8_35"},{"key":"IJMCMC.2018100103-50","doi-asserted-by":"publisher","DOI":"10.1109\/GLOCOM.2017.8254092"},{"key":"IJMCMC.2018100103-51","doi-asserted-by":"publisher","DOI":"10.1007\/s11276-017-1518-x"}],"container-title":["International Journal of Mobile Computing and Multimedia Communications"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.igi-global.com\/viewtitle.aspx?TitleId=214042","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,5,6]],"date-time":"2022-05-06T13:59:03Z","timestamp":1651845543000},"score":1,"resource":{"primary":{"URL":"http:\/\/services.igi-global.com\/resolvedoi\/resolve.aspx?doi=10.4018\/IJMCMC.2018100103"}},"subtitle":[""],"short-title":[],"issued":{"date-parts":[[2018,10]]},"references-count":52,"journal-issue":{"issue":"4"},"URL":"https:\/\/doi.org\/10.4018\/ijmcmc.2018100103","relation":{},"ISSN":["1937-9412","1937-9404"],"issn-type":[{"value":"1937-9412","type":"print"},{"value":"1937-9404","type":"electronic"}],"subject":[],"published":{"date-parts":[[2018,10]]}}}