{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T02:23:13Z","timestamp":1760149393432,"version":"build-2065373602"},"reference-count":45,"publisher":"MDPI AG","issue":"15","license":[{"start":{"date-parts":[[2023,7,31]],"date-time":"2023-07-31T00:00:00Z","timestamp":1690761600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"National Natural Science Foundation of China","award":["62271068","61827801","L222046","NS2022046"],"award-info":[{"award-number":["62271068","61827801","L222046","NS2022046"]}]},{"name":"Beijing Natural Science Foundation","award":["62271068","61827801","L222046","NS2022046"],"award-info":[{"award-number":["62271068","61827801","L222046","NS2022046"]}]},{"name":"Basic Scientific Research Project","award":["62271068","61827801","L222046","NS2022046"],"award-info":[{"award-number":["62271068","61827801","L222046","NS2022046"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Sensors"],"abstract":"<jats:p>Wireless resource utilizations are the focus of future communication, which are used constantly to alleviate the communication quality problem caused by the explosive interference with increasing users, especially the inter-cell interference in the multi-cell multi-user systems. To tackle this interference and improve the resource utilization rate, we proposed a joint-priority-based reinforcement learning (JPRL) approach to jointly optimize the bandwidth and transmit power allocation. This method aims to maximize the average throughput of the system while suppressing the co-channel interference and guaranteeing the quality of service (QoS) constraint. Specifically, we de-coupled the joint problem into two sub-problems, i.e., the bandwidth assignment and power allocation sub-problems. The multi-agent double deep Q network (MADDQN) was developed to solve the bandwidth allocation sub-problem for each user and the prioritized multi-agent deep deterministic policy gradient (P-MADDPG) algorithm by deploying a prioritized replay buffer that is designed to handle the transmit power allocation sub-problem. Numerical results show that the proposed JPRL method could accelerate model training and outperform the alternative methods in terms of throughput. For example, the average throughput was approximately 10.4\u201315.5% better than the homogeneous-learning-based benchmarks, and about 17.3% higher than the genetic algorithm.<\/jats:p>","DOI":"10.3390\/s23156822","type":"journal-article","created":{"date-parts":[[2023,7,31]],"date-time":"2023-07-31T10:08:14Z","timestamp":1690798094000},"page":"6822","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":0,"title":["Joint Optimization of Bandwidth and Power Allocation in Uplink Systems with Deep Reinforcement Learning"],"prefix":"10.3390","volume":"23","author":[{"given":"Chongli","family":"Zhang","sequence":"first","affiliation":[{"name":"School of Information and Communication Engineering, Beijing University of Posts and Telecommunications (BUPT), Beijing 100876, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-4127-2419","authenticated-orcid":false,"given":"Tiejun","family":"Lv","sequence":"additional","affiliation":[{"name":"School of Information and Communication Engineering, Beijing University of Posts and Telecommunications (BUPT), Beijing 100876, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Pingmu","family":"Huang","sequence":"additional","affiliation":[{"name":"School of Artificial Intelligence, Beijing University of Posts and Telecommunications (BUPT), Beijing 100876, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-6111-9687","authenticated-orcid":false,"given":"Zhipeng","family":"Lin","sequence":"additional","affiliation":[{"name":"Key Laboratory of Dynamic Cognitive System of Electromagnetic Spectrum Space, College of Electronic and Information Engineering, Nanjing University of Aeronautics and Astronautics (NUAA), Nanjing 211106, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Jie","family":"Zeng","sequence":"additional","affiliation":[{"name":"School of Cyberspace Science and Technology, Beijing Institute of Technology, Beijing 100081, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-2520-8388","authenticated-orcid":false,"given":"Yuan","family":"Ren","sequence":"additional","affiliation":[{"name":"Shaanxi Key Laboratory of Information Communication Network and Security, School of Communications and Information Engineering, Xi\u2019an University of Posts and Telecommunications, Xi\u2019an 710121, China"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1968","published-online":{"date-parts":[[2023,7,31]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","unstructured":"Liu, G., Cai, B., and Xie, W. (2021, January 4\u20136). Research on 5G Wireless Networks and Evolution. Proceedings of the IEEE International Symposium on Broadband Multimedia Systems and Broadcasting (BMSB), Chengdu, China.","DOI":"10.1109\/BMSB53066.2021.9547155"},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"113428","DOI":"10.1109\/ACCESS.2021.3104509","article-title":"Survey and Performance Evaluation of Multiple Access Schemes for Next-Generation Wireless Communication Systems","volume":"9","author":"Shah","year":"2021","journal-title":"IEEE Access"},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"105","DOI":"10.1109\/TTHZ.2021.3128677","article-title":"Terahertz Communications: Challenges in the Next Decade","volume":"12","author":"Song","year":"2022","journal-title":"IEEE Trans. Terahertz. Sci. Technol."},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"1117","DOI":"10.1109\/COMST.2022.3151028","article-title":"Cellular, Wide-Area, and Non-Terrestrial IoT: A Survey on 5G Advances and the Road Toward 6G","volume":"24","author":"Vaezi","year":"2022","journal-title":"IEEE Commun. Surv. Tutor."},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"426","DOI":"10.1109\/COMST.2019.2916177","article-title":"Toward Massive Machine Type Communications in Ultra-Dense Cellular IoT Networks: Current Issues and Machine Learning-Assisted Solutions","volume":"22","author":"Sharma","year":"2020","journal-title":"IEEE Commun. Surv. Tutor."},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"147692","DOI":"10.1109\/ACCESS.2021.3123577","article-title":"Energy-Efficient Ultra-Dense 5G Networks: Recent Advances, Taxonomy and Future Research Directions","volume":"9","author":"Mughees","year":"2021","journal-title":"IEEE Access"},{"key":"ref_7","doi-asserted-by":"crossref","unstructured":"Mardian, R.D., Suryanegara, M., and Ramli, K. (2019, January 28\u201330). Measuring Quality of Service (QoS) and Quality of Experience (QoE) on 5G Technology: A Review. Proceedings of the IEEE International Conference on Innovative Research and Development (ICIRD 2019), Jakarta, Indonesia.","DOI":"10.1109\/ICIRD47319.2019.9074681"},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"668","DOI":"10.1109\/COMST.2021.3059896","article-title":"A Survey on Resource Allocation for 5G Heterogeneous Networks: Current Research, Future Trends, and Challenges","volume":"23","author":"Xu","year":"2021","journal-title":"IEEE Commun. Surv. Tutor."},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"58","DOI":"10.23919\/JCC.2020.03.006","article-title":"Artificial intelligence-empowered resource management for future wireless communications: A survey","volume":"17","author":"Lin","year":"2020","journal-title":"China Commun."},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"94100","DOI":"10.1109\/ACCESS.2022.3203575","article-title":"Energy-Efficient OFDM Radio Resource Allocation Optimization with Computational Awareness: A Survey","volume":"10","author":"Bossy","year":"2022","journal-title":"IEEE Access"},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"57475","DOI":"10.1109\/ACCESS.2021.3072981","article-title":"Robust Optimal Power Control and Subcarrier Allocation in Uplink OFDMA Network With Assistance of Mobile Relay","volume":"9","author":"Xu","year":"2021","journal-title":"IEEE Access"},{"key":"ref_12","first-page":"1094","article-title":"Resource Management for Millimeter-Wave Ultra-Reliable and Low-Latency Communications","volume":"69","author":"Liu","year":"2021","journal-title":"IEEE Trans. Commun."},{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"Sun, Q., Wu, H., and Petrosian, O. (2022). Optimal Power Allocation Based on Metaheuristic Algorithms in Wireless Network. Mathematics, 10.","DOI":"10.3390\/math10183336"},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Cao, L., Wang, Z., Wang, Z., Wang, X., and Yue, Y. (2023). An Energy-Saving and Efficient Deployment Strategy for Heterogeneous Wireless Sensor Networks Based on Improved Seagull Optimization Algorithm. Biomimetics, 8.","DOI":"10.3390\/biomimetics8020231"},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"9293","DOI":"10.1109\/TVT.2019.2926701","article-title":"Spectral-and energy-efficient resource allocation for multi-carrier uplink NOMA systems","volume":"68","author":"Zeng","year":"2019","journal-title":"IEEE Trans. Veh. Technol."},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"3079","DOI":"10.1109\/JIOT.2021.3111838","article-title":"Adaptive and Priority-Based Resource Allocation for Efficient Resources Utilization in Mobile-Edge Computing","volume":"10","author":"Sharif","year":"2023","journal-title":"IEEE Internet Things J."},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"16152","DOI":"10.1109\/ACCESS.2021.3049883","article-title":"Joint Task Offloading and Resource Allocation for Multi-Task Multi-Server NOMA-MEC Networks","volume":"9","author":"Xue","year":"2021","journal-title":"IEEE Access"},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Brahmi, I., Koubaa, H., and Zarai, F. (2020, January 27\u201330). Genetic Algorithm based Resource Allocation for V2X Communications. Proceedings of the International Conference on Communications and Networking, ComNet, Hammamet, Tunisia.","DOI":"10.1109\/ComNet47917.2020.9306076"},{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"Dun, H., Ye, F., Jiao, S., Li, Y., and Jiang, T. (2019, January 9\u201313). The Distributed Resource Allocation for D2D Communication with Game Theory. Proceedings of the 2019 IEEE APS Topical Conference on Antennas and Propagation in Wireless Communications (APWC), Granada, Spain.","DOI":"10.1109\/APWC.2019.8870437"},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"Li, M., Peng, T., and Wu, H. (2020, January 11\u201314). Power Allocation to Achieve Maximum Throughput in Multi-radio Multi-channel Mesh Network. Proceedings of the IEEE 6th International Conference on Computer and Communications (ICCC), Chengdu, China.","DOI":"10.1109\/ICCC51575.2020.9345094"},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Wang, C., and Yan, F. (2021, January 13\u201316). Graph Theory based Resource Allocation Algorithm in Terahertz Communication Networks. Proceedings of the 2021 IEEE International Conference on Information Networking, Jeju Island, Republic of Korea.","DOI":"10.1109\/ICICN52636.2021.9673829"},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"44","DOI":"10.1109\/MVT.2019.2903655","article-title":"Deep Reinforcement Learning for Mobile 5G and Beyond: Fundamentals, Applications, and Challenges","volume":"14","author":"Xiong","year":"2019","journal-title":"IEEE Trans. Veh. Technol."},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"29","DOI":"10.1109\/MVT.2020.3015184","article-title":"Green Deep Reinforcement Learning for Radio Resource Management: Architecture, Algorithm Compression, and Challenges","volume":"16","author":"Du","year":"2021","journal-title":"IEEE Trans. Veh. Technol."},{"key":"ref_24","doi-asserted-by":"crossref","unstructured":"Han, K., and Ye, C. (2022, January 30\u201331). Power Control Research for Device-to-Device Wireless Network Underlying Reinforcement Learning. Proceedings of the Global Conference on Robotics, Artificial Intelligence and Information Technology (GCRAIT), Chicago, IL, USA.","DOI":"10.1109\/GCRAIT55928.2022.00081"},{"key":"ref_25","doi-asserted-by":"crossref","first-page":"529","DOI":"10.1038\/nature14236","article-title":"Human-level control through deep reinforcement learning","volume":"518","author":"Mnih","year":"2015","journal-title":"Nature"},{"key":"ref_26","doi-asserted-by":"crossref","unstructured":"Liu, J., Ma, X., Han, W., and Wang, L. (2020, January 17\u201320). Resource Allocation in OFDMA Networks with Deep Reinforcement Learning. Proceedings of the IEEE 8th International Conference on Information, communication and networks (ICICN), Xi\u2019an, China.","DOI":"10.1109\/ICICN51133.2020.9205096"},{"key":"ref_27","doi-asserted-by":"crossref","unstructured":"Guan, X., Lv, T., Lin, Z., Huang, P., and Zeng, J. (2022). D2D-Assisted Multi-User Cooperative Partial Offloading in MEC Based on Deep Reinforcement Learning. Sensors, 22.","DOI":"10.3390\/s22187004"},{"key":"ref_28","doi-asserted-by":"crossref","first-page":"243","DOI":"10.23919\/ICN.2020.0020","article-title":"Deep reinforcement learning based computation offloading and resource allocation for low-latency fog radio access networks","volume":"1","author":"Rahman","year":"2020","journal-title":"Intell. Converg. Netw."},{"key":"ref_29","doi-asserted-by":"crossref","unstructured":"Iqbal, A., Tham, M.L., and Chang, Y.C. (2020, January 14\u201316). Double Deep Q-Network for Power Allocation in Cloud Radio Access Network. Proceedings of the 2020 IEEE 3rd International Conference on Computer and Communication Engineering Technology (CCET), Beijing, China.","DOI":"10.1109\/CCET50901.2020.9213138"},{"key":"ref_30","unstructured":"Lillicrap, T.P., Hunt, J.J., Pritzel, A., Heess, N., Erez, T., Tassa, Y., Silver, D., and Wierstra, D. (2015). Continuous control with deep reinforcement learning. arXiv."},{"key":"ref_31","doi-asserted-by":"crossref","first-page":"6255","DOI":"10.1109\/TWC.2020.3001736","article-title":"Power Allocation in Multi-User Cellular Networks: Deep Reinforcement Learning Approaches","volume":"19","author":"Meng","year":"2020","journal-title":"IEEE Trans. Wirel. Commun."},{"key":"ref_32","doi-asserted-by":"crossref","first-page":"441","DOI":"10.1109\/TCOMM.2022.3221422","article-title":"DDPG-Based Joint Time and Energy Management in Ambient Backscatter-Assisted Hybrid Underlay CRNs","volume":"71","author":"Zheng","year":"2023","journal-title":"IEEE Trans. Comm."},{"key":"ref_33","doi-asserted-by":"crossref","unstructured":"Yue, Y., Cao, L., Lu, D., Hu, Z., Xu, M., Wang, S., Li, B., and Ding, H. (2023). Review and empirical analysis of sparrow search algorithm. Artif. Intell. Rev., 1\u201353.","DOI":"10.1007\/s10462-023-10435-1"},{"key":"ref_34","unstructured":"Zhang, K., Yang, Z., and Ba\u015far, T. (2021). Handbook of Reinforcement Learning and Control, Springer."},{"key":"ref_35","doi-asserted-by":"crossref","unstructured":"Rosenberger, J., Urlaub, M., and Schramm, D. (2021, January 12\u201316). Multi-agent reinforcement learning for intelligent resource allocation in IIoT networks. Proceedings of the 2021 IEEE Global Conference Artificial Intelligence and Internet of Things (GCAIoT), Dubai, United Arab Emirates.","DOI":"10.1109\/GCAIoT53516.2021.9692913"},{"key":"ref_36","doi-asserted-by":"crossref","unstructured":"Hu, J., Wang, X., Li, D., and Xu, Y. (2020, January 21\u201323). Multi-agent DRL-Based Resource Allocation in Downlink Multi-cell OFDMA System. Proceedings of the 2020 International Conference on Wireless Communications and Signal Processing (IWCSP), Nanjing, China.","DOI":"10.1109\/WCSP49889.2020.9299746"},{"key":"ref_37","doi-asserted-by":"crossref","first-page":"32400","DOI":"10.1109\/ACCESS.2019.2901300","article-title":"Multi-Agent Deep Reinforcement Learning for Multi-Object Tracker","volume":"7","author":"Jiang","year":"2019","journal-title":"IEEE Access"},{"key":"ref_38","doi-asserted-by":"crossref","first-page":"1683","DOI":"10.1109\/JIOT.2021.3089823","article-title":"Multiagent Deep-Reinforcement-Learning-Based Resource Allocation for Heterogeneous QoS Guarantees for Vehicular Networks","volume":"9","author":"Tian","year":"2022","journal-title":"IEEE Internet Things J."},{"key":"ref_39","unstructured":"Zhu, Q., Wang, C.X., Hua, B., Mao, K., Jiang, S., and Yao, M. (2021). The Wiley 5G Ref: The Essential 5G Reference Online, Wiley Press."},{"key":"ref_40","doi-asserted-by":"crossref","unstructured":"Morais, D.H., and Morais, D.H. (2022). Key 5G Physical Layer Technologies: Enabling Mobile and Fixed Wireless Access, Springer.","DOI":"10.1007\/978-3-030-89209-8"},{"key":"ref_41","doi-asserted-by":"crossref","unstructured":"Modak, K., and Rahman, S. (2021, January 16\u201317). Multi-cell Interference Management in In-band D2D Communication under LTE-A Network. Proceedings of the 2021 International Conference on Computing, Electronics & Communications Engineering (ICCECE), Virtual.","DOI":"10.1109\/iCCECE52344.2021.9534849"},{"key":"ref_42","doi-asserted-by":"crossref","unstructured":"Jia, R., Liu, L., Zheng, X., Yang, Y., Wang, S., Huang, P., and Lv, T. (2022, January 16\u201320). Multi-Agent Deep Reinforcement Learning for Uplink Power Control in Multi-Cell Systems. Proceedings of the 2022 IEEE International Conference on Communications Workshops (ICC Workshops), Seoul, Republic of Korea.","DOI":"10.1109\/ICCWorkshops53468.2022.9814468"},{"key":"ref_43","unstructured":"Haarnoja, T., Zhou, A., Hartikainen, K., Tucker, G., Ha, S., Tan, J., Kumar, V., Zhu, H., Gupta, A., and Abbeel, P. (2018). Soft actor-critic algorithms and applications. arXiv."},{"key":"ref_44","doi-asserted-by":"crossref","first-page":"8616","DOI":"10.1109\/JIOT.2020.3047105","article-title":"An Incentive Mechanism for Privacy-Preserving Crowdsensing via Deep Reinforcement Learning","volume":"8","author":"Liu","year":"2021","journal-title":"IEEE Internet Things J."},{"key":"ref_45","first-page":"2171","article-title":"DEAP: Evolutionary algorithms made easy","volume":"13","author":"Fortin","year":"2012","journal-title":"J. Mach. Learn. Res."}],"container-title":["Sensors"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1424-8220\/23\/15\/6822\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T20:23:12Z","timestamp":1760127792000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1424-8220\/23\/15\/6822"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,7,31]]},"references-count":45,"journal-issue":{"issue":"15","published-online":{"date-parts":[[2023,8]]}},"alternative-id":["s23156822"],"URL":"https:\/\/doi.org\/10.3390\/s23156822","relation":{},"ISSN":["1424-8220"],"issn-type":[{"type":"electronic","value":"1424-8220"}],"subject":[],"published":{"date-parts":[[2023,7,31]]}}}