{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,22]],"date-time":"2025-10-22T10:47:02Z","timestamp":1761130022213,"version":"build-2065373602"},"reference-count":42,"publisher":"MDPI AG","issue":"21","license":[{"start":{"date-parts":[[2022,10,28]],"date-time":"2022-10-28T00:00:00Z","timestamp":1666915200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Sensors"],"abstract":"<jats:p>Multiple Input Multiple Output (MIMO) systems have been gaining significant attention from the research community due to their potential to improve data rates. However, a suitable scheduling mechanism is required to efficiently distribute available spectrum resources and enhance system capacity. This paper investigates the user selection problem in Multi-User MIMO (MU-MIMO) environment using the multi-agent Reinforcement learning (RL) methodology. Adopting multiple antennas\u2019 spatial degrees of freedom, devices can serve to transmit simultaneously in every time slot. We aim to develop an optimal scheduling policy by optimally selecting a group of users to be scheduled for transmission, given the channel condition and resource blocks at the beginning of each time slot. We first formulate the MU-MIMO scheduling problem as a single-state Markov Decision Process (MDP). We achieve the optimal policy by solving the formulated MDP problem using RL. We use aggregated sum-rate of the group of users selected for transmission, and a 20% higher sum-rate performance over the conventional methods is reported.<\/jats:p>","DOI":"10.3390\/s22218278","type":"journal-article","created":{"date-parts":[[2022,10,30]],"date-time":"2022-10-30T10:47:57Z","timestamp":1667126877000},"page":"8278","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":9,"title":["Optimal User Scheduling in Multi Antenna System Using Multi Agent Reinforcement Learning"],"prefix":"10.3390","volume":"22","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-0815-4883","authenticated-orcid":false,"given":"Muddasar","family":"Naeem","sequence":"first","affiliation":[{"name":"Institute of High Performance Computing and Networking, National Research Council of Italy, 80131 Naples, Italy"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-8177-032X","authenticated-orcid":false,"given":"Antonio","family":"Coronato","sequence":"additional","affiliation":[{"name":"Centro di Ricerche sulle Tecnologie ICT per la Salute ed il Benessere, Universit\u00e0 Giustino Fortunato, 82100 Benevento, Italy"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-2200-4868","authenticated-orcid":false,"given":"Zaib","family":"Ullah","sequence":"additional","affiliation":[{"name":"Institute of High Performance Computing and Networking, National Research Council of Italy, 80131 Naples, Italy"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Sajid","family":"Bashir","sequence":"additional","affiliation":[{"name":"Department of Electrical Engineering, National University of Sciences & Technology, Islamabad 44000, Pakistan"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-3580-9232","authenticated-orcid":false,"given":"Giovanni","family":"Paragliola","sequence":"additional","affiliation":[{"name":"Institute of High Performance Computing and Networking, National Research Council of Italy, 80131 Naples, Italy"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1968","published-online":{"date-parts":[[2022,10,28]]},"reference":[{"key":"ref_1","first-page":"315","article-title":"On limits of wireless communication in a fading environment when using multiple antenna","volume":"6","author":"Foshini","year":"1998","journal-title":"Wirel. Pers. Commun."},{"key":"ref_2","doi-asserted-by":"crossref","unstructured":"Sarieddeen, H., Mansour, M.M., Jalloul, L.M., and Chehab, A. (2016, January 20\u201325). Efficient near optimal joint modulation classification and detection for MU-MIMO systems. Proceedings of the 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Shanghai, China.","DOI":"10.1109\/ICASSP.2016.7472369"},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"1783","DOI":"10.1109\/TIT.2005.846425","article-title":"Dirty-paper coding versus TDMA for MIMO broadcast channels","volume":"51","author":"Jindal","year":"2005","journal-title":"IEEE Trans. Inf. Theory"},{"key":"ref_4","doi-asserted-by":"crossref","unstructured":"Lee, J., and Jindal, N. (November, January 29). Dirty paper coding vs. linear precoding for MIMO broadcast channels. In Proceedings of the 2006 Fortieth Asilomar Conference on Signals, Systems and Computers, Pacific Grove, CA, USA.","DOI":"10.1109\/ACSSC.2006.354855"},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"195","DOI":"10.1109\/TCOMM.2004.840638","article-title":"A vector-perturbation technique for near-capacity multiantenna multiuser communication-part I: Channel inversion and regularization","volume":"53","author":"Peel","year":"2005","journal-title":"IEEE Trans. Commun."},{"key":"ref_6","unstructured":"Sutton, R.S., and Barto, A.G. (2018). Reinforcement Learning: An Introduction, MIT Press."},{"key":"ref_7","doi-asserted-by":"crossref","unstructured":"Di Sarno, C., Formicola, V., Sicuranza, M., and Paragliola, G. (2013, January 2\u20136). Addressing Security Issues of Electronic Health Record Systems through Enhanced SIEM Technology. Proceedings of the 2013 International Conference on Availability, Reliability and Security, Regensburg, Germany.","DOI":"10.1109\/ARES.2013.85"},{"key":"ref_8","doi-asserted-by":"crossref","unstructured":"Coronato, A., Di Napoli, C., Paragliola, G., and Serino, L. (2021, January 21\u201324). Intelligent Planning of Onshore Touristic Itineraries for Cruise Passengers in a Smart City. Proceedings of the 2021 17th International Conference on Intelligent Environments (IE), Dubai, United Arab Emirates.","DOI":"10.1109\/IE51775.2021.9486648"},{"key":"ref_9","unstructured":"Coronato, A., de Pietro, G., and Paragliola, G. (October, January 30). A Monitoring System Enhanced by Means of Situation-Awareness for Cognitive Impaired People. Proceedings of the 8th International Conference on Body Area Networks, Boston, MA, USA."},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Paragliola, G., Coronato, A., Naeem, M., and De Pietro, G. (2018, January 26\u201329). A Reinforcement Learning-Based Approach for the Risk Management of e-Health Environments: A Case Study. Proceedings of the 2018 14th International Conference on Signal-Image Technology & Internet-Based Systems (SITIS), Las Palmas de Gran Canaria, Spain.","DOI":"10.1109\/SITIS.2018.00114"},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"209320","DOI":"10.1109\/ACCESS.2020.3038605","article-title":"A Gentle Introduction to Reinforcement Learning and its Application in Different Fields","volume":"8","author":"Naeem","year":"2020","journal-title":"IEEE Access"},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"5732","DOI":"10.1109\/TWC.2020.2996368","article-title":"Power control in cellular massive MIMO with varying user activity: A deep learning solution","volume":"19","author":"Canh","year":"2020","journal-title":"IEEE Trans. Wirel. Commun."},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"1676","DOI":"10.1109\/25.790549","article-title":"A Q-learning-based dynamic channel assignment technique for mobile communication systems","volume":"48","author":"Nie","year":"1999","journal-title":"IEEE Trans. Veh. Technol."},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Bennis, M., and Niyato, D. (2010, January 6\u201310). A Q-learning based approach to interference avoidance in self-organized femtocell networks. Proceedings of the 2010 IEEE Globecom Workshops, Miami, FL, USA.","DOI":"10.1109\/GLOCOMW.2010.5700414"},{"key":"ref_15","unstructured":"Santos, E.C. (2017). A simple reinforcement learning mechanism for resource allocation in lte-a networks with markov decision process and q-learning. arXiv."},{"key":"ref_16","unstructured":"Kong, P.Y., and Panaitopol, D. (2013, January 8\u201311). Reinforcement learning approach to dynamic activation of base station resources in wireless networks. Proceedings of the 2013 IEEE 24th Annual International Symposium on Personal, Indoor, and Mobile Radio Communications (PIMRC), London, UK."},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"37","DOI":"10.1109\/MPOT.2009.934896","article-title":"Multi-user MIMO systems: The future in the making","volume":"28","author":"Kurve","year":"2009","journal-title":"IEEE Potentials"},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Naeem, M., Bashir, S., Khan, M.U., and Syed, A.A. (2016, January 12\u201316). Performance comparison of scheduling algorithms for MU-MIMO systems. Proceedings of the 2016 13th International Bhurban Conference on Applied Sciences and Technology (IBCAST), Islamabad, Pakistan.","DOI":"10.1109\/IBCAST.2016.7429939"},{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"Naeem, M., Bashir, S., Khan, M.U., and Syed, A.A. (2015, January 12\u201313). Modified SINR based user selection for MU-MIMO systems. Proceedings of the 2015 International Conference on Information and Communication Technologies (ICICT), Karachi, Pakistan.","DOI":"10.1109\/ICICT.2015.7469587"},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"2225","DOI":"10.1109\/LCOMM.2022.3186350","article-title":"User-centric access point selection in cell-free massive MIMO systems: A game-theoretic approach","volume":"26","author":"Wei","year":"2022","journal-title":"IEEE Commun. Lett."},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Carvajal, H., Orozco, N., Cacuango, S., Salazar, P., Rosero, E., and Almeida, F. (2022). A Scheduling Scheme for Improving the Performance and Security of MU-MIMO Systems. Sensors, 22.","DOI":"10.3390\/s22145369"},{"key":"ref_22","doi-asserted-by":"crossref","unstructured":"Yin, Z., Chen, J., Li, G., Wang, H., He, W., and Ni, Y. (2022). A Deep Learning-Based User Selection Scheme for Cooperative NOMA System with Imperfect CSI. Wirel. Commun. Mob. Comput., 2022.","DOI":"10.1155\/2022\/7732029"},{"key":"ref_23","first-page":"4044","article-title":"Deep Learning for Multi-User MIMO Systems: Joint Design of Pilot, Limited Feedback, and Precoding","volume":"20","author":"Jang","year":"2022","journal-title":"IEEE Trans. Commun."},{"key":"ref_24","doi-asserted-by":"crossref","first-page":"2160","DOI":"10.1109\/TGCN.2021.3093439","article-title":"Machine Learning Based Beam Selection with Low Complexity Hybrid Beamforming Design for 5G Massive MIMO Systems","volume":"5","author":"Ahmed","year":"2021","journal-title":"IEEE Trans. Green Commun. Netw."},{"key":"ref_25","first-page":"3189","article-title":"Energy-efficient low-complexity algorithm in 5G massive MIMO systems","volume":"67","author":"Salh","year":"2021","journal-title":"Comput. Mater. Contin."},{"key":"ref_26","doi-asserted-by":"crossref","unstructured":"Perdana, R.H.Y., Nguyen, T.V., and An, B. (2021, January 17\u201320). Deep Learning-based Power Allocation in Massive MIMO Systems with SLNR and SINR Criterions. Proceedings of the 2021 Twelfth International Conference on Ubiquitous and Future Networks (ICUFN), Jeju Island, Korea.","DOI":"10.1109\/ICUFN49451.2021.9528565"},{"key":"ref_27","doi-asserted-by":"crossref","unstructured":"Perdana, R.H.Y., Nguyen, T.V., and An, B. (2022). Deep neural network design with SLNR and SINR criterions for downlink power allocation in multi-cell multi-user massive MIMO systems. ICT Express.","DOI":"10.1016\/j.icte.2022.01.011"},{"key":"ref_28","doi-asserted-by":"crossref","first-page":"2259","DOI":"10.1007\/s11432-009-0214-6","article-title":"Leakage-based user scheduling in MU-MIMO broadcast channel","volume":"52","author":"Xia","year":"2009","journal-title":"Sci. China Ser. F Inf. Sci."},{"key":"ref_29","doi-asserted-by":"crossref","unstructured":"Xia, X., Wu, G., Fang, S., and Li, S. (2010, January 12\u201314). SINR or SLNR: In successive user scheduling in mu-mimo broadcast channel with finite rate feedback. Proceedings of the 2010 International Conference on Communications and Mobile Computing, Shenzhen, China.","DOI":"10.1109\/CMC.2010.295"},{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"Naeem, M., Khan, M.U., Bashir, S., and Syed, A.A. (2015, January 14\u201316). Modified leakage based user selection for MU-MIMO systems. Proceedings of the 2015 13th International Conference on Frontiers of Information Technology (FIT), Islamabad, Pakistan.","DOI":"10.1109\/FIT.2015.25"},{"key":"ref_31","unstructured":"Zhao, L., Li, B., Meng, K., Gong, B., and Zhou, Y. (2013, January 8\u201311). A novel user scheduling for multiuser MIMO systems with block diagonalization. Proceedings of the 2013 IEEE 24th Annual International Symposium on Personal, Indoor, and Mobile Radio Communications (PIMRC), London, UK."},{"key":"ref_32","doi-asserted-by":"crossref","first-page":"1411","DOI":"10.1007\/s11277-019-06222-3","article-title":"A near optimal scheduling algorithm for efficient radio resource management in multi-user MIMO systems","volume":"106","author":"Naeem","year":"2019","journal-title":"Wirel. Pers. Commun."},{"key":"ref_33","doi-asserted-by":"crossref","unstructured":"Sharifi, S. (2022). A POMDP Framework for Antenna Selection and User Scheduling in Multi-User Massive MIMO Systems. [Ph.D. Thesis, Ontario Tech University].","DOI":"10.1109\/TCOMM.2022.3227304"},{"key":"ref_34","doi-asserted-by":"crossref","first-page":"1317","DOI":"10.55730\/1300-0632.3851","article-title":"Binary flower pollination algorithm based user scheduling for multiuser MIMOsystems","volume":"30","author":"Mohanty","year":"2022","journal-title":"Turk. J. Electr. Eng. Comput. Sci."},{"key":"ref_35","first-page":"2399","article-title":"Machine learning based hybrid precoder with user scheduling technique for maximizing sum rate in downlink MU-MIMO system","volume":"14","author":"Rajarajeswarie","year":"2022","journal-title":"Int. J. Inf. Technol."},{"key":"ref_36","first-page":"102","article-title":"Bayesian Reinforcement Learning: A Survey","volume":"8","author":"Ghavamzadeh","year":"2016","journal-title":"Found. Trends Mach. Learn."},{"key":"ref_37","doi-asserted-by":"crossref","unstructured":"Naeem, M., De Pietro, G., and Coronato, A. (2021). Application of reinforcement learning and deep learning in multiple-input and multiple-output (MIMO) systems. Sensors, 22.","DOI":"10.3390\/s22010309"},{"key":"ref_38","doi-asserted-by":"crossref","unstructured":"He, S., Du, J., and Liao, Y. (2021). Multi-User Scheduling for 6G V2X Ultra-Massive MIMO System. Sensors, 21.","DOI":"10.3390\/s21206742"},{"key":"ref_39","doi-asserted-by":"crossref","unstructured":"Lee, B.O., Je, H.W., Sohn, I., Shin, O.S., and Lee, K.B. (December, January 30). Interference-Aware Decentralized Precoding for Multicell MIMO TDD Systems. Proceedings of the IEEE GLOBECOM 2008\u20142008 IEEE Global Telecommunications Conference, New Orleans, LA, USA.","DOI":"10.1109\/GLOCOM.2008.ECP.858"},{"key":"ref_40","unstructured":"Dearden, R., Friedman, N., and Andre, D. (2013). Model-Based Bayesian Exploration. arXiv."},{"key":"ref_41","doi-asserted-by":"crossref","unstructured":"Tse, D., and Viswanath, P. (2005). Fundamentals of Wireless Communication, Cambridge University Press.","DOI":"10.1017\/CBO9780511807213"},{"key":"ref_42","doi-asserted-by":"crossref","first-page":"528","DOI":"10.1109\/JSAC.2005.862421","article-title":"On the optimality of multiantenna broadcast scheduling using zero-forcing beamforming","volume":"24","author":"Yoo","year":"2006","journal-title":"IEEE J. Sel. Areas Commun."}],"container-title":["Sensors"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1424-8220\/22\/21\/8278\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T01:05:11Z","timestamp":1760144711000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1424-8220\/22\/21\/8278"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,10,28]]},"references-count":42,"journal-issue":{"issue":"21","published-online":{"date-parts":[[2022,11]]}},"alternative-id":["s22218278"],"URL":"https:\/\/doi.org\/10.3390\/s22218278","relation":{},"ISSN":["1424-8220"],"issn-type":[{"type":"electronic","value":"1424-8220"}],"subject":[],"published":{"date-parts":[[2022,10,28]]}}}