{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,1]],"date-time":"2026-02-01T21:33:26Z","timestamp":1769981606594,"version":"3.49.0"},"reference-count":47,"publisher":"MDPI AG","issue":"10","license":[{"start":{"date-parts":[[2019,10,14]],"date-time":"2019-10-14T00:00:00Z","timestamp":1571011200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"Horizon 2020 Project NEWTON","award":["ICT-688503"],"award-info":[{"award-number":["ICT-688503"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Information"],"abstract":"<jats:p>Due to large-scale control problems in 5G access networks, the complexity of radio resource management is expected to increase significantly. Reinforcement learning is seen as a promising solution that can enable intelligent decision-making and reduce the complexity of different optimization problems for radio resource management. The packet scheduler is an important entity of radio resource management that allocates users\u2019 data packets in the frequency domain according to the implemented scheduling rule. In this context, by making use of reinforcement learning, we could actually determine, in each state, the most suitable scheduling rule to be employed that could improve the quality of service provisioning. In this paper, we propose a reinforcement learning-based framework to solve scheduling problems with the main focus on meeting the user fairness requirements. This framework makes use of feed forward neural networks to map momentary states to proper parameterization decisions for the proportional fair scheduler. The simulation results show that our reinforcement learning framework outperforms the conventional adaptive schedulers oriented on fairness objective. 
Discussions are also raised to determine the best reinforcement learning algorithm to be implemented in the proposed framework based on various scheduler settings.<\/jats:p>","DOI":"10.3390\/info10100315","type":"journal-article","created":{"date-parts":[[2019,10,14]],"date-time":"2019-10-14T12:14:05Z","timestamp":1571055245000},"page":"315","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":14,"title":["A Comparison of Reinforcement Learning Algorithms in Fairness-Oriented OFDMA Schedulers"],"prefix":"10.3390","volume":"10","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-9121-0286","authenticated-orcid":false,"given":"Ioan-Sorin","family":"Com\u0219a","sequence":"first","affiliation":[{"name":"Department of Computer Science, Brunel University London, Kingston Lane, London UB8 3PH, UK"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Sijing","family":"Zhang","sequence":"additional","affiliation":[{"name":"School of Computer Science and Technology, University of Bedfordshire, Luton LU1 3JU, UK"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-4890-5648","authenticated-orcid":false,"given":"Mehmet","family":"Aydin","sequence":"additional","affiliation":[{"name":"Department of Computer Science and Creative Technologies, University of the West of England, Bristol BS16 1QY, UK"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Pierre","family":"Kuonen","sequence":"additional","affiliation":[{"name":"Department of Communications and Information Technology, HEIA-FR, CH-1700 Fribourg, Switzerland"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Ramona","family":"Trestian","sequence":"additional","affiliation":[{"name":"Faculty of Science and Technology, Middlesex University London, Hendon, London NW4 4BT, 
UK"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-2578-5580","authenticated-orcid":false,"given":"Gheorghi\u021b\u0103","family":"Ghinea","sequence":"additional","affiliation":[{"name":"Department of Computer Science, Brunel University London, Kingston Lane, London UB8 3PH, UK"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1968","published-online":{"date-parts":[[2019,10,14]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"1065","DOI":"10.1109\/JSAC.2014.2328098","article-title":"What Will 5G Be?","volume":"3","author":"Andrews","year":"2014","journal-title":"IEEE J. Sel. Areas Commun."},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"138","DOI":"10.1109\/MCOM.2018.1701031","article-title":"Learning Radio Resource Management in RANs: Framework, Opportunities, and Challenges","volume":"56","author":"Calabrese","year":"2018","journal-title":"IEEE Commun. Mag."},{"key":"ref_3","doi-asserted-by":"crossref","unstructured":"Comsa, I.-S., and Trestian, R. (2019). Information Science Reference. Next-Generation Wireless Networks Meet Advanced Machine Learning Applications, IGI Global.","DOI":"10.4018\/978-1-5225-7458-3"},{"key":"ref_4","unstructured":"Comsa, I.-S. (2014). Sustainable Scheduling Policies for Radio Access Networks Based on LTE Technology. [Ph.D. Thesis, University of Bedfordshire]."},{"key":"ref_5","doi-asserted-by":"crossref","unstructured":"Comsa, I.-S., Zhang, S., Aydin, M., Kuonen, P., Trestian, R., and Ghinea, G. (2019, January 24\u201326). Enhancing User Fairness in OFDMA Radio Access Networks Through Machine Learning. 
Proceedings of the 2019 Wireless Days (WD), Manchester, UK.","DOI":"10.1109\/WD.2019.8734262"},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"678","DOI":"10.1109\/SURV.2012.060912.00100","article-title":"Downlink Packet Scheduling in LTE Cellular Networks: Key Design Issues and a Survey","volume":"15","author":"Capozzi","year":"2013","journal-title":"IEEE Commun. Surv. Tutor."},{"key":"ref_7","doi-asserted-by":"crossref","unstructured":"Comsa, I.-S., Zhang, S., Aydin, M., Chen, J., Kuonen, P., and Wagen, J.-F. (2014, January 8\u201312). Adaptive Proportional Fair Parameterization Based LTE Scheduling Using Continuous Actor-Critic Reinforcement Learning. Proceedings of the IEEE Global Communications Conference (GLOBECOM), Austin, TX, USA.","DOI":"10.1109\/GLOCOM.2014.7037498"},{"key":"ref_8","doi-asserted-by":"crossref","unstructured":"Comsa, I.-S., Aydin, M., Zhang, S., Kuonen, P., Wagen, J.-F., and Lu, Y. (2014, January 8\u201311). Scheduling Policies Based on Dynamic Throughput and Fairness Tradeoff Control in LTE-A Networks. Proceedings of the IEEE Local Computer Networks (LCN), Edmonton, AB, Canada.","DOI":"10.1109\/LCN.2014.6925806"},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"5","DOI":"10.1109\/SURV.2013.050113.00015","article-title":"Fairness in Wireless Networks: Issues, Measures and Challenges","volume":"16","author":"Shi","year":"2014","journal-title":"IEEE Commun. Surv. Tutor."},{"key":"ref_10","unstructured":"Jain, R., Chiu, D., and Hawe, W. (1984). A Quantitative Measure of Fairness and Discrimination for Resource Allocation in Shared Computer Systems, Eastern Research Laboratory, Digital Equipment Corporation. Technical Report TR-301."},{"key":"ref_11","unstructured":"Next Generation of Mobile Networks (NGMN) (2019, October 12). NGMN Radio Access Performance Evaluation Methodology. In A White Paper by the NGMN Alliance. 
Available online: https:\/\/www.ngmn.org\/wp-content\/uploads\/NGMN_Radio_Access_Performance_Evaluation_Methodology.pdf."},{"key":"ref_12","unstructured":"Sutton, R.S., and Barto, A.G. (2018). Reinforcement Learning: An Introduction, MIT Press Cambridge."},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"5835","DOI":"10.1109\/TWC.2016.2571695","article-title":"SON Coordination in Heterogeneous Networks: A Reinforcement Learning Framework","volume":"15","author":"Iacoboaiea","year":"2016","journal-title":"IEEE Trans. Wirel. Commun."},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"4589","DOI":"10.1109\/TVT.2014.2374237","article-title":"Learning based Frequency and Time-Domain Inter-Cell Interference Coordination in HetNets","volume":"64","author":"Simsek","year":"2015","journal-title":"IEEE Trans. Veh. Technol."},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"De Domenico, A., and Ktenas, D. (2018, January 15\u201318). Reinforcement Learning for Interference-Aware Cell DTX in Heterogeneous Networks. Proceedings of the IEEE Wireless Communications and Networking Conference (WCNC), Barcelona, Spain.","DOI":"10.1109\/WCNC.2018.8376993"},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"3281","DOI":"10.1109\/TWC.2019.2912754","article-title":"Deep Reinforcement Learning-Based Modulation and Coding Scheme Selection in Cognitive Heterogeneous Networks","volume":"18","author":"Zhang","year":"2019","journal-title":"IEEE Trans. Wirel. Commun."},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"310","DOI":"10.1109\/TWC.2018.2879433","article-title":"Deep Multi-User Reinforcement Learning for Distributed Dynamic Spectrum Access","volume":"18","author":"Naparstek","year":"2019","journal-title":"IEEE Trans. Wirel. Commun."},{"key":"ref_18","unstructured":"Comsa, I.-S., Aydin, M., Zhang, S., Kuonen, P., and Wagen, J.-F. (2011, January 10\u201310). Reinforcement Learning Based Radio Resource Scheduling in LTE-Advanced. 
Proceedings of the International Conference on Automation and Computing, Huddersfield, UK."},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"39","DOI":"10.4018\/jdst.2012040103","article-title":"Multi Objective Resource Scheduling in LTE Networks Using Reinforcement Learning","volume":"3","author":"Comsa","year":"2012","journal-title":"Int. J. Distrib. Syst. Technol."},{"key":"ref_20","unstructured":"Comsa, I.-S., De-Domenico, A., and Ktenas, D. (2019). Method for Allocating Transmission Resources Using Reinforcement Learning. (Application No. US 2019\/0124667 A1), U.S. Patent."},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"10800","DOI":"10.1109\/TVT.2018.2869305","article-title":"Design and Optimization of Scheduling and Non-Orthogonal Multiple Access Algorithms with Imperfect Channel State Information","volume":"67","author":"He","year":"2018","journal-title":"IEEE Trans. Veh. Technol."},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"408","DOI":"10.1109\/JSTSP.2019.2903745","article-title":"A General Framework for Temporal Fair User Scheduling in NOMA Systems","volume":"13","author":"Shahsavari","year":"2019","journal-title":"IEEE J. Sel. Top. Signal Process."},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"2922","DOI":"10.1109\/ACCESS.2015.2506261","article-title":"Fairness-Aware Non-Orthogonal Multi-User Access with Discrete Hierarchical Modulation for 5G Cellular Relay Networks","volume":"3","author":"Kaneko","year":"2015","journal-title":"IEEE Access"},{"key":"ref_24","doi-asserted-by":"crossref","first-page":"8333","DOI":"10.1109\/TVT.2017.2695400","article-title":"Self-Organizing Algorithms for Interference Coordination in Small Cell Networks","volume":"66","author":"Ahmed","year":"2017","journal-title":"IEEE Trans. Veh. 
Technol."},{"key":"ref_25","doi-asserted-by":"crossref","first-page":"2587","DOI":"10.1109\/TWC.2017.2667644","article-title":"Optimal Performance Versus Fairness Tradeoff for Resource Allocation in Wireless Systems","volume":"16","author":"Zabini","year":"2017","journal-title":"IEEE Trans. Wirel. Commun."},{"key":"ref_26","doi-asserted-by":"crossref","first-page":"3238","DOI":"10.1109\/JSYST.2017.2702109","article-title":"Fair-QoS Broker Algorithm for Overload-State Downlink Resource Scheduling in LTE Networks","volume":"12","author":"Ferdosian","year":"2018","journal-title":"IEEE Syst. J."},{"key":"ref_27","doi-asserted-by":"crossref","first-page":"945","DOI":"10.1109\/COMST.2018.2789722","article-title":"Seamless Multimedia Delivery Within a Heterogeneous Wireless Networks Environment: Are We There Yet?","volume":"20","author":"Trestian","year":"2018","journal-title":"IEEE Commun. Surv. Tutor."},{"key":"ref_28","doi-asserted-by":"crossref","unstructured":"Schwarz, S., Mehlfuhrer, C., and Rupp, M. (2011, January 5\u20139). Throughput Maximizing Multiuser Scheduling with Adjustable Fairness. Proceedings of the IEEE International Conference on Communications (ICC), Kyoto, Japan.","DOI":"10.1109\/icc.2011.5963489"},{"key":"ref_29","doi-asserted-by":"crossref","unstructured":"Proebster, M., Mueller, C.M., and Bakker, H. (2010, January 26\u201330). Adaptive Fairness Control for a Proportional Fair LTE Scheduler. Proceedings of the IEEE International Symposium on Personal, Indoor and Mobile Radio Communications (PIMRC), Istanbul, Turkey.","DOI":"10.1109\/PIMRC.2010.5671970"},{"key":"ref_30","unstructured":"Comsa, I.-S., Aydin, M., Zhang, S., Kuonen, P., and Wagen, J.-F. (2012, January 22\u201325). A Novel Dynamic Q-learning-based Scheduler Technique for LTE-advanced Technologies Using Neural Networks. 
Proceedings of the IEEE Local Computer Networks (LCN), Clearwater, FL, USA."},{"key":"ref_31","doi-asserted-by":"crossref","unstructured":"Comsa, I.-S., De-Domenico, A., and Ktenas, D. (2017, January 4\u20138). QoS-Driven Scheduling in 5G Radio Access Networks\u2014A Reinforcement Learning Approach. Proceedings of the IEEE Global Communications Conference (GLOBECOM), Singapore.","DOI":"10.1109\/GLOCOM.2017.8254926"},{"key":"ref_32","unstructured":"Comsa, I.-S., Trestian, R., and Ghinea, G. (June, January 29). 360\u2218 Mulsemedia Experience over Next Generation Wireless Networks\u2014A Reinforcement Learning Approach. Proceedings of the Tenth International Conference on Quality of Multimedia Experience (QoMEX), Cagliari, Italy."},{"key":"ref_33","doi-asserted-by":"crossref","first-page":"1661","DOI":"10.1109\/TNSM.2018.2863563","article-title":"Towards 5G: A Reinforcement Learning-based Scheduling Solution for Data Traffic Management","volume":"15","author":"Comsa","year":"2018","journal-title":"IEEE Trans. Net. Serv. Manag."},{"key":"ref_34","doi-asserted-by":"crossref","unstructured":"Comsa, I.S., and Trestian, R. (2019). Guaranteeing User Rates with Reinforcement Learning in 5G Radio Access Networks. Next-Generation Wireless Networks Meet Advanced Machine Learning Applications, IGI Global.","DOI":"10.4018\/978-1-5225-7458-3.ch008"},{"key":"ref_35","doi-asserted-by":"crossref","first-page":"498","DOI":"10.1109\/TVT.2010.2091660","article-title":"Simulating LTE cellular systems: An Open-Source Framework","volume":"60","author":"Piro","year":"2011","journal-title":"IEEE Trans. Veh. Technol."},{"key":"ref_36","doi-asserted-by":"crossref","first-page":"127","DOI":"10.1109\/MCOM.2005.1561930","article-title":"Utility-based Resource Allocation and Scheduling in OFDM-based Wireless Broadband Networks","volume":"43","author":"Song","year":"2005","journal-title":"IEEE Commun. Mag."},{"key":"ref_37","doi-asserted-by":"crossref","unstructured":"Comsa, I.S., and Trestian, R. 
(2019). Machine Learning in Radio Resource Scheduling. Next-Generation Wireless Networks Meet Advanced Machine Learning Applications, IGI Global.","DOI":"10.4018\/978-1-5225-7458-3"},{"key":"ref_38","doi-asserted-by":"crossref","unstructured":"Wiering, M.A., van Hasselt, H., Pietersma, A.-D., and Schomaker, L. (2011, January 11\u201315). Reinforcement Learning Algorithms for Solving classification Problems. Proceedings of the IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning, Paris, France.","DOI":"10.1109\/ADPRL.2011.5967372"},{"key":"ref_39","doi-asserted-by":"crossref","unstructured":"Szepesvari, C. (2010). Algorithms for Reinforcement Learning: Synthesis Lectures on Artificial Intelligence and Machine Learning, Morgan.","DOI":"10.1007\/978-3-031-01551-9"},{"key":"ref_40","unstructured":"Van Hasselt, H.P. (2011). Insights in Reinforcement Learning Formal Analysis and Empirical Evaluation of Temporal- Difference Learning Algorithms. [Ph.D. Thesis, University of Utrecht]."},{"key":"ref_41","first-page":"2613","article-title":"Double Q-learning","volume":"23","year":"2011","journal-title":"Adv. Neural. Inf. Process. Syst."},{"key":"ref_42","unstructured":"Rummery, G.A., and Niranjan, M. (1994). Online Q-Learning Using Connectionist Systems, University of Cambridge. Technical Note."},{"key":"ref_43","unstructured":"Wiering, M. (2005, January 20\u201321). QV(lambda)-Learning: A new On-Policy Reinforcement Learning Algorithm. Proceedings of the 7th European Workshop on Reinforcement Learning, Utrecht, The Netherlands."},{"key":"ref_44","unstructured":"Wiering, M.A., and van Hasselt, H. (April, January 30). The QV Family Compared to Other Reinforcement Learning Algorithms. Proceedings of the IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning, Nashville, TN, USA."},{"key":"ref_45","doi-asserted-by":"crossref","unstructured":"van Hasselt, H., and Wiering, M.A. (2009, January 14\u201319). 
Using Continuous Action Spaces to Solve Discrete Problems. Proceedings of the International Joint Conference on Neural Networks, Atlanta, GA, USA.","DOI":"10.1109\/IJCNN.2009.5178745"},{"key":"ref_46","doi-asserted-by":"crossref","first-page":"201","DOI":"10.1109\/TSMCC.2011.2106494","article-title":"Experience Replay for Real-Time Reinforcement Learning Control","volume":"42","author":"Adam","year":"2012","journal-title":"IEEE Trans. Syst. Man Cybern. Part C"},{"key":"ref_47","unstructured":"Proebster, M.C. (2016). Size-Based Scheduling to Improve the User Experience in Cellular Networks. [Ph.D. Thesis, Universit\u00e4t Stuttgart]."}],"container-title":["Information"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2078-2489\/10\/10\/315\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T13:26:16Z","timestamp":1760189176000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2078-2489\/10\/10\/315"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2019,10,14]]},"references-count":47,"journal-issue":{"issue":"10","published-online":{"date-parts":[[2019,10]]}},"alternative-id":["info10100315"],"URL":"https:\/\/doi.org\/10.3390\/info10100315","relation":{},"ISSN":["2078-2489"],"issn-type":[{"value":"2078-2489","type":"electronic"}],"subject":[],"published":{"date-parts":[[2019,10,14]]}}}