{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,6]],"date-time":"2026-05-06T15:42:38Z","timestamp":1778082158632,"version":"3.51.4"},"reference-count":22,"publisher":"MDPI AG","issue":"10","license":[{"start":{"date-parts":[[2023,5,17]],"date-time":"2023-05-17T00:00:00Z","timestamp":1684281600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"DMTC and Thales","award":["\u201cNetworked 268 FAST Collaboration\u201d with RMIT internal project code RE-03662."],"award-info":[{"award-number":["\u201cNetworked 268 FAST Collaboration\u201d with RMIT internal project code RE-03662."]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Sensors"],"abstract":"<jats:p>The advent of the Internet of Things (IoT) has triggered an increased demand for sensing devices with multiple integrated wireless transceivers. These platforms often support the advantageous use of multiple radio technologies to exploit their differing characteristics. Intelligent radio selection techniques allow these systems to become highly adaptive, ensuring more robust and reliable communications under dynamic channel conditions. In this paper, we focus on the wireless links between devices equipped by deployed operating personnel and intermediary access-point infrastructure. We use multi-radio platforms and wireless devices with multiple and diverse transceiver technologies to produce robust and reliable links through the adaptive control of available transceivers. In this work, the term \u2018robust\u2019 refers to communications that can be maintained despite changes in the environmental and radio conditions, i.e., during periods of interference caused by non-cooperative actors or multi-path or fading conditions in the physical environment. 
In this paper, a multi-objective reinforcement learning (MORL) framework is applied to address a multi-radio selection and power control problem. We propose independent reward functions to manage the trade-off between the conflicting objectives of minimised power consumption and maximised bit rate. We also adopt an adaptive exploration strategy for learning a robust behaviour policy and compare its online performance to conventional methods. An extension to the multi-objective state\u2013action\u2013reward\u2013state\u2013action (SARSA) algorithm is proposed to implement this adaptive exploration strategy. When applying adaptive exploration to the extended multi-objective SARSA algorithm, we achieve a 20% increase in the F1 score in comparison to one with decayed exploration policies.<\/jats:p>","DOI":"10.3390\/s23104821","type":"journal-article","created":{"date-parts":[[2023,5,18]],"date-time":"2023-05-18T07:03:58Z","timestamp":1684393438000},"page":"4821","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":4,"title":["Reinforcement-Learning-Based Robust Resource Management for Multi-Radio Systems"],"prefix":"10.3390","volume":"23","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-9380-8532","authenticated-orcid":false,"given":"James","family":"Delaney","sequence":"first","affiliation":[{"name":"Manufacturing, Materials and Mechatronics, School of Engineering, STEM College, RMIT University, 124 La Trobe St., Melbourne, VIC 3000, Australia"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-9557-3623","authenticated-orcid":false,"given":"Steve","family":"Dowey","sequence":"additional","affiliation":[{"name":"Manufacturing, Materials and Mechatronics, School of Engineering, STEM College, RMIT University, 124 La Trobe St., Melbourne, VIC 3000, 
Australia"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-3306-6148","authenticated-orcid":false,"given":"Chi-Tsun","family":"Cheng","sequence":"additional","affiliation":[{"name":"Manufacturing, Materials and Mechatronics, School of Engineering, STEM College, RMIT University, 124 La Trobe St., Melbourne, VIC 3000, Australia"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1968","published-online":{"date-parts":[[2023,5,17]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"1303","DOI":"10.1109\/LCOMM.2020.3048515","article-title":"Distributed Multi-Radio Access Control for Decentralized OFDMA Multi-RAT Wireless Networks","volume":"25","author":"Chae","year":"2021","journal-title":"IEEE Commun. Lett."},{"key":"ref_2","doi-asserted-by":"crossref","unstructured":"Hassan, W., and Farag, T. (2020). Adaptive Allocation Algorithm for Multi-Radio Multi-Channel Wireless Mesh Networks. Future Internet, 12.","DOI":"10.3390\/fi12080127"},{"key":"ref_3","doi-asserted-by":"crossref","unstructured":"P\u00e9rez, E., Parada, R., and Monzo, C. (2022). Global Emergency System Based on WPAN and LPWAN Hybrid Networks. Sensors, 22.","DOI":"10.3390\/s22207921"},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"665","DOI":"10.1007\/s12243-018-0626-7","article-title":"Cognitive-Based Multi-Radio Prototype for Industrial Environment","volume":"73","author":"Ligios","year":"2018","journal-title":"Ann. Telecommun."},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"6446","DOI":"10.1109\/TVT.2018.2805190","article-title":"Optimal Radio Access Technology Selection Algorithm for LTE-WiFi Network","volume":"67","author":"Roy","year":"2018","journal-title":"IEEE Trans. Veh. 
Technol."},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"4539","DOI":"10.1109\/TVT.2018.2793186","article-title":"Smart Multi-RAT Access Based on Multiagent Reinforcement Learning","volume":"67","author":"Yan","year":"2018","journal-title":"IEEE Trans. Veh. Technol."},{"key":"ref_7","doi-asserted-by":"crossref","unstructured":"Chincoli, M., and Liotta, A. (2018). Self-Learning Power Control in Wireless Sensor Networks. Sensors, 18.","DOI":"10.3390\/s18020375"},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"1094","DOI":"10.1109\/JSAC.2010.100914","article-title":"An Adaptive Link Layer for Heterogeneous Multi-Radio Mobile Sensor Networks","volume":"28","author":"Gummeson","year":"2010","journal-title":"IEEE J. Sel. Areas Commun."},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"21645","DOI":"10.1109\/ACCESS.2019.2898205","article-title":"Intelligent User-Centric Network Selection: A Model-Driven Reinforcement Learning Framework","volume":"7","author":"Wang","year":"2019","journal-title":"IEEE Access"},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"385","DOI":"10.1109\/TSMC.2014.2358639","article-title":"Multiobjective Reinforcement Learning: A Comprehensive Overview","volume":"45","author":"Liu","year":"2015","journal-title":"IEEE Trans. Syst. Man Cybern. Syst."},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"483","DOI":"10.1002\/wcm.72","article-title":"A Survey of Mobility Models for Ad Hoc Network Research","volume":"2","author":"Camp","year":"2002","journal-title":"Wirel. Commun. Mob. 
Comput."},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"357","DOI":"10.1038\/s41586-020-2649-2","article-title":"Array programming with NumPy","volume":"585","author":"Harris","year":"2020","journal-title":"Nature"},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"90","DOI":"10.1109\/MCSE.2007.55","article-title":"Matplotlib: A 2D graphics environment","volume":"9","author":"Hunter","year":"2007","journal-title":"Comput. Sci. Eng."},{"key":"ref_14","unstructured":"McKinney, W. (2010, June 28\u2013July 3). Data Structures for Statistical Computing in Python. Proceedings of the 9th Python in Science Conference, Austin, TX, USA."},{"key":"ref_15","unstructured":"Brockman, G., Cheung, V., Pettersson, L., Schneider, J., Schulman, J., Tang, J., and Zaremba, W. (2016). OpenAI Gym. arXiv."},{"key":"ref_16","unstructured":"(1998). Standard for Information Technology\u2014Telecommunications and Information Exchange between Systems\u2014Local and Metropolitan Area Networks\u2014Specific Requirements\u2014Part 11: Wireless LAN Medium Access Control (MAC) and Physical Layer (PHY) Specifications (Standard No. ANSI\/IEEE Std 802.11)."},{"key":"ref_17","unstructured":"(2006). IEEE Standard for Information Technology\u2014Local and Metropolitan Area Networks\u2014Specific Requirements\u2014Part 15.4: Wireless Medium Access Control (MAC) and Physical Layer (PHY) Specifications for Low Rate Wireless Personal Area Networks (WPANs) (Standard No. IEEE Std 802.15.4-2006)."},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"541","DOI":"10.1016\/j.neunet.2010.01.001","article-title":"Online Learning of Shaping Rewards in Reinforcement Learning","volume":"23","author":"Kudenko","year":"2010","journal-title":"Neural Netw."},{"key":"ref_19","unstructured":"Tokic, M. (2010). Proceedings of the Annual Conference on Artificial Intelligence, Springer."},{"key":"ref_20","unstructured":"Sprague, N., and Ballard, D. (2003, January 12\u201314). 
Multiple-Goal Reinforcement Learning with Modular Sarsa(0). Proceedings of the 18th International Joint Conference on Artificial Intelligence, IJCAI\u201903, San Francisco, CA, USA."},{"key":"ref_21","first-page":"335","article-title":"Value-Difference Based Exploration: Adaptive Control between Epsilon-Greedy and Softmax","volume":"Volume 7006","author":"Tokic","year":"2011","journal-title":"KI 2011: Advances in Artificial Intelligence"},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"427","DOI":"10.1016\/j.ipm.2009.03.002","article-title":"A Systematic Analysis of Performance Measures for Classification Tasks","volume":"45","author":"Sokolova","year":"2009","journal-title":"Inf. Process. Manag."}],"container-title":["Sensors"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1424-8220\/23\/10\/4821\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T19:36:36Z","timestamp":1760124996000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1424-8220\/23\/10\/4821"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,5,17]]},"references-count":22,"journal-issue":{"issue":"10","published-online":{"date-parts":[[2023,5]]}},"alternative-id":["s23104821"],"URL":"https:\/\/doi.org\/10.3390\/s23104821","relation":{},"ISSN":["1424-8220"],"issn-type":[{"value":"1424-8220","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,5,17]]}}}