{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,6,2]],"date-time":"2026-06-02T08:43:31Z","timestamp":1780389811118,"version":"3.54.1"},"reference-count":34,"publisher":"MDPI AG","issue":"21","license":[{"start":{"date-parts":[[2022,10,24]],"date-time":"2022-10-24T00:00:00Z","timestamp":1666569600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"Guangxi Natural Science Foundation","award":["2020GXNSFDA238001"],"award-info":[{"award-number":["2020GXNSFDA238001"]}]},{"name":"Guangxi Natural Science Foundation","award":["2020KY05033"],"award-info":[{"award-number":["2020KY05033"]}]},{"DOI":"10.13039\/501100012434","name":"Middle-aged and Young Teachers\u2019 Basic Ability Promotion Project of Guangxi","doi-asserted-by":"publisher","award":["2020GXNSFDA238001"],"award-info":[{"award-number":["2020GXNSFDA238001"]}],"id":[{"id":"10.13039\/501100012434","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100012434","name":"Middle-aged and Young Teachers\u2019 Basic Ability Promotion Project of Guangxi","doi-asserted-by":"publisher","award":["2020KY05033"],"award-info":[{"award-number":["2020KY05033"]}],"id":[{"id":"10.13039\/501100012434","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Sensors"],"abstract":"<jats:p>Optical transport networks (OTNs) are widely used in backbone- and metro-area transmission networks to increase network transmission capacity. In the OTN, it is particularly crucial to rationally allocate routes and maximize network capacities. By employing deep reinforcement learning (DRL)- and software-defined networking (SDN)-based solutions, the capacity of optical networks can be effectively increased. 
However, because most DRL-based routing optimization methods have low sample usage and difficulty in coping with sudden network connectivity changes, converging in software-defined OTN scenarios is challenging. Additionally, the generalization ability of these methods is weak. This paper proposes an ensembles- and message-passing neural-network-based Deep Q-Network (EMDQN) method for optical network routing optimization to address this problem. To effectively explore the environment and improve agent performance, the multiple EMDQN agents select actions based on the highest upper-confidence bounds. Furthermore, the EMDQN agent captures the network\u2019s spatial feature information using a message passing neural network (MPNN)-based DRL policy network, which enables the DRL agent to have generalization capability. The experimental results show that the EMDQN algorithm proposed in this paper performs better in terms of convergence. EMDQN effectively improves the throughput rate and link utilization of optical networks and has better generalization capabilities.<\/jats:p>","DOI":"10.3390\/s22218139","type":"journal-article","created":{"date-parts":[[2022,10,24]],"date-time":"2022-10-24T10:09:23Z","timestamp":1666606163000},"page":"8139","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":22,"title":["A Routing Optimization Method for Software-Defined Optical Transport Networks Based on Ensembles and Reinforcement Learning"],"prefix":"10.3390","volume":"22","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-6516-9067","authenticated-orcid":false,"given":"Junyan","family":"Chen","sequence":"first","affiliation":[{"name":"School of Computer Science and Information Security, Guilin University of Electronic Technology, Guilin 541004, China"},{"name":"School of Computer Science and Engineering, Northeastern University, Shenyang 110819, 
China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Wei","family":"Xiao","sequence":"additional","affiliation":[{"name":"School of Computer Science and Information Security, Guilin University of Electronic Technology, Guilin 541004, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Xinmei","family":"Li","sequence":"additional","affiliation":[{"name":"School of Computer Science and Information Security, Guilin University of Electronic Technology, Guilin 541004, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Yang","family":"Zheng","sequence":"additional","affiliation":[{"name":"Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Xuefeng","family":"Huang","sequence":"additional","affiliation":[{"name":"School of Computer Science and Information Security, Guilin University of Electronic Technology, Guilin 541004, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Danli","family":"Huang","sequence":"additional","affiliation":[{"name":"School of Computer Science and Information Security, Guilin University of Electronic Technology, Guilin 541004, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Min","family":"Wang","sequence":"additional","affiliation":[{"name":"School of Computer Science and Information Security, Guilin University of Electronic Technology, Guilin 541004, China"}],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"1968","published-online":{"date-parts":[[2022,10,24]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"200","DOI":"10.1016\/j.jnca.2016.12.019","article-title":"Quality of service (QoS) in software defined networking (SDN)","volume":"80","author":"Karakus","year":"2017","journal-title":"J. Netw. Comput. 
Appl."},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"6242","DOI":"10.1109\/JIOT.2019.2960033","article-title":"Deep-Reinforcement-Learning-Based QoS-Aware Secure Routing for SDN-IoT","volume":"7","author":"Guo","year":"2020","journal-title":"IEEE Internet Things J."},{"key":"ref_3","doi-asserted-by":"crossref","unstructured":"Sun, P., Lan, J., Guo, Z., Xu, Y., and Hu, Y. (2020, January 4\u20138). Improving the Scalability of Deep Reinforcement Learning-Based Routing with Control on Partial Nodes. Proceedings of the 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2020), Barcelona, Spain.","DOI":"10.1109\/ICASSP40776.2020.9054483"},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"1048","DOI":"10.1109\/TCCN.2021.3102971","article-title":"Federated Deep Reinforcement Learning for Traffic Monitoring in SDN-Based IoT Networks","volume":"7","author":"Nguyen","year":"2021","journal-title":"IEEE Trans. Cogn. Commun. Netw."},{"key":"ref_5","unstructured":"Gilmer, J., Schoenholz, S.S., Riley, P.F., Vinyals, O., and Dahl, G.E. (2017, January 4\u201311). Neural message passing for quantum chemistry. Proceedings of the 34th International Conference on Machine Learning (ICML 2017), Sydney, Australia."},{"key":"ref_6","unstructured":"Mnih, V., Kavukcuoglu, K., Silver, D., Graves, A., Antonoglou, I., Wierstra, D., and Riedmiller, M. (2013). Playing Atari with deep reinforcement learning. arXiv."},{"key":"ref_7","doi-asserted-by":"crossref","unstructured":"Ali Khan, A., Zafrullah, M., Hussain, M., and Ahmad, A. (2017, January 19\u201322). Performance analysis of OSPF and hybrid networks. 
Proceedings of the International Symposium on Wireless Systems and Networks (ISWSN 2017), Lahore, Pakistan.","DOI":"10.1109\/ISWSN.2017.8250022"},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"779","DOI":"10.1109\/TNET.2016.2614247","article-title":"Traffic engineering with Equal-Cost-Multipath: An algorithmic perspective","volume":"25","author":"Chiesa","year":"2017","journal-title":"IEEE\/ACM Trans. Netw."},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"108330","DOI":"10.1016\/j.knosys.2022.108330","article-title":"Dynamic placement of multiple controllers based on SDN and allocation of computational resources based on heuristic ant colony algorithm","volume":"241","author":"Li","year":"2022","journal-title":"Knowl. Based Syst."},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Di Stefano, A., Cammarata, G., Morana, G., and Zito, D. (2015, January 4\u20136). A4SDN\u2014Adaptive Alienated Ant Algorithm for Software-Defined Networking. Proceedings of the 10th International Conference on P2P, Parallel, Grid, Cloud and Internet Computing (3PGCIC 2015), Krakow, Poland.","DOI":"10.1109\/3PGCIC.2015.120"},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"Chen, F., and Zheng, X. (2015). Machine-learning based routing pre-plan for sdn. International Workshop on Multi-Disciplinary Trends in Artificial Intelligence, Springer.","DOI":"10.1007\/978-3-319-26181-2_14"},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"102426","DOI":"10.1016\/j.yofte.2020.102426","article-title":"Heuristic planning algorithm for sharing restoration interfaces in OTN over DWDM networks","volume":"61","author":"Xavier","year":"2021","journal-title":"Opt. Fiber Technol."},{"key":"ref_13","unstructured":"Fang, C., Feng, C., and Chen, X. (2010, January 14\u201315). A heuristic algorithm for minimum cost multicast routing in OTN network. 
Proceedings of the 19th Annual Wireless and Optical Communications Conference (WOCC 2010), Shanghai, China."},{"key":"ref_14","first-page":"8354150","article-title":"ALBLP: Adaptive Load-Balancing Architecture Based on Link-State Prediction in Software-Defined Networking","volume":"2022","author":"Chen","year":"2022","journal-title":"Wirel. Commun. Mob. Comput."},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Yan, M., Li, S., Chan, C.A., Shen, Y., and Yu, Y. (2021). Mobility Prediction Using a Weighted Markov Model Based on Mobile User Classification. Sensors, 21.","DOI":"10.3390\/s21051740"},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"281","DOI":"10.1049\/cit2.12003","article-title":"SDN-based intrusion detection system for IoT using deep learning classifier (IDSIoT-SDL)","volume":"6","author":"Wani","year":"2021","journal-title":"CAAI Trans. Intell. Technol."},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"8173","DOI":"10.1109\/JIOT.2020.3042901","article-title":"Anypath Routing Protocol Design via Q-Learning for Underwater Sensor Networks","volume":"8","author":"Zhou","year":"2021","journal-title":"IEEE Internet Things J."},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Jalil, S.Q., Rehmani, M., and Chalup, S. (2020, January 19\u201324). DQR: Deep Q-Routing in Software Defined Networks. Proceedings of the 2020 International Joint Conference on Neural Networks (IJCNN), Glasgow, UK.","DOI":"10.1109\/IJCNN48605.2020.9206767"},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"107891","DOI":"10.1016\/j.comnet.2021.107891","article-title":"ScaleDRL: A scalable deep reinforcement learning approach for traffic engineering in SDN with pinning control","volume":"190","author":"Sun","year":"2021","journal-title":"Comput. 
Netw."},{"key":"ref_20","first-page":"93","article-title":"SDN Routing Optimization Algorithm Based on Reinforcement Learning","volume":"57","author":"Che","year":"2021","journal-title":"Comput. Eng. Appl."},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Su\u00e1rez-Varela, J., Mestres, A., Yu, J., Kuang, L., Feng, H., Barlet-Ros, P., and Cabellos-Aparicio, A. (2019, January 3\u20137). Routing based on deep reinforcement learning in optical transport networks. Proceedings of the 2019 Optical Fiber Communications Conference and Exhibition (OFC), San Diego, CA, USA.","DOI":"10.1364\/OFC.2019.M2A.6"},{"key":"ref_22","unstructured":"Haarnoja, T., Zhou, A., Abbeel, P., and Levine, S. (2018, January 10\u201315). Soft actor-critic: Off-policy maximum entropy deep reinforcement learning with a stochastic actor. Proceedings of the 35th International Conference on Machine Learning (ICML 2018), Stockholm, Sweden."},{"key":"ref_23","unstructured":"Kumar, A., Fu, J., Soh, M., Tucker, G., and Levine, S. (2019, January 8\u201314). Stabilizing off-policy Q-learning via bootstrapping error reduction. Proceedings of the 33rd International Conference on Neural Information Processing Systems, Vancouver, BC, Canada."},{"key":"ref_24","first-page":"6742","article-title":"Double reinforcement learning for efficient off-policy evaluation in Markov decision processes","volume":"21","author":"Kallus","year":"2020","journal-title":"J. Mach. Learn. Res."},{"key":"ref_25","doi-asserted-by":"crossref","first-page":"167","DOI":"10.1049\/cit2.12043","article-title":"Target-driven visual navigation in indoor scenes using reinforcement learning and imitation learning","volume":"7","author":"Qiang","year":"2022","journal-title":"CAAI Trans. Intell. Technol."},{"key":"ref_26","unstructured":"Agarwal, R., Schuurmans, D., and Norouzi, M. (2020, January 13\u201318). An optimistic perspective on offline reinforcement learning. 
Proceedings of the 37th International Conference on Machine Learning (ICML 2020), Virtual Event."},{"key":"ref_27","doi-asserted-by":"crossref","unstructured":"Shahri, E., Pedreiras, P., and Almeida, L. (2022). Extending MQTT with Real-Time Communication Services Based on SDN. Sensors, 22.","DOI":"10.3390\/s22093162"},{"key":"ref_28","unstructured":"Almasan, P., Su\u00e1rez-Varela, J., Badia-Sampera, A., Rusek, K., Barlet-Ros, P., and Cabellos-Aparicio, A. (2020). Deep Reinforcement Learning meets Graph Neural Networks: Exploring a routing optimization use case. arXiv."},{"key":"ref_29","doi-asserted-by":"crossref","first-page":"241","DOI":"10.1007\/s11704-019-8208-z","article-title":"A survey on ensemble learning","volume":"14","author":"Dong","year":"2020","journal-title":"Front. Comput. Sci."},{"key":"ref_30","doi-asserted-by":"crossref","first-page":"98","DOI":"10.1016\/j.neucom.2020.09.042","article-title":"A new method of data missing estimation with FNN-based tensor heterogeneous ensemble learning for internet of vehicle","volume":"420","author":"Zhang","year":"2021","journal-title":"Neurocomputing"},{"key":"ref_31","doi-asserted-by":"crossref","first-page":"321","DOI":"10.1080\/13658816.2020.1808897","article-title":"A comparative study of heterogeneous ensemble-learning techniques for landslide susceptibility mapping","volume":"35","author":"Fang","year":"2021","journal-title":"Int. J. Geogr. Inf. Sci."},{"key":"ref_32","doi-asserted-by":"crossref","unstructured":"Lei, L., Kou, L., Zhan, X., Zhang, J., and Ren, Y. (2022). An Anomaly Detection Algorithm Based on Ensemble Learning for 5G Environment. Sensors, 22.","DOI":"10.3390\/s22197436"},{"key":"ref_33","doi-asserted-by":"crossref","first-page":"81","DOI":"10.1109\/35.900635","article-title":"Issues for routing in the optical layer","volume":"39","author":"Strand","year":"2001","journal-title":"IEEE Commun. Mag."},{"key":"ref_34","unstructured":"Chen, R., Sidor, S., Abbeel, P., and Schulman, J. (2017). 
UCB exploration via Q-ensembles. arXiv."}],"container-title":["Sensors"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1424-8220\/22\/21\/8139\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T01:01:48Z","timestamp":1760144508000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1424-8220\/22\/21\/8139"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,10,24]]},"references-count":34,"journal-issue":{"issue":"21","published-online":{"date-parts":[[2022,11]]}},"alternative-id":["s22218139"],"URL":"https:\/\/doi.org\/10.3390\/s22218139","relation":{},"ISSN":["1424-8220"],"issn-type":[{"value":"1424-8220","type":"electronic"}],"subject":[],"published":{"date-parts":[[2022,10,24]]}}}