{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,7]],"date-time":"2026-02-07T10:54:37Z","timestamp":1770461677704,"version":"3.49.0"},"reference-count":28,"publisher":"MDPI AG","issue":"9","license":[{"start":{"date-parts":[[2022,4,27]],"date-time":"2022-04-27T00:00:00Z","timestamp":1651017600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"National Natural Science Foundation","award":["2021Z040"],"award-info":[{"award-number":["2021Z040"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Electronics"],"abstract":"<jats:p>Sea freight is one of the most important ways for the transportation and distribution of coal and other bulk cargo. This paper proposes a method for optimizing the scheduling efficiency of the bulk cargo loading process based on deep reinforcement learning. The process includes a large number of states and possible choices that need to be taken into account, which are currently performed by skillful scheduling engineers on site. In terms of modeling, we extracted important information based on actual working data of the terminal to form the state space of the model. The yard information and the demand information of the ship are also considered. The scheduling output of each convey path from the yard to the cabin is the action of the agent. To avoid conflicts of occupying one machine at same time, certain restrictions are placed on whether the action can be executed. Based on Double DQN, an improved deep reinforcement learning method is proposed with a fully connected network structure and selected action sets according to the value of the network and the occupancy status of environment. 
To make the network converge more quickly, an improved epsilon-greedy exploration strategy is also proposed, which uses different exploration rates for completely random selection and for random selection among feasible actions. After training, an improved scheduling result is obtained when tasks arrive randomly and the yard state is random. One important contribution of this paper is integrating the useful features of the working process of the bulk cargo terminal into a state set, dividing the scheduling process into discrete actions, and thereby reducing the scheduling problem to simple inputs and outputs. Another major contribution is the design of a reinforcement learning algorithm with improved training efficiency for the bulk cargo terminal scheduling problem, which provides a practical example of solving bulk cargo terminal scheduling problems using reinforcement learning.<\/jats:p>","DOI":"10.3390\/electronics11091390","type":"journal-article","created":{"date-parts":[[2022,4,27]],"date-time":"2022-04-27T13:40:57Z","timestamp":1651066857000},"page":"1390","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":17,"title":["Intelligent Scheduling Method for Bulk Cargo Terminal Loading Process Based on Deep Reinforcement Learning"],"prefix":"10.3390","volume":"11","author":[{"given":"Changan","family":"Li","sequence":"first","affiliation":[{"name":"Key Laboratory of Advanced Forging & Stamping Technology and Science of Ministry of Education of China, Yanshan University, Qinhuangdao 066004, China"},{"name":"Chnenergy (Tianjin) Port Co., Ltd., Tianjin 300450, China"}]},{"given":"Sirui","family":"Wu","sequence":"additional","affiliation":[{"name":"Research Institute of Intelligent Control and Systems, Harbin Institute of Technology, Harbin 150001, 
China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-7601-4332","authenticated-orcid":false,"given":"Zhan","family":"Li","sequence":"additional","affiliation":[{"name":"Research Institute of Intelligent Control and Systems, Harbin Institute of Technology, Harbin 150001, China"},{"name":"Ningbo Institute of Intelligent Equipment Technology Co., Ltd., Ningbo 315201, China"}]},{"given":"Yuxiao","family":"Zhang","sequence":"additional","affiliation":[{"name":"Research Institute of Intelligent Control and Systems, Harbin Institute of Technology, Harbin 150001, China"}]},{"given":"Lijie","family":"Zhang","sequence":"additional","affiliation":[{"name":"Key Laboratory of Advanced Forging & Stamping Technology and Science of Ministry of Education of China, Yanshan University, Qinhuangdao 066004, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-4299-8270","authenticated-orcid":false,"given":"Luis","family":"Gomes","sequence":"additional","affiliation":[{"name":"NOVA School of Sciences and Technology\u2014Centre of Technology and Systems, NOVA University Lisbon, 2829-516 Monte de Caparica, Portugal"}]}],"member":"1968","published-online":{"date-parts":[[2022,4,27]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"052044","DOI":"10.1088\/1742-6596\/1601\/5\/052044","article-title":"Research on Intelligent Optimization of Bulk Cargo Terminal Control System","volume":"1601","author":"Wang","year":"2020","journal-title":"J. Phys. Conf. Ser."},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"170","DOI":"10.1016\/j.eswa.2017.06.010","article-title":"A Machine Learning-based system for berth scheduling at bulk terminals","volume":"87","year":"2017","journal-title":"Expert Syst. Appl."},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"3","DOI":"10.1016\/j.retrec.2012.11.001","article-title":"Modeling yard crane operators as reinforcement learning agents","volume":"42","author":"Fotuhi","year":"2013","journal-title":"Res. Transp. 
Econ."},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"269","DOI":"10.1057\/palgrave.mel.9100186","article-title":"The berth allocation problem with service time and delay time objectives","volume":"9","author":"Imai","year":"2007","journal-title":"Marit. Econ. Logist."},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"75","DOI":"10.1016\/j.tre.2015.06.008","article-title":"Integrated berth allocation and quay crane assignment problem: Set partitioning models and computational results","volume":"81","author":"Iris","year":"2015","journal-title":"Transp. Res. Part E Logist. Transp. Rev."},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"142","DOI":"10.1016\/j.trd.2017.05.002","article-title":"The multi-port berth allocation problem with speed optimization and emission considerations","volume":"54","author":"Venturini","year":"2017","journal-title":"Transp. Res. Part D Transp. Environ."},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"258","DOI":"10.1016\/j.cie.2016.04.008","article-title":"Behavior perception-based disruption models for berth allocation and quay crane assignment problems","volume":"97","author":"Liu","year":"2016","journal-title":"Comput. Ind. Eng."},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"27","DOI":"10.1002\/1520-6750(198902)36:1<27::AID-NAV3220360103>3.0.CO;2-0","article-title":"An interactive optimization system for bulk-cargo ship scheduling","volume":"36","author":"Fisher","year":"1989","journal-title":"Nav. Res. Logist."},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"834","DOI":"10.1057\/palgrave.jors.2600973","article-title":"A combined ship scheduling and allocation problem","volume":"51","author":"Fagerholt","year":"2000","journal-title":"J. Oper. Res. 
Soc."},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"606","DOI":"10.1016\/j.cie.2010.12.018","article-title":"Model and heuristic for berth allocation in tidal bulk ports with stock level constraints","volume":"60","author":"Barros","year":"2011","journal-title":"Comput. Ind. Eng."},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"926","DOI":"10.1016\/j.ejor.2016.08.073","article-title":"A branch and price algorithm to solve the integrated production planning and scheduling in bulk ports","volume":"258","author":"Menezes","year":"2017","journal-title":"Eur. J. Oper. Res."},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"156","DOI":"10.1016\/j.swevo.2015.11.002","article-title":"A HPSO for solving dynamic and discrete berth allocation problem and dynamic quay crane assignment problem simultaneously","volume":"27","author":"Hsu","year":"2016","journal-title":"Swarm Evol. Comput."},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"54","DOI":"10.1016\/j.ejor.2011.01.021","article-title":"A decision model for berth allocation under uncertainty","volume":"212","author":"Zhen","year":"2011","journal-title":"Eur. J. Oper. Res."},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Lujan, E., Vergara, E., Rodriguez-Melquiades, J., Jim\u00e9nez-Carri\u00f3n, M., Sabino-Escobar, C., and Gutierrez, F. (2021). A Fuzzy Optimization Model for the Berth Allocation Problem and Quay Crane Allocation Problem (BAP + QCAP) with n Quays. J. Mar. Sci. Eng., 9.","DOI":"10.3390\/jmse9020152"},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"114215","DOI":"10.1016\/j.eswa.2020.114215","article-title":"A reduced vns based approach for the dynamic continuous berth allocation problem in bulk terminals with tidal constraints","volume":"168","author":"Cheimanoff","year":"2021","journal-title":"Expert Syst. Appl."},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"Sezer, A., and Altan, A. (2021, January 11\u201313). 
Optimization of deep learning model parameters in classification of solder paste defects. Proceedings of the 2021 3rd International Congress on Human-Computer Interaction, Optimization and Robotic Applications (HORA), Ankara, Turkey.","DOI":"10.1109\/HORA52670.2021.9461342"},{"key":"ref_17","first-page":"3501613","article-title":"Effective Fault Diagnosis Based on Wavelet and Convolutional Attention Neural Network for Induction Motors","volume":"71","author":"Tran","year":"2021","journal-title":"IEEE Trans. Instrum. Meas."},{"key":"ref_18","unstructured":"Tassel, P., Gebser, M., and Schekotihin, K. (2021). A reinforcement learning environment for job-shop scheduling. arXiv."},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"110686","DOI":"10.1016\/j.measurement.2021.110686","article-title":"Effective IoT-based Deep Learning Platform for Online Fault Diagnosis of Power Transformers Against Cyberattack and Data Uncertainties","volume":"190","author":"Tran","year":"2022","journal-title":"Measurement"},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"291","DOI":"10.1108\/SSMT-04-2021-0013","article-title":"Detection of solder paste defects with an optimization-based deep learning model using image processing techniques","volume":"33","author":"Sezer","year":"2021","journal-title":"Solder. Surf. Mt. Technol."},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"23186","DOI":"10.1109\/ACCESS.2022.3153471","article-title":"Reliable Deep Learning and IoT-Based Monitoring System for Secure Computer Numerical Control Machines Against Cyber-Attacks with Experimental Verification","volume":"10","author":"Tran","year":"2022","journal-title":"IEEE Access"},{"key":"ref_22","doi-asserted-by":"crossref","unstructured":"Fran\u00e7ois-Lavet, V., Henderson, P., Islam, R., Bellemare, M.G., and Pineau, J. (2018). An introduction to deep reinforcement learning. 
arXiv.","DOI":"10.1561\/9781680835397"},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"529","DOI":"10.1038\/nature14236","article-title":"Human-level control through deep reinforcement learning","volume":"518","author":"Mnih","year":"2015","journal-title":"Nature"},{"key":"ref_24","doi-asserted-by":"crossref","unstructured":"Van Hasselt, H., Guez, A., and Silver, D. (2016, January 12\u201317). Deep reinforcement learning with double q-learning. Proceedings of the AAAI Conference on Artificial Intelligence, Phoenix, AZ, USA.","DOI":"10.1609\/aaai.v30i1.10295"},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Hessel, M., Modayil, J., Van Hasselt, H., Schaul, T., Ostrovski, G., Dabney, W., and Silver, D. (2018, January 2\u20137). Rainbow: Combining improvements in deep reinforcement learning. Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, New Orleans, LA, USA.","DOI":"10.1609\/aaai.v32i1.11796"},{"key":"ref_26","unstructured":"Lillicrap, T.P., Hunt, J.J., Pritzel, A., Heess, N., Erez, T., Tassa, Y., Silver, D., and Wierstra, D. (2015). Continuous control with deep reinforcement learning. arXiv."},{"key":"ref_27","unstructured":"Mnih, V., Badia, A.P., Mirza, M., Graves, A., Lillicrap, T., Harley, T., Silver, D., and Kavukcuoglu, K. (2016, January 19\u201324). Asynchronous methods for deep reinforcement learning. Proceedings of the International Conference on Machine Learning, New York, NY, USA."},{"key":"ref_28","doi-asserted-by":"crossref","first-page":"895","DOI":"10.1007\/s10462-021-09996-w","article-title":"Multi-agent deep reinforcement learning: A survey","volume":"55","author":"Gronauer","year":"2022","journal-title":"Artif. Intell. 
Rev."}],"container-title":["Electronics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2079-9292\/11\/9\/1390\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T23:01:51Z","timestamp":1760137311000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2079-9292\/11\/9\/1390"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,4,27]]},"references-count":28,"journal-issue":{"issue":"9","published-online":{"date-parts":[[2022,5]]}},"alternative-id":["electronics11091390"],"URL":"https:\/\/doi.org\/10.3390\/electronics11091390","relation":{},"ISSN":["2079-9292"],"issn-type":[{"value":"2079-9292","type":"electronic"}],"subject":[],"published":{"date-parts":[[2022,4,27]]}}}