{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,5]],"date-time":"2026-02-05T10:39:13Z","timestamp":1770287953171,"version":"3.49.0"},"reference-count":25,"publisher":"SAGE Publications","issue":"6","license":[{"start":{"date-parts":[[2019,7,3]],"date-time":"2019-07-03T00:00:00Z","timestamp":1562112000000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/journals.sagepub.com\/page\/policies\/text-and-data-mining-license"}],"content-domain":{"domain":["journals.sagepub.com"],"crossmark-restriction":true},"short-container-title":["Journal of Intelligent &amp; Fuzzy Systems"],"published-print":{"date-parts":[[2019,12,23]]},"abstract":"<jats:p>Transfer learning has been identified as conducive to improving the speed of machine learning in many areas. In multi-task reinforcement learning, transfer learning can assist the transfer of experience between different tasks. The research conducted in this article focuses on two aspects. On the one hand, multi-task parallel transfer learning can improve the learning speed of parallel learning tasks. On the other hand, learning from the current optimal experience can help propagate rewards from the target point back to the starting point; this self-learning can also accelerate the convergence of reinforcement learning. Building on these two aspects, this paper uses the idea of particle swarm optimization (PSO) to conduct self-learning and interactive learning in multi-task parallel learning. A new multi-task learning algorithm named PSO-MTPRL (Multi-Task Parallel Reinforcement Learning based on PSO) is proposed. Based on the idea of the PSO algorithm, the Boltzmann strategy, the Self-Learning Process (SLP) and the Interactive Learning Process (ILP) are selected probabilistically. Based on the characteristics of reinforcement learning, a segmented learning model is recommended. 
In the early learning stages, the complete Boltzmann exploration strategy is applied, and the B-SLP-ILP (Boltzmann-SLP-ILP) learning procedure is conducted exclusively in the middle stage of learning. In the late learning stages, Boltzmann exploration is applied again. The segmented learning model helps balance exploration and exploitation, while also ensuring that all tasks converge.<\/jats:p>","DOI":"10.3233\/jifs-190209","type":"journal-article","created":{"date-parts":[[2019,7,5]],"date-time":"2019-07-05T10:55:27Z","timestamp":1562324127000},"page":"8567-8575","update-policy":"https:\/\/doi.org\/10.1177\/sage-journals-update-policy","source":"Crossref","is-referenced-by-count":5,"title":["Particle swarm optimization based multi-task parallel reinforcement learning algorithm"],"prefix":"10.1177","volume":"37","author":[{"given":"Duan","family":"Junhua","sequence":"first","affiliation":[{"name":"School of Computer, Northwestern Polytechnical University, Beilin District, Xi\u2019an Shaanxi, P.R. China"}]},{"given":"Zhu","family":"Yi-an","sequence":"additional","affiliation":[{"name":"School of Computer, Northwestern Polytechnical University, Beilin District, Xi\u2019an Shaanxi, P.R. China"}]},{"given":"Zhong","family":"Dong","sequence":"additional","affiliation":[{"name":"School of Computer, Northwestern Polytechnical University, Beilin District, Xi\u2019an Shaanxi, P.R. China"}]},{"given":"Zhang","family":"Lixiang","sequence":"additional","affiliation":[{"name":"School of Computer, Northwestern Polytechnical University, Beilin District, Xi\u2019an Shaanxi, P.R. China"}]},{"given":"Zhang","family":"Lin","sequence":"additional","affiliation":[{"name":"School of Computer, Northwestern Polytechnical University, Beilin District, Xi\u2019an Shaanxi, P.R. 
China"}]}],"member":"179","published-online":{"date-parts":[[2019,7,3]]},"reference":[{"issue":"1","key":"e_1_3_2_2_2","first-page":"1","article-title":"Reinforcement Learning Algorithms:Survey and Classification[J]","volume":"10","author":"Ravishankar N.R.","unstructured":"RavishankarN.R. and VijayakumarM.V., Reinforcement Learning Algorithms:Survey and Classification[J], Indian Journal of Science and Technology, 2017 10(1), 1\u20138.","journal-title":"Indian Journal of Science and Technology, 2017"},{"key":"e_1_3_2_3_2","doi-asserted-by":"publisher","DOI":"10.1007\/s10846-017-0468-y"},{"key":"e_1_3_2_4_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.robot.2019.02.013"},{"issue":"99","key":"e_1_3_2_5_2","first-page":"1","article-title":"Expert Level Control of Ramp Metering Based on Multi-Task Deep Reinforcement Learning[J]","author":"Belletti F","year":"2017","unstructured":"BellettiF, HazizaD, GomesG, et al. Expert Level Control of Ramp Metering Based on Multi-Task Deep Reinforcement Learning[J]. IEEE Transactions on Intelligent Transportation Systems, 2017, PP(99): 1\u201310.","journal-title":"IEEE Transactions on Intelligent Transportation Systems"},{"key":"e_1_3_2_6_2","first-page":"5907","article-title":"Learning task-parametrized assistive strategies for exoskeleton robots by multi-task reinforcement learning[C]","author":"Hamaya M","year":"2017","unstructured":"HamayaM, MatsubaraT, NodaT, et al. Learning task-parametrized assistive strategies for exoskeleton robots by multi-task reinforcement learning[C]. IEEE International Conference on Robotics & Automation. IEEE, 2017, 5907\u20135912.","journal-title":"IEEE International Conference on Robotics & Automation. IEEE"},{"key":"e_1_3_2_7_2","doi-asserted-by":"publisher","DOI":"10.1109\/MCI.2019.2901089"},{"key":"e_1_3_2_8_2","first-page":"237","article-title":"Hierarchical Reinforcement Learning Framework towards Multi-agent Navigation[C]","author":"Ding W.","year":"2018","unstructured":"DingW., LiS. 
and QianH., Hierarchical Reinforcement Learning Framework towards Multi-agent Navigation[C]. IEEE International Conference on Robotics and Biomimetics, ROBIO (2018), 237\u2013242.","journal-title":"IEEE International Conference on Robotics and Biomimetics"},{"key":"e_1_3_2_9_2","first-page":"4108","article-title":"Deep Decentralized Multi-task Multi-Agent Reinforcement Learning under Partial Observability[C]","volume":"6","author":"Omidshafiei S.","year":"2017","unstructured":"OmidshafieiS., PazisJ., AmatoC., and al et, Deep Decentralized Multi-task Multi-Agent Reinforcement Learning under Partial Observability[C], The proceeding of 34th International Conference on Machine Learning, ICML 6 (2017), 4108\u20134122.","journal-title":"The proceeding of 34th International Conference on Machine Learning, ICML"},{"key":"e_1_3_2_10_2","article-title":"Autonomous Extracting a Hierarchical Structure of Tasks in Reinforcement Learning and Multi-task Reinforcement Learning[J]","author":"Ghazanfari Behzad","year":"2017","unstructured":"GhazanfariBehzad, Taylor MatthewE. Autonomous Extracting a Hierarchical Structure of Tasks in Reinforcement Learning and Multi-task Reinforcement Learning[J]. arXiv preprint:1709.04579, 2017.","journal-title":"arXiv preprint:1709.04579"},{"key":"e_1_3_2_11_2","first-page":"2146","article-title":"Nonparametric Risk and Stability Analysis for Multi-Task Learning Problems[C]","volume":"2016","author":"Wang S.","unstructured":"WangS., Nonparametric Risk and Stability Analysis for Multi-Task Learning Problems[C], International Joint Conference on Artificial Intelligence. AAAI 2016, (Press), 2146\u20132152.","journal-title":"International Joint Conference on Artificial Intelligence. AAAI"},{"key":"e_1_3_2_12_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.bica.2018.09.003"},{"key":"e_1_3_2_13_2","first-page":"91","article-title":"Towards Knowledge Transfer in Deep Reinforcement Learning[C]. 
Proceedings - 5th Brazilian Conference on Intelligent Systems","author":"Glatt R.","year":"2016","unstructured":"GlattR., SilvaF.L.D. and CostaA.H.R., Towards Knowledge Transfer in Deep Reinforcement Learning[C]. Proceedings - 5th Brazilian Conference on Intelligent Systems, BRACIS (2016), 91\u201396.","journal-title":"BRACIS"},{"key":"e_1_3_2_14_2","unstructured":"Silva FelipeLeno Da Costa AnnaHelena Reali. Transfer learning for multiagent reinforcement learning systems[C]: Proceedings of International Joint Conference on Artificial Intelligence 2016: 3982\u20133983."},{"key":"e_1_3_2_15_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.artint.2015.05.008"},{"key":"e_1_3_2_16_2","first-page":"3516","article-title":"Multi-Task Domain Adaptation for Deep Learning of Instance Grasping from Simulation[C]. Proceedings of IEEE International Conference on Robotics and Automation","author":"Fang K.","year":"2018","unstructured":"FangK., BaiY. and HinterstoisserS., et al., Multi-Task Domain Adaptation for Deep Learning of Instance Grasping from Simulation[C]. Proceedings of IEEE International Conference on Robotics and Automation, ICRA (2018), 3516\u20133523.","journal-title":"ICRA"},{"key":"e_1_3_2_17_2","volume-title":"Parallel Transfer Learning: Accelerating Reinforcement Learning in Multi-Agent Systems[D]","author":"Adam Taylor","year":"2016","unstructured":"AdamTaylor Parallel Transfer Learning: Accelerating Reinforcement Learning in Multi-Agent Systems[D], University of Dublin 2016."},{"key":"e_1_3_2_18_2","unstructured":"MannionPatrick DugganJim HowleyEnda. Parallel Learning using Heterogeneous Agents[C]: Proceedings of Adaptive and Learning Agents Workshop 2015."},{"key":"e_1_3_2_19_2","first-page":"1","article-title":"Speedy q-learning: a computationally efficient reinforcement learning algorithm with a near optimal rate of convergence[J]","author":"Azar M. G","year":"2013","unstructured":"AzarM. G, MunosR, GhavamzadehM, et al. 
Speedy q-learning: a computationally efficient reinforcement learning algorithm with a near optimal rate of convergence[J]. Journal of Machine Learning Research, 2013: 1\u201326.","journal-title":"Journal of Machine Learning Research"},{"key":"e_1_3_2_20_2","doi-asserted-by":"crossref","unstructured":"TanF YanP GuanX. Deep Reinforcement Learning: From Q-Learning to Deep Q-Learning. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) 2017: 475\u2013483.","DOI":"10.1007\/978-3-319-70093-9_50"},{"key":"e_1_3_2_21_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.knosys.2019.03.018"},{"issue":"1","key":"e_1_3_2_22_2","first-page":"179","article-title":"A filtering mechanism based optimization for particle swarm optimization algorithm[C]","volume":"9","author":"Ji W.","year":"2016","unstructured":"JiW., ZhuS., A filtering mechanism based optimization for particle swarm optimization algorithm[C], International Journal of Future Generation Communication and Networking 9(1) (2016), 179\u2013186.","journal-title":"International Journal of Future Generation Communication and Networking"},{"key":"e_1_3_2_23_2","first-page":"50","article-title":"A new approach to the Restart Genetic Algorithm to solve zero-one knapsack problem[C]. Proceedings of IEEE 4th International Conference on Knowledge-Based Engineering and Innovation","author":"Gupta I.K.","year":"2017","unstructured":"GuptaI.K., ChoubeyA. and ChoubeyS., A new approach to the Restart Genetic Algorithm to solve zero-one knapsack problem[C]. Proceedings of IEEE 4th International Conference on Knowledge-Based Engineering and Innovation, KBEI (2017), 50\u201353.","journal-title":"KBEI"},{"key":"e_1_3_2_24_2","first-page":"668","volume-title":"Proceedings of IEEE International Conference on Electro Information Technology","author":"Adam B","year":"2016","unstructured":"AdamB, AlexanderU. 
A new parameter adaptation method for Genetic Algorithms and Ant Colony Optimization algorithms[C]. Proceedings of IEEE International Conference on Electro Information Technology, 2016: 668\u2013673."},{"issue":"2","key":"e_1_3_2_25_2","first-page":"171","article-title":"Optimization of Straight Cylindrical Turning Using Artificial Bee Colony (ABC) Algorithm[J]","volume":"98","author":"Prasanth R.S.S.","year":"2016","unstructured":"PrasanthR.S.S. and RajK.H., Optimization of Straight Cylindrical Turning Using Artificial Bee Colony (ABC) Algorithm[J], Journal of the Institution of Engineers 98(2) (2016), 171\u2013177.","journal-title":"Journal of the Institution of Engineers"},{"key":"e_1_3_2_26_2","doi-asserted-by":"publisher","DOI":"10.1007\/s13748-012-0026-6"}],"container-title":["Journal of Intelligent &amp; Fuzzy Systems"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.3233\/JIFS-190209","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/journals.sagepub.com\/doi\/full-xml\/10.3233\/JIFS-190209","content-type":"application\/xml","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.3233\/JIFS-190209","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,2,4]],"date-time":"2026-02-04T20:18:30Z","timestamp":1770236310000},"score":1,"resource":{"primary":{"URL":"https:\/\/journals.sagepub.com\/doi\/10.3233\/JIFS-190209"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2019,7,3]]},"references-count":25,"journal-issue":{"issue":"6","published-print":{"date-parts":[[2019,12,23]]}},"alternative-id":["10.3233\/JIFS-190209"],"URL":"https:\/\/doi.org\/10.3233\/jifs-190209","relation":{},"ISSN":["1064-1246","1875-8967"],"issn-type":[{"value":"1064-1246","type":"print"},{"value":"1875-8967","type":"electronic"}],"sub
ject":[],"published":{"date-parts":[[2019,7,3]]}}}