{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,14]],"date-time":"2026-04-14T14:36:27Z","timestamp":1776177387226,"version":"3.50.1"},"reference-count":37,"publisher":"Springer Science and Business Media LLC","issue":"3","license":[{"start":{"date-parts":[[2025,2,19]],"date-time":"2025-02-19T00:00:00Z","timestamp":1739923200000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by-nc-nd\/4.0"},{"start":{"date-parts":[[2025,2,19]],"date-time":"2025-02-19T00:00:00Z","timestamp":1739923200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by-nc-nd\/4.0"}],"funder":[{"name":"National key R & D project","award":["2021YFB3301802"],"award-info":[{"award-number":["2021YFB3301802"]}]},{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["62302103"],"award-info":[{"award-number":["62302103"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100022952","name":"Guangdong Provincial Key Laboratory of Sensor Technology and Biomedical Instrument","doi-asserted-by":"publisher","award":["2020B1212060069"],"award-info":[{"award-number":["2020B1212060069"]}],"id":[{"id":"10.13039\/100022952","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Complex Intell. Syst."],"published-print":{"date-parts":[[2025,3]]},"DOI":"10.1007\/s40747-025-01816-w","type":"journal-article","created":{"date-parts":[[2025,2,19]],"date-time":"2025-02-19T14:39:04Z","timestamp":1739975944000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":8,"title":["Control strategy of robotic manipulator based on multi-task reinforcement learning"],"prefix":"10.1007","volume":"11","author":[{"given":"Tao","family":"Wang","sequence":"first","affiliation":[]},{"given":"Ziming","family":"Ruan","sequence":"additional","affiliation":[]},{"given":"Yuyan","family":"Wang","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0003-2800-4647","authenticated-orcid":false,"given":"Chong","family":"Chen","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2025,2,19]]},"reference":[{"key":"1816_CR1","doi-asserted-by":"publisher","first-page":"3953","DOI":"10.1109\/TASE.2022.3141248","volume":"19","author":"RNA Algburi","year":"2022","unstructured":"Algburi RNA, Gao H, Al-Huda Z (2022) Improvement of an industrial robotic flaw detection system. IEEE Trans Autom Sci Eng 19:3953\u20133967","journal-title":"IEEE Trans Autom Sci Eng"},{"key":"1816_CR2","doi-asserted-by":"publisher","first-page":"2015","DOI":"10.1177\/01423312221142564","volume":"45","author":"S Guan","year":"2023","unstructured":"Guan S, Zhuang Z, Tao H, Chen Y, Stojanovic V, Paszke W (2023) Feedback-aided PD-type iterative learning control for time-varying systems with non-uniform trial lengths. Trans Inst Meas Control 45:2015\u20132026","journal-title":"Trans Inst Meas Control"},{"key":"1816_CR3","doi-asserted-by":"publisher","first-page":"7565","DOI":"10.1007\/s00521-021-06848-0","volume":"34","author":"RNA Algburi","year":"2022","unstructured":"Algburi RNA, Gao H, Al-Huda Z (2022) A new synergy of singular spectrum analysis with a conscious algorithm to detect faults in industrial robotics. Neural Comput Appl 34:7565\u20137580","journal-title":"Neural Comput Appl"},{"key":"1816_CR4","doi-asserted-by":"crossref","unstructured":"Zhong J, Wang T, Cheng L (2021) Collision-free path planning for welding manipulator via hybrid algorithm of deep reinforcement learning and inverse kinematics. Complex Intell Syst 8: 1899\u20131912","DOI":"10.1007\/s40747-021-00366-1"},{"key":"1816_CR5","doi-asserted-by":"publisher","first-page":"4569","DOI":"10.1109\/LRA.2023.3284660","volume":"8","author":"L Sun","year":"2023","unstructured":"Sun L, Zhang H, Xu W, Tomizuka M (2023) Efficient multi-task and transfer reinforcement learning with parameter-compositional framework. IEEE Rob Autom Lett 8:4569\u20134576","journal-title":"IEEE Rob Autom Lett"},{"key":"1816_CR6","unstructured":"Kalashnikov D, Varley J, Chebotar Y, Swanson B, Jonschkowski R, Finn C, Levine S, Hausman K (2021) Mt-Opt: Continuous multi-task robotic reinforcement learning at scale. arXiv preprint arXiv:2104.08212"},{"key":"1816_CR7","first-page":"1094","volume":"PMLR","author":"T Yu","year":"2020","unstructured":"Yu T, Quillen D, He Z, Julian R, Hausman K, Finn C, Levine S (2020) Meta-world: a benchmark and evaluation for multi-task and meta reinforcement learning. Conf Robot Learn PMLR:1094\u20131100","journal-title":"Conf Robot Learn"},{"key":"1816_CR8","doi-asserted-by":"publisher","first-page":"102227","DOI":"10.1016\/j.rcim.2021.102227","volume":"73","author":"R Zhang","year":"2022","unstructured":"Zhang R, Lv Q, Li J, Bao J, Liu T, Liu S (2022) A reinforcement learning method for human-robot collaboration in assembly tasks. Robot Comput Integr Manuf 73:102227","journal-title":"Robot Comput Integr Manuf"},{"key":"1816_CR9","doi-asserted-by":"crossref","unstructured":"Han Y, Li T, Wang Q (2024) A DQN based approach for large-scale EVs charging scheduling. Complex & Intelligent Systems","DOI":"10.1007\/s40747-024-01587-w"},{"key":"1816_CR10","doi-asserted-by":"crossref","unstructured":"Du Z, Xie X, Qu Z, Hu Y, Stojanovic V (2024) Dynamic event-triggered consensus control for interval type-2 fuzzy multi-agent systems. IEEE Trans Circuits Syst I Regul Pap","DOI":"10.1109\/TCSI.2024.3371492"},{"key":"1816_CR11","doi-asserted-by":"publisher","first-page":"1943","DOI":"10.1177\/01423312231225782","volume":"46","author":"Y Tao","year":"2024","unstructured":"Tao Y, Tao H, Zhuang Z, Stojanovic V, Paszke W (2024) Quantized iterative learning control of communication-constrained systems with encoding and decoding mechanism. Trans Inst Meas Control 46:1943\u20131954","journal-title":"Trans Inst Meas Control"},{"key":"1816_CR12","doi-asserted-by":"crossref","unstructured":"Bai F, Zhang H, Tao T, Wu Z, Wang Y, Xu B (2023) Picor: Multi-task deep reinforcement learning with policy correction, Proceedings of the AAAI Conference on Artificial Intelligence, pp. 6728\u20136736","DOI":"10.1609\/aaai.v37i6.25825"},{"key":"1816_CR13","doi-asserted-by":"publisher","first-page":"13530","DOI":"10.1109\/TVT.2023.3276898","volume":"72","author":"N Zhao","year":"2023","unstructured":"Zhao N, Pei Y, Liang Y-C, Niyato D (2023) Multi-agent deep reinforcement learning based incentive mechanism for multi-task federated edge learning. IEEE Trans Veh Technol 72:13530\u201313535","journal-title":"IEEE Trans Veh Technol"},{"key":"1816_CR14","doi-asserted-by":"publisher","first-page":"8579","DOI":"10.1109\/TIE.2021.3105977","volume":"69","author":"S Li","year":"2021","unstructured":"Li S, Zheng P, Fan J, Wang L (2021) Toward proactive human\u2013robot collaborative assembly: a multimodal transfer-learning-enabled action prediction approach. IEEE Trans Industr Electron 69:8579\u20138588","journal-title":"IEEE Trans Industr Electron"},{"key":"1816_CR15","doi-asserted-by":"publisher","first-page":"5586","DOI":"10.1109\/TKDE.2021.3070203","volume":"34","author":"Y Zhang","year":"2021","unstructured":"Zhang Y, Yang Q (2021) A survey on multi-task learning. IEEE Trans Knowl Data Eng 34:5586\u20135609","journal-title":"IEEE Trans Knowl Data Eng"},{"key":"1816_CR16","doi-asserted-by":"publisher","first-page":"104048","DOI":"10.1016\/j.artint.2023.104048","volume":"326","author":"C Bai","year":"2024","unstructured":"Bai C, Wang L, Hao J, Yang Z, Zhao B, Wang Z, Li X (2024) Pessimistic value iteration for multi-task data sharing in offline reinforcement learning. Artif Intell 326:104048","journal-title":"Artif Intell"},{"key":"1816_CR17","doi-asserted-by":"publisher","first-page":"3812","DOI":"10.1109\/LRA.2023.3271445","volume":"8","author":"G Cheng","year":"2023","unstructured":"Cheng G, Dong L, Cai W, Sun C (2023) Multi-task reinforcement learning with attention-based mixture of experts. IEEE Rob Autom Lett 8:3812\u20133819","journal-title":"IEEE Rob Autom Lett"},{"key":"1816_CR18","doi-asserted-by":"publisher","first-page":"315","DOI":"10.1016\/j.jmsy.2023.02.009","volume":"67","author":"Y Ping","year":"2023","unstructured":"Ping Y, Liu Y, Zhang L, Wang L, Xu X (2023) Sequence generation for multi-task scheduling in cloud manufacturing with deep reinforcement learning. J Manuf Syst 67:315\u2013337","journal-title":"J Manuf Syst"},{"key":"1816_CR19","first-page":"5824","volume":"33","author":"T Yu","year":"2020","unstructured":"Yu T, Kumar S, Gupta A, Levine S, Hausman K, Finn C (2020) Gradient surgery for multi-task learning. Adv Neural Inf Process Syst 33:5824\u20135836","journal-title":"Adv Neural Inf Process Syst"},{"key":"1816_CR20","doi-asserted-by":"publisher","first-page":"8194","DOI":"10.1109\/LRA.2022.3185384","volume":"7","author":"H Zhang","year":"2022","unstructured":"Zhang H, Kan Z (2022) Temporal logic guided meta q-learning of multiple tasks. IEEE Rob Autom Lett 7:8194\u20138201","journal-title":"IEEE Rob Autom Lett"},{"key":"1816_CR21","first-page":"4767","volume":"33","author":"R Yang","year":"2020","unstructured":"Yang R, Xu H, Wu Y, Wang X (2020) Multi-task reinforcement learning with soft modularization. Adv Neural Inf Process Syst 33:4767\u20134777","journal-title":"Adv Neural Inf Process Syst"},{"key":"1816_CR22","doi-asserted-by":"crossref","unstructured":"Li W-H, Bilen H (2020) Knowledge distillation for multi-task learning, Computer Vision\u2013ECCV 2020 Workshops: Glasgow, UK, August 23\u201328, 2020, Proceedings, Part VI 16, Springer, pp. 163\u2013176","DOI":"10.1007\/978-3-030-65414-6_13"},{"key":"1816_CR23","doi-asserted-by":"crossref","unstructured":"Jacob GM, Agarwal V, Stenger B (2023) Online knowledge distillation for multi-task learning, Proceedings of the IEEE\/CVF Winter Conference on Applications of Computer Vision, pp. 2359\u20132368","DOI":"10.1109\/WACV56688.2023.00239"},{"key":"1816_CR24","doi-asserted-by":"crossref","unstructured":"Zentner K, Puri U, Zhang Y, Julian R, Sukhatme GS (2022) Efficient multi-task learning via iterated single-task transfer. 2022 IEEE\/RSJ International Conference on Intelligent Robots and, Systems (IROS). IEEE, pp. 10141\u201310146","DOI":"10.1109\/IROS47612.2022.9981244"},{"key":"1816_CR25","doi-asserted-by":"crossref","unstructured":"Duan Y, Chen X, Xu H, Chen Z, Liang X, Zhang T, Li Z (2021) Transnas-bench-101: Improving transferability and generalizability of cross-task neural architecture search. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, pp. 5251\u20135260","DOI":"10.1109\/CVPR46437.2021.00521"},{"key":"1816_CR26","doi-asserted-by":"crossref","unstructured":"Liang J, Meyerson E, Miikkulainen R (2018) Evolutionary architecture search for deep multitask networks. Proceedings of the genetic and evolutionary computation conference, pp. 466\u2013473","DOI":"10.1145\/3205455.3205489"},{"key":"1816_CR27","first-page":"8728","volume":"33","author":"X Sun","year":"2020","unstructured":"Sun X, Panda R, Feris R, Saenko K (2020) Adashare: learning what to share for efficient deep multi-task learning. Adv Neural Inf Process Syst 33:8728\u20138740","journal-title":"Adv Neural Inf Process Syst"},{"key":"1816_CR28","first-page":"3854","volume":"PMLR","author":"P Guo","year":"2020","unstructured":"Guo P., Lee C.-Y., Ulbricht D. (2020) Learning to branch for multi-task learning. Int Conf Mach Learn PMLR:3854\u20133863","journal-title":"Int Conf Mach Learn"},{"key":"1816_CR29","unstructured":"Javaloy A, Valera I (2021) Rotograd: Gradient homogenization in multitask learning. arXiv preprint arXiv:2103.02631"},{"key":"1816_CR30","doi-asserted-by":"crossref","unstructured":"Senushkin D, Patakin N, Kuznetsov A, Konushin A (2023) Independent component alignment for multi-task learning. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, pp. 20083\u201320093","DOI":"10.1109\/CVPR52729.2023.01923"},{"key":"1816_CR31","unstructured":"Suteu M, Guo Y (2019) Regularizing deep multi-task networks using orthogonal gradients. arXiv preprint arXiv:1912.06844"},{"key":"1816_CR32","doi-asserted-by":"crossref","unstructured":"Br\u00e4m T, Brunner G, Richter O, Wattenhofer R (2020) Attentive multi-task deep reinforcement learning, machine learning and knowledge discovery in databases: european conference. ECML PKDD 2019, W\u00fcrzburg, Germany, September 16\u201320, 2019, Proceedings, Part III, Springer, pp. 134\u2013149","DOI":"10.1007\/978-3-030-46133-1_9"},{"key":"1816_CR33","unstructured":"Sodhani S, Zhang A, Pineau J (2021) Multi-task reinforcement learning with context-based representations. International Conference on Machine Learning, PMLR, pp. 9767\u20139779"},{"key":"1816_CR34","first-page":"1861","volume":"PMLR","author":"T Haarnoja","year":"2018","unstructured":"Haarnoja T, Zhou A, Abbeel P, Levine S (2018) Soft actor-critic: off-policy maximum entropy deep reinforcement learning with a stochastic actor. Int Conf Mach Learn PMLR:1861\u20131870","journal-title":"Int Conf Mach Learn"},{"key":"1816_CR35","unstructured":"Lillicrap T (2015) Continuous control with deep reinforcement learning. arXiv preprint arXiv:1509.02971"},{"key":"1816_CR36","unstructured":"Schulman J, Wolski F, Dhariwal P, Radford A, Klimov O (2017) Proximal policy optimization algorithms. arXiv preprint arXiv:1707.06347"},{"key":"1816_CR37","unstructured":"Schulman J (2015) Trust Region Policy Optimization. arXiv preprint arXiv:1502.05477"}],"container-title":["Complex &amp; Intelligent Systems"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s40747-025-01816-w.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s40747-025-01816-w\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s40747-025-01816-w.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,3,4]],"date-time":"2025-03-04T12:08:23Z","timestamp":1741090103000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s40747-025-01816-w"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,2,19]]},"references-count":37,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2025,3]]}},"alternative-id":["1816"],"URL":"https:\/\/doi.org\/10.1007\/s40747-025-01816-w","relation":{},"ISSN":["2199-4536","2198-6053"],"issn-type":[{"value":"2199-4536","type":"print"},{"value":"2198-6053","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,2,19]]},"assertion":[{"value":"12 October 2024","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"1 February 2025","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"19 February 2025","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Conflict of interest"}}],"article-number":"175"}}