{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,12]],"date-time":"2025-10-12T04:08:35Z","timestamp":1760242115621,"version":"build-2065373602"},"reference-count":33,"publisher":"MDPI AG","issue":"1","license":[{"start":{"date-parts":[[2018,12,29]],"date-time":"2018-12-29T00:00:00Z","timestamp":1546041600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["61403406"],"award-info":[{"award-number":["61403406"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Symmetry"],"abstract":"<jats:p>Transfer Learning (TL) has received a great deal of attention because of its ability to speed up Reinforcement Learning (RL) by reusing learned knowledge from other tasks. This paper proposes a new transfer learning framework, referred to as Transfer Learning via Artificial Neural Network Approximator (TL-ANNA). It builds an Artificial Neural Network (ANN) transfer approximator to transfer the related knowledge from the source task into the target task and reuses the transferred knowledge with a Probabilistic Policy Reuse (PPR) scheme. Specifically, the transfer approximator maps the state of the target task symmetrically to states of the source task with a certain mapping rule, and activates the related knowledge (components of the action-value function) of the source task as the input of the ANNs; it then predicts the quality of the actions in the target task with the ANNs. The target learner uses the PPR scheme to bias the RL with the suggested action from the transfer approximator. In this way, the transfer approximator builds a symmetric knowledge path between the target task and the source task. In addition, two mapping rules for the transfer approximator are designed, namely, Full Mapping Rule and Group Mapping Rule. Experiments performed on the RoboCup soccer Keepaway task verified that the proposed transfer learning methods outperform two other transfer learning methods in both jumpstart and time to threshold metrics and are more robust to the quality of source knowledge. In addition, the TL-ANNA with the group mapping rule exhibits slightly worse performance than the one with the full mapping rule, but with less computation and space cost when appropriate grouping method is used.<\/jats:p>","DOI":"10.3390\/sym11010025","type":"journal-article","created":{"date-parts":[[2018,12,31]],"date-time":"2018-12-31T07:22:30Z","timestamp":1546240950000},"page":"25","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":3,"title":["Reusing Source Task Knowledge via Transfer Approximator in Reinforcement Transfer Learning"],"prefix":"10.3390","volume":"11","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-3126-8187","authenticated-orcid":false,"given":"Qiao","family":"Cheng","sequence":"first","affiliation":[{"name":"College of Intelligence Science and Technology, National University of Defense Technology, Changsha 410073, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Xiangke","family":"Wang","sequence":"additional","affiliation":[{"name":"College of Intelligence Science and Technology, National University of Defense Technology, Changsha 410073, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Yifeng","family":"Niu","sequence":"additional","affiliation":[{"name":"College of Intelligence Science and Technology, National University of Defense Technology, Changsha 410073, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Lincheng","family":"Shen","sequence":"additional","affiliation":[{"name":"College of Intelligence Science and Technology, National University of Defense Technology, Changsha 410073, China"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1968","published-online":{"date-parts":[[2018,12,29]]},"reference":[{"key":"ref_1","first-page":"2125","article-title":"Transfer learning via inter-task mappings for temporal difference learning","volume":"8","author":"Taylor","year":"2007","journal-title":"J. Mach. Learn. Res."},{"key":"ref_2","doi-asserted-by":"crossref","unstructured":"Fern\u00e1ndez, F., and Veloso, M. (2006, January 8\u201312). Probabilistic Policy Reuse in a Reinforcement Learning Agent. Proceedings of the Fifth International Joint Conference on Autonomous Agents and Multiagent Systems, Hakodate, Japan.","DOI":"10.1145\/1160633.1160762"},{"key":"ref_3","unstructured":"Taylor, M.E., Suay, H.B., and Chernova, S. (2011, January 2\u20136). Integrating reinforcement learning with human demonstrations of varying ability. Proceedings of the 10th International Conference on Autonomous Agents and Multiagent Systems, Taipei, Taiwan."},{"key":"ref_4","doi-asserted-by":"crossref","unstructured":"Cheng, Q., Wang, X., and Shen, L. (2017, January 26\u201328). Transfer learning via linear multi-variable mapping under reinforcement learning framework. Proceedings of the 2017 36th Chinese Control Conference (CCC), Dalian, China.","DOI":"10.23919\/ChiCC.2017.8028754"},{"key":"ref_5","unstructured":"Taylor, M.E., Kuhlmann, G., and Stone, P. (2008, January 12\u201316). Autonomous transfer for reinforcement learning. Proceedings of the Seventh International Joint Conference on Autonomous Agents and Multiagent Systems, Estoril, Portugal."},{"key":"ref_6","unstructured":"da Silva, F.L., and Costa, A.H.R. (2017, January 8\u201312). Towards Zero-Shot Autonomous Inter-Task Mapping through Object-Oriented Task Description. Proceedings of the Transfer in Reinforcement Learning Workshop (TiRL) in AAMAS 2017, S\u00e3o Paulo, Brazil."},{"key":"ref_7","doi-asserted-by":"crossref","unstructured":"Cheng, Q., Wang, X., and Shen, L. (2017, January 5\u20138). An Autonomous Inter-task Mapping Learning Method via Artificial Neural Network for Transfer Learning. Proceedings of the 2017 IEEE International Conference on Robotics and Biomimetics, Macao, China.","DOI":"10.1109\/ROBIO.2017.8324510"},{"key":"ref_8","unstructured":"Wang, Z., and Taylor, M.E. (2016, January 21\u201323). Effective Transfer via Demonstrations in Reinforcement Learning: A Preliminary Study. Proceedings of the 2016 AAAI Spring Symposium Series, Palo Alto, CA, USA."},{"key":"ref_9","unstructured":"Brys, T., Harutyunyan, A., Taylor, M.E., and Now\u00e9, A. (2015, January 4\u20138). Policy transfer using reward shaping. Proceedings of the 2015 International Conference on Autonomous Agents and Multiagent Systems, Istanbul, Turkey."},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"33275","DOI":"10.1109\/ACCESS.2018.2844882","article-title":"Context-Aware Indoor VLC\/RF Heterogeneous Network Selection: Reinforcement Learning with Knowledge Transfer","volume":"6","author":"Du","year":"2018","journal-title":"IEEE Access"},{"key":"ref_11","unstructured":"Zhan, Y., and Taylor, M.E. (arXiv, 2015). Online transfer learning in reinforcement learning domains, arXiv."},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"77","DOI":"10.1109\/TCYB.2014.2319733","article-title":"Stochastic abstract policies: Generalizing knowledge to improve reinforcement learning","volume":"45","author":"Koga","year":"2015","journal-title":"IEEE Trans. Cybern."},{"key":"ref_13","unstructured":"Laflamme, S. (2017). Transfer in Reinforcement Learning: An Empirical Comparison of Methods in Mario AI. [Master\u2019s Thesis, McGill University Libraries]."},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Laroche, R., and Barlier, M. (2017, January 4\u20139). Transfer Reinforcement Learning with Shared Dynamics. Proceedings of the AAAI-17\u2014Thirty-First AAAI Conference on Artificial Intelligence, San Francisco, CA, USA.","DOI":"10.1609\/aaai.v31i1.10796"},{"key":"ref_15","unstructured":"Wang, Y., Meng, Q., Cheng, W., Liug, Y., Ma, Z.M., and Liu, T.Y. (arXiv, 2018). Target Transfer Q-Learning and Its Convergence Analysis, arXiv."},{"key":"ref_16","first-page":"163","article-title":"Transfer Learning Through Policy Abstraction Using Learning Vector Quantization","volume":"10","author":"Faudzi","year":"2018","journal-title":"J. Telecommun. Electron. Comput. Eng."},{"key":"ref_17","doi-asserted-by":"crossref","unstructured":"Matsubara, T., Norinaga, Y., Ozawa, Y., and Cui, Y. (2018, January 20\u201324). Policy Transfer from Simulations to Real World by Transfer Component Analysis. Proceedings of the 14th IEEE International Conference on Automation Science and Engineering, Munich, Germany.","DOI":"10.1109\/COASE.2018.8560543"},{"key":"ref_18","unstructured":"Schwab, D., Zhu, Y., and Veloso, M. (2018, January 10\u201315). Zero Shot Transfer Learning for Robot Soccer. Proceedings of the 17th International Conference on Autonomous Agents and MultiAgent Systems, Stockholm, Sweden."},{"key":"ref_19","unstructured":"Gupta, A., Devin, C., Liu, Y., Abbeel, P., and Levine, S. (arXiv, 2017). Learning invariant feature spaces to transfer skills with reinforcement learning, arXiv."},{"key":"ref_20","unstructured":"Lehnert, L., Tellex, S., and Littman, M.L. (arXiv, 2017). Advantages and Limitations of using Successor Features for Transfer in Reinforcement Learning, arXiv."},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"104","DOI":"10.1016\/j.patrec.2016.08.009","article-title":"Graph based skill acquisition and transfer learning for continuous reinforcement learning domains","volume":"87","author":"Shoeleh","year":"2017","journal-title":"Pattern Recognit. Lett."},{"key":"ref_22","doi-asserted-by":"crossref","unstructured":"Joshi, G., and Chowdhary, G. (arXiv, 2018). Cross-Domain Transfer in Reinforcement Learning using Target Apprentice, arXiv.","DOI":"10.1109\/ICRA.2018.8462977"},{"key":"ref_23","doi-asserted-by":"crossref","unstructured":"Kelly, S., and Heywood, M.I. (2015, January 11\u201315). Knowledge Transfer from Keepaway Soccer to Half-field Offense through Program Symbiosis: Building Simple Programs for a Complex Task. Proceedings of the 2015 Annual Conference on Genetic and Evolutionary Computation, Madrid, Spain.","DOI":"10.1145\/2739480.2754798"},{"key":"ref_24","unstructured":"Didi, S., and Nitschke, G. (April, January 30). Multi-agent behavior-based policy transfer. Proceedings of the 19th European Conference on the Applications of Evolutionary Computation, Porto, Portugal."},{"key":"ref_25","unstructured":"Rajendran, J., Prasanna, P., Ravindran, B., and Khapra, M. (arXiv, 2015). Adaapt: A deep architecture for adaptive policy transfer from multiple sources, arXiv."},{"key":"ref_26","doi-asserted-by":"crossref","unstructured":"Sutton, R.S., and Barto, A.G. (1998). Reinforcement Learning: An Introduction, MIT Press.","DOI":"10.1109\/TNN.1998.712192"},{"key":"ref_27","unstructured":"Albus, J.S. (1981). Brains, Behaviour, and Robotics, Byte Books."},{"key":"ref_28","doi-asserted-by":"crossref","first-page":"1703","DOI":"10.1109\/TPDS.2016.2626289","article-title":"Parallel deep neural network training for big data on blue gene\/q","volume":"28","author":"Chung","year":"2017","journal-title":"IEEE Trans. Parallel Distrib. Syst."},{"key":"ref_29","doi-asserted-by":"crossref","first-page":"16","DOI":"10.1016\/j.future.2018.04.050","article-title":"Multi-threaded learning control mechanism for neural networks","volume":"87","author":"Wei","year":"2018","journal-title":"Future Gener. Comput. Syst."},{"key":"ref_30","doi-asserted-by":"crossref","first-page":"866","DOI":"10.1016\/j.robot.2010.03.007","article-title":"Probabilistic policy reuse for inter-task transfer learning","volume":"58","author":"Veloso","year":"2010","journal-title":"Robot. Auton. Syst."},{"key":"ref_31","doi-asserted-by":"crossref","first-page":"8376","DOI":"10.1109\/ACCESS.2018.2808266","article-title":"An Adaptive Strategy Selection Method With Reinforcement Learning for Robotic Soccer Games","volume":"6","author":"Shi","year":"2018","journal-title":"IEEE Access"},{"key":"ref_32","unstructured":"Stone, P., Kuhlmann, G., Taylor, M.E., and Liu, Y. (2005). Keepaway soccer: From machine learning testbed to benchmark. Robot Soccer World Cup, Springer."},{"key":"ref_33","first-page":"1633","article-title":"Transfer learning for reinforcement learning domains: A survey","volume":"10","author":"Taylor","year":"2009","journal-title":"J. Mach. Learn. Res."}],"container-title":["Symmetry"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2073-8994\/11\/1\/25\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T15:36:35Z","timestamp":1760196995000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2073-8994\/11\/1\/25"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2018,12,29]]},"references-count":33,"journal-issue":{"issue":"1","published-online":{"date-parts":[[2019,1]]}},"alternative-id":["sym11010025"],"URL":"https:\/\/doi.org\/10.3390\/sym11010025","relation":{},"ISSN":["2073-8994"],"issn-type":[{"type":"electronic","value":"2073-8994"}],"subject":[],"published":{"date-parts":[[2018,12,29]]}}}