{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,11,17]],"date-time":"2025-11-17T03:04:58Z","timestamp":1763348698217,"version":"build-2065373602"},"reference-count":72,"publisher":"MDPI AG","issue":"9","license":[{"start":{"date-parts":[[2024,9,18]],"date-time":"2024-09-18T00:00:00Z","timestamp":1726617600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"Natural Science Research Project of Guizhou Provincial Education Department (Youth Science and Technology Talent Development Project)","award":["Qianjiaoji [2024]42"],"award-info":[{"award-number":["Qianjiaoji [2024]42"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Symmetry"],"abstract":"<jats:p>As a complex nonlinear system, the inverted pendulum (IP) system has the characteristics of asymmetry and instability. In this paper, the IP system is controlled by a learned deep neural network (DNN) that directly maps the system states to control commands in an end-to-end style. On the basis of deep reinforcement learning (DRL), the detail reward function (DRF) is designed to guide the DNN learning control strategy, which greatly enhances the pertinence and flexibility of the control. Moreover, a two-phase learning protocol (offline learning phase and online learning phase) is proposed to solve the \u201creal gap\u201d problem of the IP system. Firstly, the DNN learns the offline control strategy based on a simplified IP dynamic model and DRF. Then, a security controller is designed and used on the IP platform to optimize the DNN online. The experimental results demonstrate that the DNN has good robustness to model errors after secondary learning on the platform. When the length of the pendulum is reduced by 25% or increased by 25%, the steady-state error of the pendulum angle is less than 0.05 rad. The error is within the allowable range. The DNN is robust to changes in the length of the pendulum. The DRF and the two-phase learning protocol improve the adaptability of the controller to the complex and variable characteristics of the real platform and provide reference for other learning-based robot control problems.<\/jats:p>","DOI":"10.3390\/sym16091227","type":"journal-article","created":{"date-parts":[[2024,9,18]],"date-time":"2024-09-18T09:43:13Z","timestamp":1726652593000},"page":"1227","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":3,"title":["Balance Controller Design for Inverted Pendulum Considering Detail Reward Function and Two-Phase Learning Protocol"],"prefix":"10.3390","volume":"16","author":[{"given":"Xiaochen","family":"Liu","sequence":"first","affiliation":[{"name":"School of Mechanical and Electrical Engineering, Guizhou Normal University, Guiyang 550025, China"}]},{"given":"Sipeng","family":"Wang","sequence":"additional","affiliation":[{"name":"School of Mechanical and Electrical Engineering, Guizhou Normal University, Guiyang 550025, China"}]},{"given":"Xingxing","family":"Li","sequence":"additional","affiliation":[{"name":"School of Mechanical and Electrical Engineering, Guizhou Normal University, Guiyang 550025, China"}]},{"given":"Ze","family":"Cui","sequence":"additional","affiliation":[{"name":"School of Control Science and Engineering, Dalian University of Technology, Dalian 116024, China"}]}],"member":"1968","published-online":{"date-parts":[[2024,9,18]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"367","DOI":"10.1007\/s11071-005-7290-y","article-title":"Lyapunov-based controller for the inverted pendulum cart system","volume":"40","author":"Ibanez","year":"2005","journal-title":"Nonlinear Dyn."},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"29","DOI":"10.1017\/S0263574719000456","article-title":"Nonlinear optimal control for the wheeled inverted pendulum system","volume":"38","author":"Rigatos","year":"2020","journal-title":"Robotica"},{"key":"ref_3","doi-asserted-by":"crossref","unstructured":"Balcazar, R., Rubio, J.J., Orozco, E., and Garcia, E. (2022). The regulation of an electric oven and an inverted pendulum. Symmetry, 14.","DOI":"10.3390\/sym14040759"},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"2428","DOI":"10.1017\/S0263574721001727","article-title":"Stabilization and tracking control of an xz type inverted pendulum system using Lightning Search Algorithm tuned nonlinear PID controller","volume":"40","author":"Marul","year":"2022","journal-title":"Robotica"},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"6496","DOI":"10.1109\/ACCESS.2019.2963399","article-title":"Fuzzy swing up control and optimal state feedback stabilization for self-erecting inverted pendulum","volume":"8","author":"Susanto","year":"2020","journal-title":"IEEE Access"},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"683","DOI":"10.1007\/s10846-020-01158-4","article-title":"Implementation of a perceptual controller for an inverted pendulum robot","volume":"99","author":"Johnson","year":"2020","journal-title":"J. Intell. Robot. Syst."},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"1672","DOI":"10.1007\/s40435-020-00753-5","article-title":"Design and control of real-time inverted pendulum system with force-voltage parameter correlation","volume":"9","author":"Shreedharan","year":"2021","journal-title":"Int. J. Dyn. Control"},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"38","DOI":"10.1016\/j.robot.2018.06.004","article-title":"Biped robot state estimation using compliant inverted pendulum model","volume":"108","author":"Bae","year":"2018","journal-title":"Robot. Auton. Syst."},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"4004","DOI":"10.1109\/LRA.2023.3279585","article-title":"Design and Implementation of a Two-Wheeled Inverted Pendulum Robot With a Sliding Mechanism for Off-Road Transportation","volume":"8","author":"Lee","year":"2023","journal-title":"IEEE Robot. Autom. Lett."},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"7667","DOI":"10.1109\/LRA.2021.3100269","article-title":"Learning-based balance control of wheel-legged robots","volume":"6","author":"Cui","year":"2021","journal-title":"IEEE Robot. Autom. Lett."},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"Choi, S.Y., Le, T.P., Nguyen, Q.D., Layek, M.A., Lee, S.G., and Chung, T.C. (2019). Toward self-driving bicycles using state-of-the-art deep reinforcement learning algorithms. Symmetry, 11.","DOI":"10.3390\/sym11020290"},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"1357314","DOI":"10.1080\/23311916.2017.1357314","article-title":"Stabilization of nonlinear inverted pendulum system using MOGA and APSO tuned nonlinear PID controller","volume":"4","author":"Valluru","year":"2017","journal-title":"Cogent Eng."},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"440","DOI":"10.1016\/j.simpat.2010.08.003","article-title":"Simulation studies of inverted pendulum based on PID controllers","volume":"19","author":"Wang","year":"2011","journal-title":"Simul. Model. Pract. Theory"},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"1185","DOI":"10.1007\/s11071-014-1735-0","article-title":"Nonlinear control of triple inverted pendulum based on GA\u2013PIDNN","volume":"79","author":"Zhang","year":"2015","journal-title":"Nonlinear Dyn."},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"2589","DOI":"10.1007\/s13369-020-05161-7","article-title":"Real-time stabilization control of a rotary inverted pendulum using LQR-based sliding mode controller","volume":"46","author":"Chawla","year":"2021","journal-title":"Arab. J. Sci. Eng."},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"3684","DOI":"10.21595\/jve.2016.16787","article-title":"Tuning of LQR controller for an experimental inverted pendulum system based on The Bees Algorithm","volume":"18","author":"Bilgic","year":"2016","journal-title":"J. Vibroeng."},{"key":"ref_17","first-page":"168","article-title":"Particle swarm optimization based lqr control of an inverted pendulum","volume":"2","year":"2017","journal-title":"Eng. Technol. J."},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"130","DOI":"10.14513\/actatechjaur.v12.n2.499","article-title":"State space based linear controller design for the inverted pendulum","volume":"12","author":"Kuczmann","year":"2019","journal-title":"Acta Tech. Jaurinensis"},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"203","DOI":"10.14419\/ijet.v7i4.44.26985","article-title":"State-feedback control with a full-state estimator for a cart-inverted pendulum system","volume":"7","author":"Siradjuddin","year":"2018","journal-title":"Int. J. Eng. Technol."},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"Saleem, O., Abbas, F., and Iqbal, J. (2023). Complex fractional-order LQIR for inverted-pendulum-type robotic mechanisms: Design and experimental validation. Mathematics, 11.","DOI":"10.3390\/math11040913"},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"93185","DOI":"10.1109\/ACCESS.2024.3415494","article-title":"Phase-Based Adaptive Fractional LQR for Inverted-Pendulum-Type Robots: Formulation and Verification","volume":"12","author":"Saleem","year":"2024","journal-title":"IEEE Access"},{"key":"ref_22","first-page":"753","article-title":"Advanced sliding mode control techniques for Inverted Pendulum: Modelling and simulation","volume":"21","author":"Irfan","year":"2018","journal-title":"Eng. Sci. Technol. Int. J."},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"1645","DOI":"10.1177\/1077546315598031","article-title":"Model free sliding mode stabilizing control of a real rotary inverted pendulum","volume":"23","year":"2017","journal-title":"J. Vib. Control"},{"key":"ref_24","doi-asserted-by":"crossref","first-page":"14","DOI":"10.1016\/j.ins.2019.05.004","article-title":"Hierarchical sliding-mode control of spatial inverted pendulum with heterogeneous comprehensive learning particle swarm optimization","volume":"495","author":"Wang","year":"2019","journal-title":"Inf. Sci."},{"key":"ref_25","doi-asserted-by":"crossref","first-page":"353","DOI":"10.1007\/s40747-019-0097-0","article-title":"Incremental SMC-based CNF control strategy considering magnetic ball suspension and inverted pendulum systems through cuckoo search-genetic optimization algorithm","volume":"5","author":"Mazinan","year":"2019","journal-title":"Complex Intell. Syst."},{"key":"ref_26","doi-asserted-by":"crossref","first-page":"628","DOI":"10.1109\/JAS.2017.7510613","article-title":"Robust control design of wheeled inverted pendulum assistant robot","volume":"4","author":"Mahmoud","year":"2017","journal-title":"IEEE\/CAA J. Autom. Sin."},{"key":"ref_27","doi-asserted-by":"crossref","first-page":"14667","DOI":"10.1007\/s00521-020-04821-x","article-title":"Robust control based on adaptive neural network for Rotary inverted pendulum with oscillation compensation","volume":"32","author":"Zabihifar","year":"2020","journal-title":"Neural Comput. Appl."},{"key":"ref_28","doi-asserted-by":"crossref","first-page":"776","DOI":"10.1016\/j.ifacol.2017.08.252","article-title":"Model predictive control for an inverted-pendulum robot with time-varying constraints","volume":"50","author":"Ohhira","year":"2017","journal-title":"IFAC-PapersOnLine"},{"key":"ref_29","doi-asserted-by":"crossref","first-page":"374","DOI":"10.1080\/01691864.2016.1141115","article-title":"Following control approach based on model predictive control for wheeled inverted pendulum robot","volume":"30","author":"Hirose","year":"2016","journal-title":"Adv. Robot."},{"key":"ref_30","doi-asserted-by":"crossref","first-page":"236","DOI":"10.1016\/j.automatica.2018.04.025","article-title":"Event-triggered fuzzy control of nonlinear systems with its application to inverted pendulum systems","volume":"94","author":"Su","year":"2018","journal-title":"Automatica"},{"key":"ref_31","doi-asserted-by":"crossref","first-page":"26083","DOI":"10.1109\/ACCESS.2021.3057658","article-title":"Design of a decoupling fuzzy control scheme for omnidirectional inverted pendulum real-world control","volume":"9","author":"Chiu","year":"2021","journal-title":"IEEE Access"},{"key":"ref_32","first-page":"164","article-title":"Design and implementation of adaptive control logic for cart-inverted pendulum system","volume":"233","author":"Hanwate","year":"2019","journal-title":"Proc. Inst. Mech. Eng. Part I: J. Syst. Control Eng."},{"key":"ref_33","doi-asserted-by":"crossref","first-page":"111257","DOI":"10.1016\/j.chaos.2021.111257","article-title":"Fuzzy logic and gradient descent-based optimal adaptive robust controller with inverted pendulum verification","volume":"151","author":"Lakmesari","year":"2021","journal-title":"Chaos Solitons Fractals"},{"key":"ref_34","doi-asserted-by":"crossref","first-page":"888","DOI":"10.1109\/TAC.2020.2987313","article-title":"Adaptive optimal control of linear periodic systems: An off-policy value iteration approach","volume":"66","author":"Pang","year":"2020","journal-title":"IEEE Trans. Autom. Control"},{"key":"ref_35","doi-asserted-by":"crossref","first-page":"1139","DOI":"10.1007\/s12555-019-0912-9","article-title":"Adaptive reinforcement learning strategy with sliding mode control for unknown and disturbed wheeled inverted pendulum","volume":"19","author":"Dao","year":"2021","journal-title":"Int. J. Control Autom. Syst."},{"key":"ref_36","doi-asserted-by":"crossref","first-page":"3537","DOI":"10.1007\/s11276-019-02225-x","article-title":"Design and application of adaptive PID controller based on asynchronous advantage actor\u2013critic learning method","volume":"27","author":"Sun","year":"2021","journal-title":"Wirel. Netw."},{"key":"ref_37","doi-asserted-by":"crossref","unstructured":"Ma, Y., Xu, D., Huang, J., and Li, Y. (2023). Robust Control of An Inverted Pendulum System Based on Policy Iteration in Reinforcement Learning. Appl. Sci., 13.","DOI":"10.3390\/app132413181"},{"key":"ref_38","doi-asserted-by":"crossref","first-page":"107518","DOI":"10.1016\/j.engappai.2023.107518","article-title":"Reinforcement learning to achieve real-time control of triple inverted pendulum","volume":"128","author":"Baek","year":"2024","journal-title":"Eng. Appl. Artif. Intell."},{"key":"ref_39","doi-asserted-by":"crossref","first-page":"3093","DOI":"10.1007\/s12555-019-0278-z","article-title":"Balance control for the first-order inverted pendulum based on the advantage actor-critic algorithm","volume":"18","author":"Zheng","year":"2020","journal-title":"Int. J. Control Autom. Syst."},{"key":"ref_40","unstructured":"Wang, J.X., Kurth-Nelson, Z., Tirumala, D., Soyer, H., and Munos, R. (2016). Learning to reinforcement learn. arXiv."},{"key":"ref_41","unstructured":"Bellemare, M.G., Dabney, W., and Munos, R. (2017, January 6\u201311). A distributional perspective on reinforcement learning. Proceedings of the ICML'17: 34th International Conference on Machine Learning, Sydney, NSW, Australia."},{"key":"ref_42","unstructured":"Jiang, G., Wu, C.P., and Cybenko, G. (1998, January 18). Minimax-based reinforcement learning with state aggregation. Proceedings of the 37th IEEE Conference on Decision and Control (Cat. No.98CH36171), Tampa, FL, USA."},{"key":"ref_43","doi-asserted-by":"crossref","first-page":"201","DOI":"10.1109\/TSMCC.2011.2106494","article-title":"Experience replay for real-time reinforcement learning control","volume":"42","author":"Adam","year":"2011","journal-title":"IEEE Trans. Syst. Man Cybern. Part C (Appl. Rev.)"},{"key":"ref_44","doi-asserted-by":"crossref","first-page":"698","DOI":"10.1177\/0278364920987859","article-title":"How to train your robot with deep reinforcement learning: Lessons we have learned","volume":"40","author":"Ibarz","year":"2021","journal-title":"Int. J. Robot. Res."},{"key":"ref_45","doi-asserted-by":"crossref","unstructured":"Manrique Escobar, C.A., Pappalardo, C.M., and Guida, D. (2020). A parametric study of a deep reinforcement learning control system applied to the swing-up problem of the cart-pole. Appl. Sci., 10.","DOI":"10.3390\/app10249013"},{"key":"ref_46","doi-asserted-by":"crossref","unstructured":"Hu, H., Chen, Y., Wang, T., Feng, F., and Chen, W.J. (2023). Research on the deep deterministic policy algorithm based on the first-order inverted pendulum. Appl. Sci., 13.","DOI":"10.3390\/app13137594"},{"key":"ref_47","doi-asserted-by":"crossref","first-page":"120118","DOI":"10.1016\/j.energy.2021.120118","article-title":"A novel energy management strategy of hybrid electric vehicle via an improved TD3 deep reinforcement learning","volume":"224","author":"Zhou","year":"2021","journal-title":"Energy"},{"key":"ref_48","doi-asserted-by":"crossref","unstructured":"Zhan, G., Zhang, X., Li, Z., Xu, L., and Zhou, D. (2022). Multiple-uav reinforcement learning algorithm based on improved ppo in ray framework. Drones, 6.","DOI":"10.3390\/drones6070166"},{"key":"ref_49","doi-asserted-by":"crossref","first-page":"6658724","DOI":"10.1155\/2021\/6658724","article-title":"Averaged Soft Actor-Critic for Deep Reinforcement Learning","volume":"2021","author":"Ding","year":"2021","journal-title":"Complexity"},{"key":"ref_50","doi-asserted-by":"crossref","first-page":"36682","DOI":"10.1109\/ACCESS.2019.2905621","article-title":"Imitation reinforcement learning-based remote rotary inverted pendulum control in openflow network","volume":"7","author":"Kim","year":"2019","journal-title":"IEEE Access"},{"key":"ref_51","doi-asserted-by":"crossref","first-page":"417","DOI":"10.1177\/00202940211000380","article-title":"A real-time HIL control system on rotary inverted pendulum hardware platform based on double deep Q-network","volume":"54","author":"Dai","year":"2021","journal-title":"Meas. Control"},{"key":"ref_52","doi-asserted-by":"crossref","unstructured":"Malviya, S., Kumar, P., Namasudra, S., and Tiwary, U.S. (2022). Experience replay-based deep reinforcement learning for dialogue management optimisation. Trans. Asian Low-Resour. Lang. Inf. Process.","DOI":"10.1145\/3539223"},{"key":"ref_53","doi-asserted-by":"crossref","unstructured":"\u00d6zalp, R., Varol, N.K., Ta\u015fci, B., and Ucar, A. (2020). A review of deep reinforcement learning algorithms and comparative results on inverted pendulum system. Machine Learning Paradigms: Advances in Deep Learning-Based Technological Applications, Springer.","DOI":"10.1007\/978-3-030-49724-8_10"},{"key":"ref_54","doi-asserted-by":"crossref","unstructured":"Israilov, S., Fu, L., S\u00e1nchez-Rodr\u00edguez, J., Fusco, F., Allbert, G., and Raufaste, C. (2023). Reinforcement learning approach to control an inverted pendulum: A general framework for educational purposes. PLoS ONE, 18.","DOI":"10.1371\/journal.pone.0280071"},{"key":"ref_55","doi-asserted-by":"crossref","first-page":"335","DOI":"10.1108\/IR-11-2019-0240","article-title":"Deep reinforcement learning-based attitude motion control for humanoid robots with stability constraints","volume":"47","author":"Shi","year":"2020","journal-title":"Ind. Robot Int. J. Robot. Res. Appl."},{"key":"ref_56","doi-asserted-by":"crossref","first-page":"3713","DOI":"10.1109\/TSMC.2018.2884725","article-title":"Deterministic policy gradient with integral compensator for robust quadrotor control","volume":"50","author":"Wang","year":"2019","journal-title":"IEEE Trans. Syst. Man Cybern. Syst."},{"key":"ref_57","unstructured":"Pham, H.X., La, H.M., and Feil-Seifer, D. (2018). Autonomous uav navigation using reinforcement learning. arXiv."},{"key":"ref_58","doi-asserted-by":"crossref","first-page":"834","DOI":"10.1109\/TSMC.1983.6313077","article-title":"Neuronlike adaptive elements that can solve difficult learning control problems","volume":"SMC-13","author":"Barto","year":"1983","journal-title":"IEEE Trans. Syst. Man Cybern."},{"key":"ref_59","doi-asserted-by":"crossref","first-page":"661","DOI":"10.1007\/s11633-014-0818-1","article-title":"Optimal control of nonlinear inverted pendulum system using PID controller and LQR: Performance analysis without and with disturbance input","volume":"11","author":"Prasad","year":"2014","journal-title":"Int. J. Autom. Comput."},{"key":"ref_60","doi-asserted-by":"crossref","first-page":"49089","DOI":"10.1109\/ACCESS.2018.2854283","article-title":"A deep hierarchical reinforcement learning algorithm in partially observable Markov decision processes","volume":"6","author":"Le","year":"2018","journal-title":"IEEE Access"},{"key":"ref_61","doi-asserted-by":"crossref","first-page":"181","DOI":"10.1016\/S0004-3702(99)00052-1","article-title":"Between MDPs and semi-MDPs: A framework for temporal abstraction in reinforcement learning","volume":"112","author":"Sutton","year":"1999","journal-title":"Artif. Intell."},{"key":"ref_62","unstructured":"Wei, C.Y., Jahromi, M.J., Luo, H., Sharma, H., and Jain, R. (2020, January 13\u201318). Model-free reinforcement learning in infinite-horizon average-reward markov decision processes. Proceedings of the ICML'20: 37th International Conference on Machine Learning, Virtual."},{"key":"ref_63","doi-asserted-by":"crossref","first-page":"162","DOI":"10.1016\/j.ins.2021.11.051","article-title":"Policy iteration reinforcement learning-based control using a grey wolf optimizer algorithm","volume":"585","author":"Zamfirache","year":"2022","journal-title":"Inf. Sci."},{"key":"ref_64","unstructured":"Lillicrap, T.P., Hunt, J.J., Pritzel, A., Tassa, Y., and Wierstra, D. (2015). Continuous control with deep reinforcement learning. arXiv."},{"key":"ref_65","doi-asserted-by":"crossref","unstructured":"Wu, X., Liu, S., Zhang, T., Yang, L., and Wang, T. (2018, January 16). Motion control for biped robot via DDPG-based deep reinforcement learning. Proceedings of the 2018 WRC Symposium on Advanced Robotics and Automation (WRC SARA), Beijing, China.","DOI":"10.1109\/WRC-SARA.2018.8584227"},{"key":"ref_66","doi-asserted-by":"crossref","first-page":"7194","DOI":"10.3390\/app12147194","article-title":"Position control of a mobile robot through deep reinforcement learning","volume":"12","author":"Quiroga","year":"2022","journal-title":"Appl. Sci."},{"key":"ref_67","doi-asserted-by":"crossref","first-page":"351","DOI":"10.1007\/s10846-018-0891-8","article-title":"A deep reinforcement learning strategy for UAV autonomous landing on a moving platform","volume":"93","author":"Sampedro","year":"2019","journal-title":"J. Intell. Robot. Syst."},{"key":"ref_68","doi-asserted-by":"crossref","first-page":"9258","DOI":"10.3934\/mbe.2022430","article-title":"An approach to solving optimal control problems of nonlinear systems by introducing detail-reward mechanism in deep reinforcement learning","volume":"19","author":"Yao","year":"2022","journal-title":"Math. Biosci. Eng."},{"key":"ref_69","doi-asserted-by":"crossref","unstructured":"Yao, S., Liu, X., Zhang, Y., and Cui, Z. (2022). Research on solving nonlinear problem of ball and beam system by introducing detail-reward function. Symmetry, 14.","DOI":"10.3390\/sym14091883"},{"key":"ref_70","first-page":"15312","article-title":"Information theoretic regret bounds for online nonlinear control","volume":"33","author":"Kakade","year":"2020","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"ref_71","doi-asserted-by":"crossref","first-page":"234","DOI":"10.1016\/j.neucom.2021.10.064","article-title":"Tednet: A pytorch toolkit for tensor decomposition networks","volume":"469","author":"Pan","year":"2022","journal-title":"Neurocomputing"},{"key":"ref_72","doi-asserted-by":"crossref","unstructured":"Rahman, M.D.M., Rashid, S.M.H., and Hossain, M.M. (2018). Implementation of Q learning and deep Q network for controlling a self balancing robot model. Robot. Biomim., 5.","DOI":"10.1186\/s40638-018-0091-9"}],"container-title":["Symmetry"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2073-8994\/16\/9\/1227\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T15:58:49Z","timestamp":1760111929000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2073-8994\/16\/9\/1227"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,9,18]]},"references-count":72,"journal-issue":{"issue":"9","published-online":{"date-parts":[[2024,9]]}},"alternative-id":["sym16091227"],"URL":"https:\/\/doi.org\/10.3390\/sym16091227","relation":{},"ISSN":["2073-8994"],"issn-type":[{"type":"electronic","value":"2073-8994"}],"subject":[],"published":{"date-parts":[[2024,9,18]]}}}