{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,6,9]],"date-time":"2025-06-09T07:03:55Z","timestamp":1749452635993},"reference-count":27,"publisher":"Walter de Gruyter GmbH","issue":"1","funder":[{"DOI":"10.13039\/501100002347","name":"Bundesministerium f\u00fcr Bildung und Forschung","doi-asserted-by":"publisher","award":["01IS19022"],"award-info":[{"award-number":["01IS19022"]}],"id":[{"id":"10.13039\/501100002347","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2022,1,27]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>Machine learning and particularly reinforcement learning methods may be applied to control tasks ranging from single control loops to the operation of whole production plants. However, their utilization in industrial contexts lacks understandability and requires suitable levels of operability and maintainability. In order to assess different application scenarios, a simple measure for their complexity is proposed and evaluated on four examples in a simulated palette transport system of a cold rolling mill. The measure is based on the size of controller input and output space determined by different granularity levels in a hierarchical process control model. 
The impact of these decomposition strategies on system characteristics, especially operability and maintainability, are discussed, assuming solvability and a suitable quality of the reinforcement learning solution is provided.<\/jats:p>","DOI":"10.1515\/auto-2021-0118","type":"journal-article","created":{"date-parts":[[2022,1,12]],"date-time":"2022-01-12T13:16:53Z","timestamp":1641993413000},"page":"53-66","source":"Crossref","is-referenced-by-count":1,"title":["Assessment of reinforcement learning applications for industrial control based on complexity measures"],"prefix":"10.1515","volume":"70","author":[{"given":"Julian","family":"Grothoff","sequence":"first","affiliation":[{"name":"Chair of Information and Automation Systems for Process and Material Technology , 9165 RWTH Aachen University , Turmstr. 46 , Aachen , Germany"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Nicolas","family":"Camargo Torres","sequence":"additional","affiliation":[{"name":"Chair of Information and Automation Systems for Process and Material Technology , 9165 RWTH Aachen University , Turmstr. 46 , Aachen , Germany"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Tobias","family":"Kleinert","sequence":"additional","affiliation":[{"name":"Chair of Information and Automation Systems for Process and Material Technology , 9165 RWTH Aachen University , Turmstr. 46 , Aachen , Germany"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"374","published-online":{"date-parts":[[2022,1,13]]},"reference":[{"key":"2023033111590331578_j_auto-2021-0118_ref_001","doi-asserted-by":"crossref","unstructured":"Barredo-Arrieta, A., I. La\u00f1a and J. Del Ser. 2019. What lies beneath: A note on the explainability of black-box machine learning models for road traffic forecasting. In: Intelligent Transportation Systems Conference (ITSC).","DOI":"10.1109\/ITSC.2019.8916985"},{"key":"2023033111590331578_j_auto-2021-0118_ref_002","unstructured":"Dann, C. and E. 
Brunskill. 2015. Sample Complexity of Episodic Fixed-Horizon Reinforcement Learning. arXiv preprint."},{"key":"2023033111590331578_j_auto-2021-0118_ref_003","doi-asserted-by":"crossref","unstructured":"Elfaham, H. and U. Epple. 2020. Meta Models for Intralogistics. at \u2013 Automatisierungstechnik 68(3): 208\u2013221.","DOI":"10.1515\/auto-2019-0083"},{"key":"2023033111590331578_j_auto-2021-0118_ref_004","unstructured":"Furuta, H., T. Matsushima, T. Kozuno, Y. Matsuo, S. Levine, O. Nachum and S.\u2009S. Gu. 2021. Policy Information Capacity: Information-Theoretic Measure for Task Complexity in Deep Reinforcement Learning. arXiv preprint."},{"key":"2023033111590331578_j_auto-2021-0118_ref_005","doi-asserted-by":"crossref","unstructured":"Gazzaneo, V., J.\u2009C. Carrasco, D.\u2009R. Vinson and F.\u2009V. Lima. 2019. Process Operability Algorithms: Past, Present, and Future Developments. Industrial & Engineering Chemistry Research 59(6): 2457\u20132470.","DOI":"10.1021\/acs.iecr.9b05181"},{"key":"2023033111590331578_j_auto-2021-0118_ref_006","doi-asserted-by":"crossref","unstructured":"Grothoff, J. and H. Elfahaam. 2020. Interoperabilit\u00e4t und Wandelbarkeit in Cyber-Physischen-Produktionssystemen durch modulare Prozessf\u00fchrungs-Komponenten. In: Handbuch Industrie 4.0, Springer Reference Technik.","DOI":"10.1007\/978-3-662-45537-1_144-1"},{"key":"2023033111590331578_j_auto-2021-0118_ref_007","doi-asserted-by":"crossref","unstructured":"Grothoff, J. and T. Kleinert. 2020. Mapping of Standardized State Machines to Utilize Machine Learning Models in Process Control Environments. In: Cybersecurity workshop by European Steel Technology Platform.","DOI":"10.1007\/978-3-030-69367-1_4"},{"key":"2023033111590331578_j_auto-2021-0118_ref_008","unstructured":"Grothoff, J., C. Wagner and U. Epple. 2018. BaSys 4.0: Metamodell der Komponenten und Ihres Aufbaus. 
Publikationsserver der RWTH Aachen University, Aachen."},{"key":"2023033111590331578_j_auto-2021-0118_ref_009","doi-asserted-by":"crossref","unstructured":"Guidotti, R., A. Monreale, S. Ruggieri, F. Turini, F. Giannotti and D. Pedreschi. 2019. A Survey of Methods for Explaining Black Box Models. ACM Comput. Surv. 51(5): 1\u201341.","DOI":"10.1145\/3236009"},{"key":"2023033111590331578_j_auto-2021-0118_ref_010","doi-asserted-by":"crossref","unstructured":"Heuillet, A., F. Couthouis and N. D\u00edaz-Rodr\u00edguez. 2021. Explainability in deep reinforcement learning. Knowledge-Based Systems 214: 106685.","DOI":"10.1016\/j.knosys.2020.106685"},{"key":"2023033111590331578_j_auto-2021-0118_ref_011","unstructured":"Islam, S.\u2009R., W. Eberle and S.\u2009K. Ghafoor. 2020. Towards quantification of explainability in explainable artificial intelligence methods. In The Thirty-Third International Flairs Conference."},{"key":"2023033111590331578_j_auto-2021-0118_ref_012","doi-asserted-by":"crossref","unstructured":"Kearns, M. and S. Singh. 2002. Near-Optimal Reinforcement Learning in Polynomial Time. Machine Learning 492: 209\u2013232.","DOI":"10.1023\/A:1017984413808"},{"key":"2023033111590331578_j_auto-2021-0118_ref_013","unstructured":"Koenig, S. and R.\u2009G. Simmons. 1993. Complexity Analysis of Real-Time Reinforcement Learning. In: AAAI, pp.\u200999\u2013107."},{"key":"2023033111590331578_j_auto-2021-0118_ref_014","unstructured":"Lattimore, T., M. Hutter and P. Sunehag. 2013. The Sample-Complexity of General Reinforcement Learning. In: International Conference on Machine Learning."},{"key":"2023033111590331578_j_auto-2021-0118_ref_015","doi-asserted-by":"crossref","unstructured":"Lunze, J. and B. Nixdorf. 2001. Representation of Hybrid Systems by Means of Stochastic Automata. Mathematical and Computer Modelling of Dynamical Systems 4(7): 383\u2013422.","DOI":"10.1076\/mcmd.7.4.383.3639"},{"key":"2023033111590331578_j_auto-2021-0118_ref_016","unstructured":"Lunze, J. 
and J. Raisch. 2002. Discrete Models for Hybrid Systems. In: Modelling, Analysis, and Design of Hybrid Systems. Lecture Notes in Control and Information Sciences."},{"key":"2023033111590331578_j_auto-2021-0118_ref_017","doi-asserted-by":"crossref","unstructured":"Lunze, J. and J. Schr\u00f6der. 2001. Computation of complete abstractions of quantised systems. In: European Control Conference.","DOI":"10.23919\/ECC.2001.7076414"},{"key":"2023033111590331578_j_auto-2021-0118_ref_018","doi-asserted-by":"crossref","unstructured":"Najafi, E., G.\u2009A. Lopes and R. Babu\u0161ka. 2013. Reinforcement learning for sequential composition control. In: IEEE 52nd Annual Conference on Decision and Control (CDC), Florence, Italy.","DOI":"10.1109\/CDC.2013.6761042"},{"key":"2023033111590331578_j_auto-2021-0118_ref_019","doi-asserted-by":"crossref","unstructured":"Quah, T., D. Machalek and K.\u2009M. Powell. 2020. Comparing Reinforcement Learning Methods for Real-Time Optimization of a Chemical Process. Processes 8: 1497.","DOI":"10.3390\/pr8111497"},{"key":"2023033111590331578_j_auto-2021-0118_ref_020","doi-asserted-by":"crossref","unstructured":"Schwung, D., J.\u2009N. Reimann, A. Schwung and S.\u2009X. Ding. 2018. Self Learning in Flexible Manufacturing Units: A Reinforcement Learning Approach. In: International Conference on Intelligent Systems (IS 2018), Madeira, Portugal.","DOI":"10.1109\/IS.2018.8710460"},{"key":"2023033111590331578_j_auto-2021-0118_ref_021","doi-asserted-by":"crossref","unstructured":"Spielberg, S., A. Tulsyan, N.\u2009P. Lawrence, P.\u2009D. Loewen and B. Gopaluni. 2019. Toward self-driving processes: A deep reinforcement learning approach to control. AIChE Journal 65: e16689.","DOI":"10.1002\/aic.16689"},{"key":"2023033111590331578_j_auto-2021-0118_ref_022","unstructured":"Szita, I. and S. Csaba. 2010. Model-Based Reinforcement Learning with Nearly Tight Exploration Complexity Bounds. 
In ICML."},{"key":"2023033111590331578_j_auto-2021-0118_ref_023","doi-asserted-by":"crossref","unstructured":"Terzimehic, T., M. Wenger, A. Zoitl, A. Bayha, K. Becker, T. M\u00fcller and H. Schauerte. 2017. Towards an industry 4.0 compliant control software architecture using IEC 61499 and OPC UA. In: 22nd IEEE International Conference on Emerging Technologies and Factory Automation (ETFA).","DOI":"10.1109\/ETFA.2017.8247718"},{"key":"2023033111590331578_j_auto-2021-0118_ref_024","doi-asserted-by":"crossref","unstructured":"Wagner, C., C. v. Trotha, F. Palm and U. Epple. 2017. Fundamentals for the next Generation of Automation Solutions of the Fourth Industrial Revolution. In: The 2017 Asian Control Conference \u2013 ASCC 2017, Gold Coast, Australia.","DOI":"10.1109\/ASCC.2017.8287596"},{"key":"2023033111590331578_j_auto-2021-0118_ref_025","doi-asserted-by":"crossref","unstructured":"Yamasaki, T. and T. Ushio. 2005. Decentralized Supervisory Control of Discrete Event Systems Based on Reinforcement Learning. IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences E88-A: 2982\u20132988.","DOI":"10.1093\/ietfec\/e88-a.11.2982"},{"key":"2023033111590331578_j_auto-2021-0118_ref_026","doi-asserted-by":"crossref","unstructured":"Zhao, W., J.\u2009P. Queralta and T. Westerlund. 2020. Sim-to-Real Transfer in Deep Reinforcement Learning for Robotics: a Survey. In: IEEE Symposium Series on Computational Intelligence (SSCI).","DOI":"10.1109\/SSCI47803.2020.9308468"},{"key":"2023033111590331578_j_auto-2021-0118_ref_027","doi-asserted-by":"crossref","unstructured":"Zhu, L., Y. Cui, G. Takami H. Kanokogi and T. Matsubara. 2020. Scalable reinforcement learning for plant-wide control of vinyl acetate monomer process. 
Control Engineering Practice 97: 104331.","DOI":"10.1016\/j.conengprac.2020.104331"}],"container-title":["at - Automatisierungstechnik"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.degruyter.com\/document\/doi\/10.1515\/auto-2021-0118\/xml","content-type":"application\/xml","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/www.degruyter.com\/document\/doi\/10.1515\/auto-2021-0118\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,3,31]],"date-time":"2023-03-31T16:24:42Z","timestamp":1680279882000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.degruyter.com\/document\/doi\/10.1515\/auto-2021-0118\/html"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,1,1]]},"references-count":27,"journal-issue":{"issue":"1","published-online":{"date-parts":[[2022,1,13]]},"published-print":{"date-parts":[[2022,1,27]]}},"alternative-id":["10.1515\/auto-2021-0118"],"URL":"https:\/\/doi.org\/10.1515\/auto-2021-0118","relation":{},"ISSN":["2196-677X","0178-2312"],"issn-type":[{"value":"2196-677X","type":"electronic"},{"value":"0178-2312","type":"print"}],"subject":[],"published":{"date-parts":[[2022,1,1]]}}}