{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,3]],"date-time":"2026-03-03T00:54:56Z","timestamp":1772499296461,"version":"3.50.1"},"reference-count":55,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2021,7,7]],"date-time":"2021-07-07T00:00:00Z","timestamp":1625616000000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2021,7,7]],"date-time":"2021-07-07T00:00:00Z","timestamp":1625616000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/501100001659","name":"Deutsche Forschungsgemeinschaft","doi-asserted-by":"publisher","award":["415804944"],"award-info":[{"award-number":["415804944"]}],"id":[{"id":"10.13039\/501100001659","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100001659","name":"Deutsche Forschungsgemeinschaft","doi-asserted-by":"publisher","award":["415804944"],"award-info":[{"award-number":["415804944"]}],"id":[{"id":"10.13039\/501100001659","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100001659","name":"Deutsche Forschungsgemeinschaft","doi-asserted-by":"publisher","award":["415804944"],"award-info":[{"award-number":["415804944"]}],"id":[{"id":"10.13039\/501100001659","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["J Intell Manuf"],"published-print":{"date-parts":[[2022,1]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>A major goal of materials design is to find material structures with desired properties and in a second step to find a processing path to reach one of these structures. In this paper, we propose and investigate a deep reinforcement learning approach for the optimization of processing paths. 
The goal is to find optimal processing paths in the material structure space that lead to target structures, which have been identified beforehand to result in desired material properties. There exists a target set containing one or more structures bearing the desired properties. Our proposed methods can find an optimal path from a start structure to a single target structure, or optimize the processing paths to one of the equivalent target structures in the set. In the latter case, the algorithm learns during processing to simultaneously identify the best reachable target structure and the optimal path to it. The proposed methods belong to the family of model-free deep reinforcement learning algorithms. They are guided by structure representations as features of the process state and by a reward signal, which is formulated based on a distance function in the structure space. Model-free reinforcement learning algorithms learn through trial and error while interacting with the process. They are therefore not restricted to information from a priori sampled processing data and are able to adapt to the specific process. The optimization itself is model-free and does not require any prior knowledge about the process itself. We instantiate and evaluate the proposed methods by optimizing paths of a generic metal forming process. 
We show the ability of both methods to find processing paths leading close to target structures and the ability of the extended method to identify target-structures that can be reached effectively and efficiently and to focus on these targets for sample efficient processing path optimization.<\/jats:p>","DOI":"10.1007\/s10845-021-01805-z","type":"journal-article","created":{"date-parts":[[2021,7,7]],"date-time":"2021-07-07T17:27:01Z","timestamp":1625678821000},"page":"333-352","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":20,"title":["Deep reinforcement learning methods for structure-guided processing path optimization"],"prefix":"10.1007","volume":"33","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-2752-0554","authenticated-orcid":false,"given":"Johannes","family":"Dornheim","sequence":"first","affiliation":[]},{"given":"Lukas","family":"Morand","sequence":"additional","affiliation":[]},{"given":"Samuel","family":"Zeitvogel","sequence":"additional","affiliation":[]},{"given":"Tarek","family":"Iraki","sequence":"additional","affiliation":[]},{"given":"Norbert","family":"Link","sequence":"additional","affiliation":[]},{"given":"Dirk","family":"Helm","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2021,7,7]]},"reference":[{"key":"1805_CR1","doi-asserted-by":"publisher","first-page":"4022","DOI":"10.2514\/1.J055247","volume":"1","author":"P Acar","year":"2016","unstructured":"Acar, P., & Sundararaghavan, V. (2016). Linear solution scheme for microstructure design with process constraints. AIAA Journal, 1, 4022\u20134031.","journal-title":"AIAA Journal"},{"issue":"12","key":"1805_CR2","doi-asserted-by":"publisher","first-page":"5041","DOI":"10.2514\/1.J057221","volume":"56","author":"P Acar","year":"2018","unstructured":"Acar, P., & Sundararaghavan, V. (2018). Reduced-order modeling approach for materials design with a sequence of processes. 
AIAA Journal, 56(12), 5041\u20135044.","journal-title":"AIAA Journal"},{"issue":"8","key":"1805_CR3","doi-asserted-by":"publisher","first-page":"1639","DOI":"10.1016\/S0022-5096(01)00016-3","volume":"49","author":"BL Adams","year":"2001","unstructured":"Adams, B. L., Henrie, A., Henrie, B., Lyon, M., Kalidindi, S., & Garmestani, H. (2001). Microstructure-sensitive design of a compliant beam. Journal of the Mechanics and Physics of Solids, 49(8), 1639\u20131663.","journal-title":"Journal of the Mechanics and Physics of Solids"},{"key":"1805_CR4","unstructured":"Andrychowicz, M., Wolski, F., Ray, A., Schneider, J., Fong, R., Welinder, P., McGrew, B., Tobin, J., Abbeel, O.P., & Zaremba, W. (2017). Hindsight experience replay. In Advances in Neural Information Processing Systems (pp. 5048\u20135058)."},{"issue":"6","key":"1805_CR5","doi-asserted-by":"publisher","first-page":"923","DOI":"10.1016\/0001-6160(85)90188-9","volume":"33","author":"RJ Asaro","year":"1985","unstructured":"Asaro, R. J., & Needleman, A. (1985). Overview No. 42 texture development and strain hardening in rate dependent polycrystals. Acta Metallurgica, 33(6), 923\u2013953. https:\/\/doi.org\/10.1016\/0001-6160(85)90188-9.","journal-title":"Acta Metallurgica"},{"key":"1805_CR6","doi-asserted-by":"crossref","unstructured":"Bachmann, F., Hielscher, R., & Schaeben, H. (2010). Texture analysis with mtex\u2013free and open source software toolbox. In Solid State Phenomena (Vol. 160, pp. 63\u201368). Trans Tech Publ (2010)","DOI":"10.4028\/www.scientific.net\/SSP.160.63"},{"issue":"6","key":"1805_CR7","doi-asserted-by":"publisher","first-page":"988","DOI":"10.1002\/srin.201300202","volume":"85","author":"M Baiker","year":"2014","unstructured":"Baiker, M., Helm, D., & Butz, A. (2014). Determination of mechanical properties of polycrystals by using crystal plasticity and numerical homogenization schemes. Steel Research International, 85(6), 988\u2013998. 
https:\/\/doi.org\/10.1002\/srin.201300202.","journal-title":"Steel Research International"},{"key":"1805_CR8","unstructured":"Brockman, G., Cheung, V., Pettersson, L., Schneider, J., Schulman, J., Tang, J., & Zaremba, W. (2016). Openai gym. arXiv preprint arXiv:1606.01540"},{"issue":"3","key":"1805_CR9","doi-asserted-by":"publisher","first-page":"191","DOI":"10.1016\/0036-9748(84)90506-4","volume":"18","author":"H Bunge","year":"1984","unstructured":"Bunge, H., & Esling, C. (1984). Texture development by plastic deformation. Scripta Metallurgica, 18(3), 191\u2013195.","journal-title":"Scripta Metallurgica"},{"key":"1805_CR10","unstructured":"Bunge, H. J. (2013). Texture analysis in materials science: mathematical methods. Elsevier."},{"key":"1805_CR11","first-page":"1","volume":"1","author":"J Dornheim","year":"2019","unstructured":"Dornheim, J., Link, N., & Gumbsch, P. (2019). Model-free adaptive optimal control of episodic fixed-horizon manufacturing processes using reinforcement learning. International Journal of Control Automation and Systems, 1, 1\u201312.","journal-title":"International Journal of Control Automation and Systems"},{"key":"1805_CR12","doi-asserted-by":"publisher","first-page":"37","DOI":"10.1016\/j.ijplas.2012.09.012","volume":"46","author":"P Eisenlohr","year":"2013","unstructured":"Eisenlohr, P., Diehl, M., Lebensohn, R. A., & Roters, F. (2013). A spectral method solution to crystal elasto-viscoplasticity at finite strains. International Journal of Plasticity, 46, 37\u201353. https:\/\/doi.org\/10.1016\/j.ijplas.2012.09.012.","journal-title":"International Journal of Plasticity"},{"key":"1805_CR13","first-page":"33","volume":"12","author":"HP Frederikse","year":"2008","unstructured":"Frederikse, H. P. (2008). Elastic constants of single crystals. 
Handbook of Chemistry and Physics, 12, 33\u201338.","journal-title":"Handbook of Chemistry and Physics"},{"issue":"6","key":"1805_CR14","doi-asserted-by":"publisher","first-page":"477","DOI":"10.1016\/j.pmatsci.2009.08.002","volume":"55","author":"DT Fullwood","year":"2010","unstructured":"Fullwood, D. T., Niezgoda, S. R., Adams, B. L., & Kalidindi, S. R. (2010). Microstructure sensitive design for performance optimization. Progress in Materials Science, 55(6), 477\u2013562.","journal-title":"Progress in Materials Science"},{"key":"1805_CR15","unstructured":"Grze\u015b, M. (2017). Reward shaping in episodic reinforcement learning. In Proceedings of the 16th Conference on Autonomous Agents and MultiAgent Systems, AAMAS \u2013 17 (pp. 565\u2013573). ACM."},{"key":"1805_CR16","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1016\/j.mechatronics.2015.09.004","volume":"34","author":"J G\u00fcnther","year":"2016","unstructured":"G\u00fcnther, J., Pilarski, P. M., Helfrich, G., Shen, H., & Diepold, K. (2016). Intelligent laser welding through representation, prediction, and control learning: An architecture with deep neural networks and reinforcement learning. Mechatronics, 34, 1\u201311.","journal-title":"Mechatronics"},{"key":"1805_CR17","unstructured":"Hoffmann, T. (2010). Identifikation und Validierung eines kristallplastischen Modells auf Makro- und Mikroebene. Ph.D. thesis, Fakult\u00e4t f\u00fcr Maschinenbau der Otto-von-Guericke-Universit\u00e4t Magdeburg."},{"issue":"2","key":"1805_CR18","doi-asserted-by":"publisher","first-page":"155","DOI":"10.1007\/s10851-009-0161-2","volume":"35","author":"DQ Huynh","year":"2009","unstructured":"Huynh, D. Q. (2009). Metrics for 3D rotations: Comparison and analysis. Journal of Mathematical Imaging and Vision, 35(2), 155\u2013164. 
https:\/\/doi.org\/10.1007\/s10851-009-0161-2.","journal-title":"Journal of Mathematical Imaging and Vision"},{"issue":"3","key":"1805_CR19","doi-asserted-by":"publisher","first-page":"537","DOI":"10.1016\/0022-5096(92)80003-9","volume":"40","author":"SR Kalidindi","year":"1992","unstructured":"Kalidindi, S. R., Bronkhorst, C. A., & Anand, L. (1992). Crystallographic texture evolution in bulk deformation processing of fcc metals. Journal of the Mechanics and Physics of Solids, 40(3), 537\u2013569. https:\/\/doi.org\/10.1016\/0022-5096(92)80003-9.","journal-title":"Journal of the Mechanics and Physics of Solids"},{"issue":"13","key":"1805_CR20","doi-asserted-by":"publisher","first-page":"3521","DOI":"10.1073\/pnas.1611835114","volume":"114","author":"J Kirkpatrick","year":"2017","unstructured":"Kirkpatrick, J., Pascanu, R., Rabinowitz, N., Veness, J., Desjardins, G., Rusu, A. A., et al. (2017). Overcoming catastrophic forgetting in neural networks. Proceedings of the national academy of sciences, 114(13), 3521\u20133526.","journal-title":"Proceedings of the national academy of sciences"},{"key":"1805_CR21","first-page":"1","volume":"1","author":"A Kuhnle","year":"2020","unstructured":"Kuhnle, A., Kaiser, J. P., Thei\u00df, F., Stricker, N., & Lanza, G. (2020). Designing an adaptive production control system using reinforcement learning. Journal of Intelligent Manufacturing, 1, 1\u201322.","journal-title":"Journal of Intelligent Manufacturing"},{"issue":"7","key":"1805_CR22","doi-asserted-by":"publisher","first-page":"1795","DOI":"10.1007\/s10845-020-01562-5","volume":"31","author":"A Kumar","year":"2020","unstructured":"Kumar, A., Dimitrakopoulos, R., & Maulen, M. (2020). Adaptive self-learning mechanisms for updating short-term production decisions in an industrial mining complex. 
Journal of Intelligent Manufacturing, 31(7), 1795\u20131811.","journal-title":"Journal of Intelligent Manufacturing"},{"issue":"2","key":"1805_CR23","doi-asserted-by":"publisher","first-page":"647","DOI":"10.1016\/j.actamat.2006.04.041","volume":"55","author":"D Li","year":"2007","unstructured":"Li, D., Garmestani, H., & Ahzi, S. (2007). Processing path optimization to achieve desired texture in polycrystalline materials. Acta Materialia, 55(2), 647\u2013654.","journal-title":"Acta Materialia"},{"key":"1805_CR24","unstructured":"Lillicrap, T. P., Hunt, J. J., Pritzel, A., Heess, N., Erez, T., Tassa, Y., Silver, D., Wierstra, D. (2015). Continuous control with deep reinforcement learning. arXiv preprint arXiv:1509.02971."},{"issue":"3\u20134","key":"1805_CR25","first-page":"293","volume":"8","author":"LJ Lin","year":"1992","unstructured":"Lin, L. J. (1992). Self-improving reactive agents based on reinforcement learning, planning and teaching. Machine Learning, 8(3\u20134), 293\u2013321.","journal-title":"Machine Learning"},{"issue":"4","key":"1805_CR26","doi-asserted-by":"publisher","first-page":"548","DOI":"10.1002\/nme.1289","volume":"63","author":"X Ling","year":"2005","unstructured":"Ling, X., Horstemeyer, M., & Potirniche, G. (2005). On the numerical implementation of 3d rate-dependent single crystal plasticity formulations. International Journal for Numerical Methods in Engineering, 63(4), 548\u2013568.","journal-title":"International Journal for Numerical Methods in Engineering"},{"issue":"1","key":"1805_CR27","doi-asserted-by":"publisher","first-page":"1","DOI":"10.9734\/JSRR\/2015\/14076","volume":"5","author":"R Liu","year":"2015","unstructured":"Liu, R., Kumar, A., Chen, Z., Agrawal, A., Sundararaghavan, V., & Choudhary, A. (2015). A predictive machine learning approach for microstructure optimization and materials design. 
Scientific Reports, 5(1), 1\u201312.","journal-title":"Scientific Reports"},{"key":"1805_CR28","doi-asserted-by":"publisher","first-page":"40","DOI":"10.1016\/j.jprocont.2018.11.004","volume":"75","author":"Y Ma","year":"2019","unstructured":"Ma, Y., Zhu, W., Benton, M. G., & Romagnoli, J. (2019). Continuous control of a polymerization system with deep reinforcement learning. Journal of Process Control, 75, 40\u201347.","journal-title":"Journal of Process Control"},{"key":"1805_CR29","doi-asserted-by":"publisher","first-page":"60","DOI":"10.1016\/j.neucom.2017.05.090","volume":"263","author":"P Mannion","year":"2017","unstructured":"Mannion, P., Devlin, S., Mason, K., Duggan, J., & Howley, E. (2017). Policy invariance under reward transformations for multi-objective reinforcement learning. Neurocomputing, 263, 60\u201373.","journal-title":"Neurocomputing"},{"issue":"7540","key":"1805_CR30","doi-asserted-by":"publisher","first-page":"529","DOI":"10.1038\/nature14236","volume":"518","author":"V Mnih","year":"2015","unstructured":"Mnih, V., Kavukcuoglu, K., Silver, D., Rusu, A. A., Veness, J., Bellemare, M. G., et al. (2015). Human-level control through deep reinforcement learning. Nature, 518(7540), 529\u2013533.","journal-title":"Nature"},{"key":"1805_CR31","first-page":"278","volume":"99","author":"AY Ng","year":"1999","unstructured":"Ng, A. Y., Harada, D., & Russell, S. (1999). Policy invariance under reward transformations: Theory and application to reward shaping. ICML, 99, 278\u2013287.","journal-title":"ICML"},{"issue":"5330","key":"1805_CR32","doi-asserted-by":"publisher","first-page":"1237","DOI":"10.1126\/science.277.5330.1237","volume":"277","author":"GB Olson","year":"1997","unstructured":"Olson, G. B. (1997). Computational design of hierarchically structured materials. Science, 277(5330), 1237\u20131242.","journal-title":"Science"},{"key":"1805_CR33","unstructured":"Pagenkopf, J. (2019). 
Bestimmung der Plastischen Anisotropie von Blechwerkstoffen durch ortsaufgel\u00f6ste Simulationen auf Gef\u00fcgeebene. Ph.D. thesis, Fakult\u00e4t f\u00fcr Maschinenbau des Karlsruher Instituts f\u00fcr Technologie (KIT)."},{"key":"1805_CR34","doi-asserted-by":"publisher","first-page":"672","DOI":"10.1016\/j.msea.2016.07.118","volume":"674","author":"J Pagenkopf","year":"2016","unstructured":"Pagenkopf, J., Butz, A., Wenk, M., & Helm, D. (2016). Virtual testing of dual-phase steels: Effect of martensite morphology on plastic flow behavior. Materials Science and Engineering: A, 674, 672\u2013686. https:\/\/doi.org\/10.1016\/j.msea.2016.07.118.","journal-title":"Materials Science and Engineering: A"},{"key":"1805_CR35","doi-asserted-by":"publisher","first-page":"334","DOI":"10.1016\/j.commatsci.2019.01.015","volume":"160","author":"A Paul","year":"2019","unstructured":"Paul, A., Acar, P., Liao, W. K., Choudhary, A., Sundararaghavan, V., & Agrawal, A. (2019). Microstructure optimization with constrained design objectives using machine learning-based feedback-aware data-generation. Computational Materials Science, 160, 334\u2013351.","journal-title":"Computational Materials Science"},{"issue":"17\u201320","key":"1805_CR36","doi-asserted-by":"publisher","first-page":"1729","DOI":"10.1016\/j.cma.2011.01.002","volume":"200","author":"R Quey","year":"2011","unstructured":"Quey, R., Dawson, P., & Barbe, F. (2011). Large-scale 3d random polycrystals for the finite element method: Generation, meshing and remeshing. Computer Methods in Applied Mechanics and Engineering, 200(17\u201320), 1729\u20131745.","journal-title":"Computer Methods in Applied Mechanics and Engineering"},{"issue":"4","key":"1805_CR37","doi-asserted-by":"publisher","first-page":"1162","DOI":"10.1107\/S1600576718009019","volume":"51","author":"R Quey","year":"2018","unstructured":"Quey, R., Villani, A., & Maurice, C. (2018). Nearly uniform sampling of crystal orientations. 
Journal of Applied Crystallography, 51(4), 1162\u20131173.","journal-title":"Journal of Applied Crystallography"},{"issue":"6","key":"1805_CR38","doi-asserted-by":"publisher","first-page":"433","DOI":"10.1016\/0022-5096(71)90010-X","volume":"19","author":"JR Rice","year":"1971","unstructured":"Rice, J. R. (1971). Inelastic constitutive relations for solids: An internal-variable theory and its application to metal plasticity. Journal of the Mechanics and Physics of Solids, 19(6), 433\u2013455. https:\/\/doi.org\/10.1016\/0022-5096(71)90010-X.","journal-title":"Journal of the Mechanics and Physics of Solids"},{"key":"1805_CR39","doi-asserted-by":"crossref","unstructured":"Riedmiller, M. (2005) Neural fitted q iteration\u2013first experiences with a data efficient neural reinforcement learning method. In European Conference on Machine Learning (pp. 317\u2013328). Springer.","DOI":"10.1007\/11564096_32"},{"issue":"4","key":"1805_CR40","doi-asserted-by":"publisher","first-page":"1152","DOI":"10.1016\/j.actamat.2009.10.058","volume":"58","author":"F Roters","year":"2010","unstructured":"Roters, F., Eisenlohr, P., Hantcherli, L., Tjahjanto, D. D., Bieler, T. R., & Raabe, D. (2010). Overview of constitutive laws, kinematics, homogenization and multiscale methods in crystal plasticity finite-element modeling: Theory, experiments, applications. Acta Materialia, 58(4), 1152\u20131211.","journal-title":"Acta Materialia"},{"key":"1805_CR41","unstructured":"Schaul, T., Horgan, D., Gregor, K. & Silver, D.: Universal value function approximators. In International conference on machine learning (pp. 1312\u20131320)."},{"key":"1805_CR42","unstructured":"Schaul, T., Quan, J., Antonoglou, I., & Silver, D. (2015). Prioritized experience replay. arXiv preprint arXiv:1511.05952."},{"issue":"8","key":"1805_CR43","doi-asserted-by":"publisher","first-page":"1183","DOI":"10.1016\/j.ijplas.2010.03.010","volume":"26","author":"JB Shaffer","year":"2010","unstructured":"Shaffer, J. 
B., Knezevic, M., & Kalidindi, S. R. (2010). Building texture evolution networks for deformation processing of polycrystalline fcc metals using spectral approaches: Applications to design for targeted performance. International Journal of Plasticity, 26(8), 1183\u20131194. https:\/\/doi.org\/10.1016\/j.ijplas.2010.03.010.","journal-title":"International Journal of Plasticity"},{"key":"1805_CR44","doi-asserted-by":"crossref","unstructured":"Sundar, S., & Sundararaghavan, V. (2020). Database development and exploration of process-microstructure relationships using variational autoencoders. Materials Today Communications","DOI":"10.1016\/j.mtcomm.2020.101201"},{"key":"1805_CR45","unstructured":"Sutton, R. S., Barto, A. G. (2018). Reinforcement learning: An introduction. MIT Press."},{"key":"1805_CR46","unstructured":"Thrun, S., & Schwartz, A. (1993) Issues in using function approximation for reinforcement learning. In Proceedings of the 1993 Connectionist Models Summer School Hillsdale. NJ: Lawrence Erlbaum."},{"key":"1805_CR47","doi-asserted-by":"publisher","unstructured":"Tome, C., Canova, G. R., Kocks, U. F., Christodoulou, N., Jonas, J. J. (1984). The relation between macroscopic and microscopic strain hardening in f.c.c. polycrystals. Acta Metallurgica 32(10), 1637\u20131653. https:\/\/doi.org\/10.1016\/0001-6160(84)90222-0","DOI":"10.1016\/0001-6160(84)90222-0"},{"key":"1805_CR48","doi-asserted-by":"crossref","unstructured":"Tran, A., Mitchell, J. A., Swiler, L., & Wildey, T. (2020). An active learning high-throughput microstructure calibration framework for solving inverse structure-process problems in materials informatics. Acta Materialia.","DOI":"10.1016\/j.actamat.2020.04.054"},{"key":"1805_CR49","doi-asserted-by":"crossref","unstructured":"Van\u00a0Hasselt, H., Guez, A., Silver, D.: Deep reinforcement learning with double q-learning. 
In 30th AAAI conference on artificial intelligence (2016)","DOI":"10.1609\/aaai.v30i1.10295"},{"key":"1805_CR50","first-page":"1","volume":"1","author":"S Veeramani","year":"2019","unstructured":"Veeramani, S., Muthuswamy, S., Sagar, K., & Zoppi, M. (2019). Artificial intelligence planners for multi-head path planning of swarmitfix agents. Journal of Intelligent Manufacturing, 1, 1\u201318.","journal-title":"Journal of Intelligent Manufacturing"},{"key":"1805_CR51","doi-asserted-by":"crossref","unstructured":"Virtanen, P., Gommers, R., Oliphant, T. E., Haberland, M., Reddy, T., Cournapeau, D., Burovski, E., Peterson, P., Weckesser, W., Bright, J., & Van der Walt, D. J. (2020) Scipy 1.0: fundamental algorithms for scientific computing in python. Nature Methods 1\u201312 (2020)","DOI":"10.1038\/s41592-020-0772-5"},{"issue":"2","key":"1805_CR52","doi-asserted-by":"publisher","first-page":"325","DOI":"10.1007\/s10845-013-0864-5","volume":"27","author":"X Wang","year":"2016","unstructured":"Wang, X., Wang, H., & Qi, C. (2016). Multi-agent reinforcement learning based maintenance policy for a resource constrained flow line system. Journal of Intelligent Manufacturing, 27(2), 325\u2013333.","journal-title":"Journal of Intelligent Manufacturing"},{"key":"1805_CR53","unstructured":"Wang, Z., Schaul, T., Hessel, M., Van\u00a0Hasselt, H., Lanctot, M., & De\u00a0Freitas, N. (2015) Dueling network architectures for deep reinforcement learning. arXiv preprint arXiv:1511.06581."},{"key":"1805_CR54","doi-asserted-by":"crossref","unstructured":"Watkins, C. J. C. H., & Dayan, P. (1992). Q-learning. Machine Learning, 8(3\u20134), 279\u2013292.","DOI":"10.1023\/A:1022676722315"},{"key":"1805_CR55","doi-asserted-by":"publisher","first-page":"111","DOI":"10.1016\/j.ijplas.2016.01.002","volume":"80","author":"H Zhang","year":"2016","unstructured":"Zhang, H., Diehl, M., Roters, F., & Raabe, D. (2016). 
A virtual laboratory using high resolution crystal plasticity simulations to determine the initial yield surface for sheet metal forming operations. International Journal of Plasticity, 80, 111\u2013138.","journal-title":"International Journal of Plasticity"}],"container-title":["Journal of Intelligent Manufacturing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s10845-021-01805-z.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s10845-021-01805-z\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s10845-021-01805-z.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,1,3]],"date-time":"2023-01-03T06:48:12Z","timestamp":1672728492000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s10845-021-01805-z"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,7,7]]},"references-count":55,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2022,1]]}},"alternative-id":["1805"],"URL":"https:\/\/doi.org\/10.1007\/s10845-021-01805-z","relation":{},"ISSN":["0956-5515","1572-8145"],"issn-type":[{"value":"0956-5515","type":"print"},{"value":"1572-8145","type":"electronic"}],"subject":[],"published":{"date-parts":[[2021,7,7]]},"assertion":[{"value":"5 October 2020","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"16 June 2021","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"7 July 2021","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}}]}}