{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,21]],"date-time":"2026-02-21T00:53:06Z","timestamp":1771635186096,"version":"3.50.1"},"reference-count":49,"publisher":"ASME International","issue":"4","license":[{"start":{"date-parts":[[2022,11,22]],"date-time":"2022-11-22T00:00:00Z","timestamp":1669075200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.asme.org\/publications-submissions\/publishing-information\/legal-policies"}],"funder":[{"DOI":"10.13039\/100000185","name":"Defense Advanced Research Projects Agency","doi-asserted-by":"publisher","award":["FA8750-20-C-0002"],"award-info":[{"award-number":["FA8750-20-C-0002"]}],"id":[{"id":"10.13039\/100000185","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["asmedigitalcollection.asme.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2023,8,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>Reinforcement learning algorithms can autonomously learn to search a design space for high-performance solutions. However, modern engineering often entails the use of computationally intensive simulation, which can lead to slower design timelines with highly iterative approaches such as reinforcement learning. This work provides a reinforcement learning framework that leverages models of varying fidelity to enable an effective solution search while reducing overall computational needs. Specifically, it utilizes models of varying fidelity while training the agent, iteratively progressing from low- to high fidelity. To demonstrate the effectiveness of the proposed framework, we apply it to two multimodal multi-objective constrained mixed integer nonlinear design problems involving the components of a ground and aerial vehicle. Specifically, for each problem, we utilize a high-fidelity and a low-fidelity deep neural network surrogate model, trained on performance data generated from underlying ground truth models. A tradeoff between solution quality and the proportion of low-fidelity surrogate model usage is observed. Specifically, high-quality solutions are achieved with substantial reductions in computational expense, showcasing the effectiveness of the framework for design problems where the use of just a high-fidelity model is infeasible. This solution quality-computational efficiency tradeoff is contextualized by visualizing the exploration behavior of the design agents.<\/jats:p>","DOI":"10.1115\/1.4056297","type":"journal-article","created":{"date-parts":[[2022,11,22]],"date-time":"2022-11-22T03:10:35Z","timestamp":1669086635000},"update-policy":"https:\/\/doi.org\/10.1115\/crossmarkpolicy-asme","source":"Crossref","is-referenced-by-count":14,"title":["Reinforcement Learning for Efficient Design Space Exploration With Variable Fidelity Analysis Models"],"prefix":"10.1115","volume":"23","author":[{"given":"Akash","family":"Agrawal","sequence":"first","affiliation":[{"name":"Carnegie Mellon University Department of Mechanical Engineering, , 5000 Forbes Avenue, Pittsburgh, PA 15213"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Christopher","family":"McComb","sequence":"additional","affiliation":[{"name":"Carnegie Mellon University Department of Mechanical Engineering, , 5000 Forbes Avenue, Pittsburgh, PA 15213"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"33","published-online":{"date-parts":[[2023,1,9]]},"reference":[{"key":"2024010901101813600_CIT0001","article-title":"Learning to Optimize","volume-title":"arXiv preprint","author":"Li","year":"2016"},{"issue":"11","key":"2024010901101813600_CIT0002","doi-asserted-by":"publisher","first-page":"111401","DOI":"10.1115\/1.4044397","article-title":"A Case Study of Deep Reinforcement Learning for Engineering Design: Application to Microfluidic Devices for Flow Sculpting","volume":"141","author":"Lee","year":"2019","journal-title":"ASME J. Mech. Des."},{"key":"2024010901101813600_CIT0003","doi-asserted-by":"publisher","first-page":"101612","DOI":"10.1016\/j.aei.2022.101612","article-title":"Reinforcement Learning for Engineering Design Automation","volume":"52","author":"Dworschak","year":"2022","journal-title":"Adv. Eng. Inform."},{"issue":"2","key":"2024010901101813600_CIT0004","doi-asserted-by":"publisher","first-page":"021002","DOI":"10.1115\/1.4051598","article-title":"Design Synthesis Through a Markov Decision Process and Reinforcement Learning Framework","volume":"22","author":"Ororbia","year":"2022","journal-title":"ASME J. Comput. Inf. Sci. Eng."},{"key":"2024010901101813600_CIT0005","first-page":"610","article-title":"On the Dangers of Stochastic Parrots: Can Language Models Be Too Big?","author":"Bender","year":"2021"},{"key":"2024010901101813600_CIT0006","article-title":"The Computational Limits of Deep Learning","volume-title":"arXiv preprint","author":"Thompson","year":"2020"},{"key":"2024010901101813600_CIT0007","article-title":"Review of Multi-Fidelity Models","volume-title":"arXiv preprint","author":"Fern\u00e1ndez-Godino","year":"2016"},{"issue":"1","key":"2024010901101813600_CIT0008","doi-asserted-by":"publisher","first-page":"305","DOI":"10.1007\/s00158-017-1756-7","article-title":"Design as a Sequential Decision Process: A Method for Reducing Design Set Space Using Models to Bound Objectives","volume":"57","author":"Miller","year":"2018","journal-title":"Struct. Multidiscipl. Optim."},{"key":"2024010901101813600_CIT0009","doi-asserted-by":"crossref","first-page":"175","DOI":"10.1007\/978-3-319-18320-6_10","volume-title":"Engineering and Applied Sciences Optimization: Dedicated to the Memory of Professor","author":"Mehmani","year":"2015"},{"issue":"9","key":"2024010901101813600_CIT0010","doi-asserted-by":"publisher","first-page":"094501","DOI":"10.1115\/1.4040484","article-title":"Multidisciplinary and Multifidelity Design Optimization of Electric Vehicle Battery Thermal Management System","volume":"140","author":"Wang","year":"2018","journal-title":"ASME J. Mech. Des."},{"key":"2024010901101813600_CIT0011","article-title":"Report from the Fidelity Implementation Study Group","author":"Gross","year":"1999"},{"issue":"1","key":"2024010901101813600_CIT0012","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1093\/biomet\/87.1.1","article-title":"Predicting the Output from a Complex Computer Code When Fast Approximations Are Available","volume":"87","author":"Kennedy","year":"2000","journal-title":"Biometrika"},{"issue":"3","key":"2024010901101813600_CIT0013","doi-asserted-by":"publisher","first-page":"550","DOI":"10.1137\/16M1082469","article-title":"Survey of Multifidelity Methods in Uncertainty Propagation, Inference, and Optimization","volume":"60","author":"Peherstorfer","year":"2018","journal-title":"SIAM Rev."},{"key":"2024010901101813600_CIT0014","author":"Newmark","year":"1981"},{"issue":"3","key":"2024010901101813600_CIT0015","doi-asserted-by":"publisher","first-page":"1679","DOI":"10.1007\/s11069-013-0972-8","article-title":"Seismic Damage Simulation in Urban Areas Based on a High-Fidelity Structural Model and a Physics Engine","volume":"71","author":"Xu","year":"2014","journal-title":"Natural Hazards"},{"issue":"2","key":"2024010901101813600_CIT0016","doi-asserted-by":"publisher","first-page":"49","DOI":"10.1109\/64.585104","article-title":"Configuration-Design Problem Solving","volume":"12","author":"Wielinga","year":"1997","journal-title":"IEEE Expert"},{"key":"2024010901101813600_CIT0017","first-page":"1395","article-title":"Towards a Generic Model of Configuration Tasks","volume":"2","author":"Mittal","year":"1989","journal-title":"IJCAI"},{"key":"2024010901101813600_CIT0018","first-page":"8","article-title":"Design Space Exploration and Manipulation for Cyber Physical Systems","author":"Neema","year":"2014"},{"key":"2024010901101813600_CIT0019","doi-asserted-by":"crossref","DOI":"10.1115\/DETC2013-13098","article-title":"Preference Construction, Sequential Decision Making, and Trade Space Exploration","author":"Miller","year":"2013"},{"issue":"2\u20133","key":"2024010901101813600_CIT0020","doi-asserted-by":"publisher","first-page":"213","DOI":"10.1016\/S0926-5805","article-title":"Satisficing in Engineering Design: Causes, Consequences and Implications for Design Support","volume":"7","author":"Ball","year":"1998","journal-title":"Autom. Constr."},{"key":"2024010901101813600_CIT0021","doi-asserted-by":"crossref","DOI":"10.1057\/978-1-349-95121-5_1767-2","volume-title":"Satisficing. In: The New Palgrave Dictionary of Economics","author":"Simon","year":"2008"},{"issue":"1","key":"2024010901101813600_CIT0022","doi-asserted-by":"publisher","first-page":"012003","DOI":"10.1063\/1.4939512","article-title":"Optimization of Micropillar Sequences for Fluid Flow Sculpting","volume":"28","author":"Stoecklein","year":"2016","journal-title":"Phys. Fluids"},{"issue":"3","key":"2024010901101813600_CIT0023","doi-asserted-by":"publisher","first-page":"1247","DOI":"10.1007\/s10898-012-9951-y","article-title":"Derivative-Free Optimization: A Review of Algorithms and Comparison of Software Implementations","volume":"56","author":"Rios","year":"2013","journal-title":"J. Glob. Optim."},{"key":"2024010901101813600_CIT0024","doi-asserted-by":"publisher","first-page":"1049","DOI":"10.1016\/j.applthermaleng.2017.08.052","article-title":"Choosing the Best Evolutionary Algorithm to Optimize the Multiobjective Shell-and-Tube Heat Exchanger Design Problem Using PROMETHEE","volume":"127","author":"Saldanha","year":"2017","journal-title":"Appl. Therm. Eng."},{"key":"2024010901101813600_CIT0025","volume-title":"Reinforcement Learning: An Introduction","author":"Sutton","year":"2018"},{"key":"2024010901101813600_CIT0026","doi-asserted-by":"crossref","first-page":"110672","DOI":"10.1016\/j.matdes.2022.110672","article-title":"Deep Reinforcement Learning for Engineering Design Through Topology Optimization of Elementally Discretized Design Domains","volume":"218","author":"Brown","year":"2022","journal-title":"Mater. Des."},{"key":"2024010901101813600_CIT0027","first-page":"490","article-title":"AutoCkt: Deep Reinforcement Learning of Analog Circuit Designs","author":"Settaluri","year":"2020"},{"issue":"7","key":"2024010901101813600_CIT0028","doi-asserted-by":"publisher","first-page":"071704","DOI":"10.1115\/1.4053859","article-title":"Deep Generative Models in Engineering Design: A Review","volume":"144","author":"Regenwetter","year":"2022","journal-title":"ASME J. Mech. Des."},{"key":"2024010901101813600_CIT0029","author":"DARPA Information Innovation Office","year":"2019"},{"key":"2024010901101813600_CIT0030","article-title":"Between Progress and Potential Impact of AI: the Neglected Dimensions","author":"Mart\u00ednez-Plumed","year":"2018"},{"key":"2024010901101813600_CIT0031","article-title":"ELF OpenGo: An Analysis and Open Reimplementation of AlphaZero","author":"Tian","year":"2019"},{"key":"2024010901101813600_CIT0032","first-page":"1149","article-title":"Using Continuous Action Spaces to Solve Discrete Problems","author":"van Hasselt","year":"2009"},{"key":"2024010901101813600_CIT0033","doi-asserted-by":"crossref","DOI":"10.1115\/DETC2020-22518","article-title":"Deriving Metamodels to Relate Machine Learning Quality to Design Repository Characteristics in the Context of Additive Manufacturing","author":"Williams","year":"2020"},{"issue":"11","key":"2024010901101813600_CIT0034","doi-asserted-by":"publisher","first-page":"111701","DOI":"10.1115\/1.4044199","article-title":"Design Repository Effectiveness for 3D Convolutional Neural Networks: Application to Additive Manufacturing","volume":"141","author":"Williams","year":"2019","journal-title":"ASME J. Mech. Des."},{"key":"2024010901101813600_CIT0035","doi-asserted-by":"crossref","DOI":"10.1115\/DETC2020-22256","article-title":"Comparing Attribute- and Form-Based Machine Learning Techniques for Component Prediction","author":"Williams","year":"2020"},{"key":"2024010901101813600_CIT0036","first-page":"1946","article-title":"Auto-Keras: An Efficient Neural Architecture Search System","author":"Jin","year":"2019"},{"key":"2024010901101813600_CIT0037","article-title":"Proximal Policy Optimization Algorithms","volume-title":"arXiv preprint","author":"Schulman","year":"2017"},{"key":"2024010901101813600_CIT0038","first-page":"64","article-title":"Comparing Strategies for Visualizing the High-Dimensional Exploration Behavior of CPS Design Agents","author":"Agrawal","year":"2022"},{"key":"2024010901101813600_CIT0039","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1017\/dsj.2019.12","article-title":"KABOOM: An Agent-Based Model for Simulating Cognitive Style in Team Problem Solving","volume":"5","author":"Lapp","year":"2019","journal-title":"Design Sci."},{"issue":"1","key":"2024010901101813600_CIT0040","doi-asserted-by":"publisher","first-page":"011003","DOI":"10.1115\/1.4038158","article-title":"Design of Complex Engineered Systems Using Multi-Agent Coordination","volume":"18","author":"Soria Zurita","year":"2018","journal-title":"ASME J. Comput. Inf. Sci. Eng."},{"issue":"3","key":"2024010901101813600_CIT0041","doi-asserted-by":"publisher","first-page":"761","DOI":"10.1007\/s00182-014-0453-7","article-title":"On the Number of Positions in Chess Without Promotion","volume":"44","author":"Steinerberger","year":"2015","journal-title":"Int. J. Game Theory"},{"key":"2024010901101813600_CIT0042","first-page":"7","article-title":"A Flight Dynamics Model for Exploring the Distributed Electrical EVTOL Cyber Physical Design Space","author":"Walker","year":"2022"},{"issue":"2","key":"2024010901101813600_CIT0043","doi-asserted-by":"publisher","first-page":"230","DOI":"10.1016\/j.aei.2012.12.004","article-title":"Design With Shape Grammars and Reinforcement Learning","volume":"27","author":"Ruiz-Montiel","year":"2013","journal-title":"Adv. Eng. Inform."},{"issue":"7862","key":"2024010901101813600_CIT0044","doi-asserted-by":"publisher","first-page":"207","DOI":"10.1038\/s41586-021-03544-w","article-title":"A Graph Placement Methodology for Fast Chip Design","volume":"594","author":"Mirhoseini","year":"2021","journal-title":"Nature"},{"key":"2024010901101813600_CIT0045","article-title":"Continuous Representation of Molecules Using Graph Variational Autoencoder","volume-title":"arXiv preprint","author":"Tavakoli","year":"2020"},{"issue":"5","key":"2024010901101813600_CIT0046","doi-asserted-by":"publisher","first-page":"427","DOI":"10.1016\/j.destud.2004.06.002","article-title":"Expertise in Design: An Overview","volume":"25","author":"Cross","year":"2004","journal-title":"Design Studies"},{"issue":"5","key":"2024010901101813600_CIT0047","doi-asserted-by":"publisher","first-page":"1521","DOI":"10.1007\/s00158-018-2145-6","article-title":"A Method for Model Selection Using Reinforcement Learning When Viewing Design as a Sequential Decision Process","volume":"59","author":"Chhabra","year":"2019","journal-title":"Struct. Multidiscip. Optim"},{"key":"2024010901101813600_CIT0048","first-page":"593","article-title":"Personalised Specific Curiosity for Computational Design Systems","author":"Grace","year":"2017"},{"key":"2024010901101813600_CIT0049","doi-asserted-by":"publisher","first-page":"161","DOI":"10.1017\/pds.2021.17","article-title":"A Multi-Agent Reinforcement Learning Framework for Intelligent Manufacturing With Autonomous Mobile Robots","volume":"1","author":"Agrawal","year":"2021","journal-title":"Proc. Des. Soc."}],"container-title":["Journal of Computing and Information Science in Engineering"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/asmedigitalcollection.asme.org\/computingengineering\/article-pdf\/23\/4\/041004\/6971823\/jcise_23_4_041004.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/asmedigitalcollection.asme.org\/computingengineering\/article-pdf\/23\/4\/041004\/6971823\/jcise_23_4_041004.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,1,9]],"date-time":"2024-01-09T01:10:34Z","timestamp":1704762634000},"score":1,"resource":{"primary":{"URL":"https:\/\/asmedigitalcollection.asme.org\/computingengineering\/article\/23\/4\/041004\/1150961\/Reinforcement-Learning-for-Efficient-Design-Space"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,1,9]]},"references-count":49,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2023,8,1]]}},"URL":"https:\/\/doi.org\/10.1115\/1.4056297","relation":{},"ISSN":["1530-9827","1944-7078"],"issn-type":[{"value":"1530-9827","type":"print"},{"value":"1944-7078","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,1,9]]},"article-number":"041004"}}