{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2023,2,20]],"date-time":"2023-02-20T23:53:13Z","timestamp":1676937193583},"reference-count":27,"publisher":"ASME International","issue":"4","content-domain":{"domain":["asmedigitalcollection.asme.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2007,12,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>In traditional optimal control and design problems, the control gains and design parameters are usually derived to minimize a cost function reflecting the system performance and control effort. One major challenge of such approaches is the selection of weighting matrices in the cost function, which are usually determined via trial-and-error and human intuition. While various techniques have been proposed to automate the weight selection process, they either can not address complex design problems or suffer from slow convergence rate and high computational costs. We propose a layered approach based on Q-learning, a reinforcement learning technique, on top of genetic algorithms (GA) to determine the best weightings for optimal control and design problems. The layered approach allows for reuse of knowledge. Knowledge obtained via Q-learning in a design problem can be used to speed up the convergence rate of a similar design problem. Moreover, the layered approach allows for solving optimizations that cannot be solved by GA alone. To test the proposed method, we perform numerical experiments on a sample active-passive hybrid vibration control problem, namely adaptive structures with active-passive hybrid piezoelectric networks. 
These numerical experiments show that the proposed Q-learning scheme is a promising approach for automation of weight selection for complex design problems.<\/jats:p>","DOI":"10.1115\/1.2739502","type":"journal-article","created":{"date-parts":[[2007,11,29]],"date-time":"2007-11-29T23:24:02Z","timestamp":1196378642000},"page":"302-308","update-policy":"http:\/\/dx.doi.org\/10.1115\/crossmarkpolicy-asme","source":"Crossref","is-referenced-by-count":6,"title":["Using Q-Learning and Genetic Algorithms to Improve the Efficiency of Weight Adjustments for Optimal Control and Design Problems"],"prefix":"10.1115","volume":"7","author":[{"given":"Kaivan","family":"Kamali","sequence":"first","affiliation":[{"name":"Laboratory for Intelligent Agents, College of Information Sciences and Technology, The Pennsylvania State University, University Park, PA 16802"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"L. J.","family":"Jiang","sequence":"additional","affiliation":[{"name":"Structural Dynamics and Control Laboratory, Department of Mechanical and Nuclear Engineering, The Pennsylvania State University, University Park, PA 16802"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"John","family":"Yen","sequence":"additional","affiliation":[{"name":"Laboratory for Intelligent Agents, College of Information Sciences and Technology, The Pennsylvania State University, University Park, PA 16802"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"K. 
W.","family":"Wang","sequence":"additional","affiliation":[{"name":"Structural Dynamics and Control Laboratory, Department of Mechanical and Nuclear Engineering, The Pennsylvania State University, University Park, PA 16802"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"33","published-online":{"date-parts":[[2007,4,8]]},"reference":[{"key":"2021070115242560800_c1","doi-asserted-by":"crossref","volume-title":"Linear Optimal Control Systems","author":"Kwakernaak","DOI":"10.1115\/1.3426828"},{"issue":"1","key":"2021070115242560800_c2","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1006\/jsvi.1998.1841","article-title":"On the Structural Damping Characteristics of Active Piezoelectric Actuators With Passive Shunt","volume":"221","author":"Tsai","journal-title":"J. Sound Vib.","ISSN":"http:\/\/id.crossref.org\/issn\/0022-460X","issn-type":"print"},{"issue":"4","key":"2021070115242560800_c3","doi-asserted-by":"publisher","first-page":"794","DOI":"10.1088\/0964-1726\/10\/4\/325","article-title":"Active-Passive Hybrid Piezoelectric Networks for Vibration Control: Comparisons and Improvement","volume":"10","author":"Tang","journal-title":"Smart Mater. Struct.","ISSN":"http:\/\/id.crossref.org\/issn\/0964-1726","issn-type":"print"},{"key":"2021070115242560800_c4","first-page":"187","article-title":"Structural Vibration Control via Piezoelectric Materials With Active-Passive Hybrid Networks","volume-title":"Proceedings of the 1994 International Mechanical Engineering Congress and Exposition","author":"Kahn"},{"issue":"4","key":"2021070115242560800_c5","doi-asserted-by":"publisher","first-page":"698","DOI":"10.1088\/0964-1726\/13\/4\/007","article-title":"Active\/Passive Reduction of Vibration of Periodic One-Dimensional Structures Using Piezoelectric Actuators","volume":"13","author":"Singh","journal-title":"Smart Mater. 
Struct.","ISSN":"http:\/\/id.crossref.org\/issn\/0964-1726","issn-type":"print"},{"issue":"4","key":"2021070115242560800_c6","doi-asserted-by":"crossref","first-page":"482","DOI":"10.1177\/1045389X9500600405","article-title":"Development of a Modal Model for Simultaneous Active and Passive Piezoelectric Vibration Suppression","volume":"6","author":"Agnes","journal-title":"J. Intell. Mater. Syst. Struct.","ISSN":"http:\/\/id.crossref.org\/issn\/1045-389X","issn-type":"print"},{"issue":"2","key":"2021070115242560800_c7","doi-asserted-by":"crossref","first-page":"149","DOI":"10.1002\/j.1099-1514.1995.tb00011.x","article-title":"Design of Optimal Control Systems With Eigenvalue Placement in a Specified Region","volume":"16","author":"Arar","journal-title":"Opt. Control Appl. Methods","ISSN":"http:\/\/id.crossref.org\/issn\/0143-2087","issn-type":"print"},{"issue":"4","key":"2021070115242560800_c8","doi-asserted-by":"crossref","first-page":"714","DOI":"10.2514\/3.49018","article-title":"Optimal Selection of Weighting Matrices in Integrated Design Of Structures\/Controls","volume":"31","author":"Sunar","journal-title":"AIAA J.","ISSN":"http:\/\/id.crossref.org\/issn\/0001-1452","issn-type":"print"},{"issue":"1","key":"2021070115242560800_c9","doi-asserted-by":"crossref","first-page":"9","DOI":"10.1016\/0045-7906(93)90028-P","article-title":"Finding the Best Optimal Control Using Global Search","volume":"19","author":"Stuckman","journal-title":"Comput. Electr. 
Eng.","ISSN":"http:\/\/id.crossref.org\/issn\/0045-7906","issn-type":"print"},{"key":"2021070115242560800_c10","first-page":"1331","article-title":"An Approach for Selecting the Weighting Matrices of lq Optimal Controller Design Based on Genetic Algorithms","volume-title":"Proceedings of the IEEE Conference on Decision and Control","author":"Zhang"},{"issue":"4","key":"2021070115242560800_c11","first-page":"263","article-title":"Optimal Feedback Control Design Using Genetic Algorithm in Multimachine Power System","volume":"23","author":"Robandi","journal-title":"Am. J. Optom. Physiol. Opt.","ISSN":"http:\/\/id.crossref.org\/issn\/0093-7002","issn-type":"print"},{"issue":"1","key":"2021070115242560800_c12","first-page":"454","article-title":"Space Structure Control Design by Variance Assignment","volume":"8","author":"Skelton","journal-title":"J. Guidance"},{"issue":"1","key":"2021070115242560800_c13","doi-asserted-by":"publisher","first-page":"341","DOI":"10.1137\/S0363012994263974","article-title":"A Convergent Algorithm for the Output Covariance Constraint Control Problem","volume":"35","author":"Zhu","journal-title":"SIAM J. Control Optim.","ISSN":"http:\/\/id.crossref.org\/issn\/0363-0129","issn-type":"print"},{"key":"2021070115242560800_c14","first-page":"4044","article-title":"Fuzzy quadratic Weights for Variance Constrained lqg Design","volume-title":"Proceedings of the IEEE Conference on Decision and Control","author":"Collins"},{"issue":"1","key":"2021070115242560800_c15","doi-asserted-by":"publisher","first-page":"32","DOI":"10.1109\/87.974336","article-title":"A Fuzzy Logic Approach to lqg Design With Variance Constraints","volume":"10","author":"Collins","journal-title":"IEEE Trans. Control Syst. 
Technol.","ISSN":"http:\/\/id.crossref.org\/issn\/1063-6536","issn-type":"print"},{"issue":"4","key":"2021070115242560800_c16","doi-asserted-by":"publisher","first-page":"15","DOI":"10.1016\/0005-1098(84)90061-X","article-title":"Constrained Linear Quadratic Control With Process Application","volume":"20","author":"Makila","journal-title":"Automatica","ISSN":"http:\/\/id.crossref.org\/issn\/0005-1098","issn-type":"print"},{"key":"2021070115242560800_c17","volume-title":"Markov Decision Processes: Discrete Stochastic Dynamic Programming","author":"Puterman"},{"key":"2021070115242560800_c18","doi-asserted-by":"crossref","volume-title":"Reinforcement Learning: An Introduction","author":"Sutton","DOI":"10.1007\/978-1-4615-3618-5"},{"key":"2021070115242560800_c19","volume-title":"Dynamic Programming","author":"Bellman"},{"key":"2021070115242560800_c20","volume-title":"Dynamic Programming and Markov Processes","author":"Howard"},{"issue":"3","key":"2021070115242560800_c21","doi-asserted-by":"publisher","first-page":"58","DOI":"10.1145\/203330.203343","article-title":"Temporal Difference Learning and td-Gammon","volume":"38","author":"Tesauro","journal-title":"Commun. ACM","ISSN":"http:\/\/id.crossref.org\/issn\/0001-0782","issn-type":"print"},{"key":"2021070115242560800_c22","first-page":"539","article-title":"Learning and Sequential Decision Making","volume-title":"Learning and Computational Neuroscience","author":"Barto"},{"issue":"1","key":"2021070115242560800_c23","first-page":"237","article-title":"Reinforcement Learning: A Survey","volume":"4","author":"Kaelbling","journal-title":"J. Artif. Intell. Res.","ISSN":"http:\/\/id.crossref.org\/issn\/1076-9757","issn-type":"print"},{"key":"2021070115242560800_c24","unstructured":"Watkins, C. J., 1989, Learning with delayed rewards, Ph.D. thesis, Cambridge University, Cambridge, UK."},{"key":"2021070115242560800_c25","unstructured":"Dejong, K., 1975, Analysis of the Behavior of a Class of Genetic Adaptive Systems, Ph.D. thesis, University of Michigan, Ann Arbor."},{"issue":"2","key":"2021070115242560800_c26","doi-asserted-by":"publisher","first-page":"173","DOI":"10.1109\/3477.662758","article-title":"A Hybrid Approach to Modeling Metabolic Systems Using Genetic Algorithms and Simplex Method","volume":"28","author":"Yen","journal-title":"IEEE Trans. Syst., Man, Cybern., Part B: Cybern.","ISSN":"http:\/\/id.crossref.org\/issn\/1083-4419","issn-type":"print"},{"key":"2021070115242560800_c27","unstructured":"Dixon, K., Malak, R., and Khosla, P., 2002, \u201cIncorporating Prior Knowledge and Previously Learned Information Into Reinforcement Learning Agents,\u201d Tech. Rep. 1, Carnegie Mellon University, Pittsburgh, PA."}],"container-title":["Journal of Computing and Information Science in Engineering"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/asmedigitalcollection.asme.org\/computingengineering\/article-pdf\/7\/4\/302\/6725846\/302_1.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"http:\/\/asmedigitalcollection.asme.org\/computingengineering\/article-pdf\/7\/4\/302\/6725846\/302_1.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2021,7,1]],"date-time":"2021-07-01T15:25:01Z","timestamp":1625153101000},"score":1,"resource":{"primary":{"URL":"https:\/\/asmedigitalcollection.asme.org\/computingengineering\/article\/7\/4\/302\/475203\/Using-Q-Learning-and-Genetic-Algorithms-to-Improve"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2007,4,8]]},"references-count":27,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2007,12,1]]}},"URL":"https:\/\/doi.org\/10.1115\/1.2739502","relation":{},"ISSN":["1530-9827","1944-7078"],"issn-type":[{"value":"1530-9827","type":"print"},{"value":"1944-7078","type":"electronic"}],"subject":[],"published":{"date-parts":[[2007,4,8]]}}}