{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,9]],"date-time":"2026-05-09T05:44:50Z","timestamp":1778305490775,"version":"3.51.4"},"reference-count":51,"publisher":"IOP Publishing","issue":"2","license":[{"start":{"date-parts":[[2024,6,3]],"date-time":"2024-06-03T00:00:00Z","timestamp":1717372800000},"content-version":"vor","delay-in-days":2,"URL":"http:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2024,6,3]],"date-time":"2024-06-03T00:00:00Z","timestamp":1717372800000},"content-version":"tdm","delay-in-days":2,"URL":"https:\/\/iopscience.iop.org\/info\/page\/text-and-data-mining"}],"funder":[{"DOI":"10.13039\/100000146","name":"Division of Chemical, Bioengineering, Environmental, and Transport Systems","doi-asserted-by":"crossref","award":["2138938"],"award-info":[{"award-number":["2138938"]}],"id":[{"id":"10.13039\/100000146","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":["iopscience.iop.org"],"crossmark-restriction":false},"short-container-title":["Mach. Learn.: Sci. Technol."],"published-print":{"date-parts":[[2024,6,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>Symbolic regression (SR) can generate interpretable, concise expressions that fit a given dataset, allowing for more human understanding of the structure than black-box approaches. The addition of background knowledge (in the form of symbolic mathematical constraints) allows for the generation of expressions that are meaningful with respect to theory while also being consistent with data. We specifically examine the addition of constraints to traditional genetic algorithm (GA) based SR (PySR) as well as a Markov-chain Monte Carlo (MCMC) based Bayesian SR architecture (Bayesian Machine Scientist), and apply these to rediscovering adsorption equations from experimental, historical datasets. We find that, while hard constraints prevent GA and MCMC SR from searching, soft constraints can lead to improved performance both in terms of search effectiveness and model meaningfulness, with computational costs increasing by about an order of magnitude. If the constraints do not correlate well with the dataset or expected models, they can hinder the search of expressions. We find incorporating these constraints in Bayesian SR (as the Bayesian prior) is better than by modifying the fitness function in the GA.<\/jats:p>","DOI":"10.1088\/2632-2153\/ad4a1e","type":"journal-article","created":{"date-parts":[[2024,5,11]],"date-time":"2024-05-11T02:14:35Z","timestamp":1715393675000},"page":"025057","update-policy":"https:\/\/doi.org\/10.1088\/crossmark-policy","source":"Crossref","is-referenced-by-count":2,"title":["Incorporating background knowledge in symbolic regression using a computer algebra system"],"prefix":"10.1088","volume":"5","author":[{"given":"Charles","family":"Fox","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-9033-1759","authenticated-orcid":false,"given":"Neil D","family":"Tran","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"F Nikki","family":"Nacion","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-6379-9206","authenticated-orcid":false,"given":"Samiha","family":"Sharlin","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-0100-0227","authenticated-orcid":true,"given":"Tyler R","family":"Josephson","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"266","published-online":{"date-parts":[[2024,6,3]]},"reference":[{"key":"mlstad4a1ebib1","author":"Koza","year":"1992"},{"key":"mlstad4a1ebib2","doi-asserted-by":"publisher","first-page":"597","DOI":"10.1021\/accountsmr.1c00244","article-title":"Interpretable and explainable machine learning for materials science and chemistry","volume":"3","author":"Oviedo","year":"2022","journal-title":"Acc. Mater. Res."},{"key":"mlstad4a1ebib3","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1038\/s41524-022-00884-7","article-title":"Explainable machine learning in materials science","volume":"8","author":"Zhong","year":"2022","journal-title":"npj Comput. Mater."},{"key":"mlstad4a1ebib4","doi-asserted-by":"publisher","first-page":"175","DOI":"10.1038\/s41929-022-00744-z","article-title":"Interpretable machine learning for knowledge generation in heterogeneous catalysis","volume":"5","author":"Esterhuizen","year":"2022","journal-title":"Nat. Catal."},{"key":"mlstad4a1ebib5","first-page":"pp 241","article-title":"Application issues of genetic programming in industry","author":"Kordon","year":"2006"},{"key":"mlstad4a1ebib6","doi-asserted-by":"publisher","first-page":"219","DOI":"10.1023\/A:1008132509589","article-title":"A genetic programming approach to rainfall-runoff modelling","volume":"13","author":"Savic","year":"1999","journal-title":"Water Res. Manage."},{"key":"mlstad4a1ebib7","doi-asserted-by":"publisher","first-page":"81","DOI":"10.1126\/science.1165893","article-title":"Distilling free-form natural laws from experimental data","volume":"324","author":"Schmidt","year":"2009","journal-title":"Science"},{"key":"mlstad4a1ebib8","doi-asserted-by":"crossref","DOI":"10.1038\/s41524-019-0249-1","article-title":"Fast, accurate, and transferable many-body interatomic potentials by symbolic regression","author":"Hernandez","year":"2019"},{"key":"mlstad4a1ebib9","doi-asserted-by":"publisher","DOI":"10.1002\/aic.17695","article-title":"Iterative symbolic regression for learning transport equations","volume":"68","author":"Ansari","year":"2022","journal-title":"AIChE J."},{"key":"mlstad4a1ebib10","first-page":"pp 17429","article-title":"Discovering symbolic models from deep learning with inductive biases","author":"Cranmer","year":"2020"},{"key":"mlstad4a1ebib11","doi-asserted-by":"publisher","DOI":"10.1103\/PhysRevMaterials.2.083802","article-title":"SISSO: a compressed-sensing method for identifying the best low-dimensional descriptor in an immensity of offered candidates","volume":"2","author":"Ouyang","year":"2018","journal-title":"Phys. Rev. Mater."},{"key":"mlstad4a1ebib12","doi-asserted-by":"publisher","DOI":"10.1016\/j.compchemeng.2021.107470","article-title":"AI-DARWIN: a first principles-based model discovery engine using machine learning","volume":"154","author":"Chakraborty","year":"2021","journal-title":"Comput. Chem. Eng."},{"key":"mlstad4a1ebib13","author":"Goldberg","year":"1989"},{"key":"mlstad4a1ebib14","doi-asserted-by":"publisher","first-page":"75","DOI":"10.1162\/evco_a_00294","article-title":"Shape-constrained symbolic regression \u2013 improving extrapolation with prior knowledge","volume":"30","author":"Kronberger","year":"2022","journal-title":"Evol. Comput."},{"key":"mlstad4a1ebib15","doi-asserted-by":"publisher","DOI":"10.1016\/j.asoc.2022.109855","article-title":"Shape-constrained multi-objective genetic programming for symbolic regression","volume":"132","author":"Haider","year":"2023","journal-title":"Appl. Soft Comput."},{"key":"mlstad4a1ebib16","doi-asserted-by":"crossref","DOI":"10.3847\/1538-4357\/ad014c","article-title":"Deep symbolic regression for physics guided by units constraints: toward the automated discovery of physical laws","author":"Tenachi","year":"2023"},{"key":"mlstad4a1ebib17","doi-asserted-by":"publisher","first-page":"eaay2631","DOI":"10.1126\/sciadv.aay2631","article-title":"AI Feynman: a physics-inspired method for symbolic regression","volume":"6","author":"Udrescu","year":"2020","journal-title":"Sci. Adv."},{"key":"mlstad4a1ebib18","doi-asserted-by":"publisher","first-page":"1249","DOI":"10.1038\/s41598-023-28328-2","article-title":"A computational framework for physics-informed symbolic regression with straightforward integration of domain knowledge","volume":"13","author":"Simon Keren","year":"2023","journal-title":"Sci. Rep."},{"key":"mlstad4a1ebib19","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1155\/2016\/1021378","article-title":"Using genetic programming with prior formula knowledge to solve symbolic regression problem","volume":"2016","author":"Lu","year":"2016","journal-title":"Comput. Intell. Neurosci."},{"key":"mlstad4a1ebib20","doi-asserted-by":"publisher","DOI":"10.1016\/j.eswa.2021.115210","article-title":"Multi-objective symbolic regression for physics-aware dynamic modeling","volume":"182","author":"Kubal\u00edk","year":"2021","journal-title":"Expert Syst. Appl."},{"key":"mlstad4a1ebib21","article-title":"Active learning in symbolic regression performance with physical constraints","author":"Medina","year":"2023"},{"key":"mlstad4a1ebib22","doi-asserted-by":"publisher","first-page":"590","DOI":"10.1063\/1.475421","article-title":"Fitting potential-energy surfaces: a search in the function space by directed genetic programming","volume":"108","author":"Makarov","year":"1998","journal-title":"J. Chem. Phys."},{"key":"mlstad4a1ebib23","first-page":"pp 300","article-title":"Incorporating a-priori expert knowledge in genetic algorithms","author":"Akbarzadeh-T","year":"1997"},{"key":"mlstad4a1ebib24","first-page":"pp 1091","article-title":"Incorporating expert knowledge in evolutionary search: a study of seeding methods","author":"Schmidt","year":"2009"},{"key":"mlstad4a1ebib25","doi-asserted-by":"publisher","DOI":"10.1002\/aic.17457","article-title":"Deterministic symbolic regression with derivative information: general methodology and application to equations of state","volume":"68","author":"Engle","year":"2022","journal-title":"AIChE J."},{"key":"mlstad4a1ebib26","doi-asserted-by":"publisher","first-page":"eaav6971","DOI":"10.1126\/sciadv.aav6971","article-title":"A Bayesian machine scientist to aid in the solution of challenging scientific problems","volume":"6","author":"Guimer\u00e1","year":"2020","journal-title":"Sci. Adv."},{"key":"mlstad4a1ebib27","doi-asserted-by":"publisher","first-page":"1777","DOI":"10.1038\/s41467-023-37236-y","article-title":"Combining data and theory for derivable scientific discovery with AI-Descartes","volume":"14","author":"Cornelio","year":"2023","journal-title":"Nat. Commun."},{"key":"mlstad4a1ebib28","doi-asserted-by":"publisher","first-page":"pp 15753","DOI":"10.1609\/aaai.v35i18.17873","article-title":"Logic guided genetic algorithms","volume":"vol 35","author":"Ashok","year":"2021"},{"key":"mlstad4a1ebib29","doi-asserted-by":"publisher","first-page":"225","DOI":"10.1016\/j.apenergy.2015.10.011","article-title":"Carbon capture by physical adsorption: materials, experimental investigations and numerical modeling and simulations - a review","volume":"161","author":"Ben-Mansour","year":"2016","journal-title":"Appl. Energy"},{"key":"mlstad4a1ebib30","doi-asserted-by":"publisher","first-page":"1123","DOI":"10.1080\/01496390701242194","article-title":"State of the art adsorption and membrane separation processes for hydrogen production in the chemical and petrochemical industries","volume":"42","author":"Ritter","year":"2007","journal-title":"Sep. Sci. Technol."},{"key":"mlstad4a1ebib31","first-page":"4","article-title":"Remove organics by activated carbon adsorption","volume":"89","author":"Stenzel","year":"1993","journal-title":"Chem. Eng. Prog."},{"key":"mlstad4a1ebib32","author":"Ruthven","year":"1984"},{"key":"mlstad4a1ebib33","doi-asserted-by":"publisher","first-page":"249","DOI":"10.1016\/j.apgeochem.2006.09.010","article-title":"Sorption isotherms: a review on physical bases, modeling and measurement","volume":"22","author":"Limousin","year":"2007","journal-title":"Appl. Geochem."},{"key":"mlstad4a1ebib34","doi-asserted-by":"publisher","first-page":"2","DOI":"10.1016\/j.cej.2009.09.013","article-title":"Insights into the modeling of adsorption isotherm systems","volume":"156","author":"Yuen Foo","year":"2010","journal-title":"Chem. Eng. J."},{"key":"mlstad4a1ebib35","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1155\/2017\/3039817","article-title":"Modelling and interpretation of adsorption isotherms","volume":"2017","author":"Ayawei","year":"2017","journal-title":"J. Chem."},{"key":"mlstad4a1ebib36","doi-asserted-by":"publisher","DOI":"10.1016\/j.chemosphere.2020.127279","article-title":"Adsorption isotherm models: classification, physical meaning, application and solving method","volume":"258","author":"Wang","year":"2020","journal-title":"Chemosphere"},{"key":"mlstad4a1ebib37","author":"Freundlich","year":"1906"},{"key":"mlstad4a1ebib38","doi-asserted-by":"publisher","first-page":"1361","DOI":"10.1021\/ja02242a004","article-title":"The adsorption of gases on plane surfaces of glass, mica and platinum","volume":"40","author":"Langmuir","year":"1918","journal-title":"J. Am. Chem. Soc."},{"key":"mlstad4a1ebib39","doi-asserted-by":"publisher","first-page":"309","DOI":"10.1021\/ja01269a023","article-title":"Adsorption of gases in multimolecular layers","volume":"60","author":"Brunauer","year":"1938","journal-title":"J. Am. Chem. Soc."},{"key":"mlstad4a1ebib40","doi-asserted-by":"publisher","first-page":"490","DOI":"10.1063\/1.1746922","article-title":"On the structure of a catalyst surface","volume":"16","author":"Sips","year":"1948","journal-title":"J. Chem. Phys."},{"key":"mlstad4a1ebib41","doi-asserted-by":"publisher","first-page":"1887","DOI":"10.1002\/aic.690341114","article-title":"Rigorous thermodynamic treatment of gas adsorption","volume":"34","author":"Talu","year":"1988","journal-title":"AIChE J."},{"key":"mlstad4a1ebib42","doi-asserted-by":"publisher","first-page":"228","DOI":"10.1006\/jcis.1996.4562","article-title":"Some consequences of the application of incorrect gas\/solid adsorption isotherm equations","volume":"185","author":"Toth","year":"1997","journal-title":"J. Colloid Interface Sci."},{"key":"mlstad4a1ebib43","article-title":"MilesCranmer\/PySR: v0.6.0","author":"Cranmer","year":"2021"},{"key":"mlstad4a1ebib44","doi-asserted-by":"publisher","first-page":"p 162","DOI":"10.1109\/IPDPS.2004.1303155","article-title":"Parallel genetic algorithms: advances, computing trends, applications and perspectives","author":"Konfrst","year":"2004"},{"key":"mlstad4a1ebib45","doi-asserted-by":"publisher","first-page":"e103","DOI":"10.7717\/peerj-cs.103","article-title":"Sympy: symbolic computing in python","volume":"3","author":"Meurer","year":"2017","journal-title":"PeerJ Comput. Sci."},{"key":"mlstad4a1ebib46","article-title":"Interpretable machine learning for science with PySR and symbolic regression.jl","author":"Cranmer","year":"2023"},{"key":"mlstad4a1ebib47","doi-asserted-by":"publisher","first-page":"5599","DOI":"10.1021\/ja974336t","article-title":"Adsorption of linear and branched alkanes in the zeolite silicalite-1","volume":"120","author":"Vlugt","year":"1998","journal-title":"J. Am. Chem. Soc."},{"key":"mlstad4a1ebib48","doi-asserted-by":"publisher","first-page":"1102","DOI":"10.1021\/jp982736c","article-title":"Molecular simulations of adsorption isotherms for linear and branched alkanes and their mixtures in silicalite","volume":"103","author":"Vlugt","year":"1999","journal-title":"J. Phys. Chem. B"},{"key":"mlstad4a1ebib49","doi-asserted-by":"publisher","first-page":"305","DOI":"10.1002\/(SICI)1234-981X(199707)5:33.0.CO;2-4","article-title":"improving ratings\u2019: audit in the British university system","volume":"5","author":"Strathern","year":"1997","journal-title":"Eur. Rev."},{"key":"mlstad4a1ebib50","first-page":"pp 285","article-title":"The identity problem for elementary functions and constants","author":"Richardson","year":"1994"},{"key":"mlstad4a1ebib51","article-title":"Underspecification presents challenges for credibility in modern machine learning","author":"D\u2019Amour","year":"2020"}],"container-title":["Machine Learning: Science and Technology"],"original-title":[],"link":[{"URL":"https:\/\/iopscience.iop.org\/article\/10.1088\/2632-2153\/ad4a1e","content-type":"text\/html","content-version":"am","intended-application":"text-mining"},{"URL":"https:\/\/iopscience.iop.org\/article\/10.1088\/2632-2153\/ad4a1e\/pdf","content-type":"application\/pdf","content-version":"am","intended-application":"text-mining"},{"URL":"https:\/\/iopscience.iop.org\/article\/10.1088\/2632-2153\/ad4a1e","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/iopscience.iop.org\/article\/10.1088\/2632-2153\/ad4a1e\/pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/iopscience.iop.org\/article\/10.1088\/2632-2153\/ad4a1e\/pdf","content-type":"application\/pdf","content-version":"am","intended-application":"syndication"},{"URL":"https:\/\/iopscience.iop.org\/article\/10.1088\/2632-2153\/ad4a1e\/pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/iopscience.iop.org\/article\/10.1088\/2632-2153\/ad4a1e\/pdf","content-type":"application\/pdf","content-version":"am","intended-application":"similarity-checking"},{"URL":"https:\/\/iopscience.iop.org\/article\/10.1088\/2632-2153\/ad4a1e\/pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,6,3]],"date-time":"2024-06-03T03:34:39Z","timestamp":1717385679000},"score":1,"resource":{"primary":{"URL":"https:\/\/iopscience.iop.org\/article\/10.1088\/2632-2153\/ad4a1e"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,6,1]]},"references-count":51,"journal-issue":{"issue":"2","published-online":{"date-parts":[[2024,6,3]]},"published-print":{"date-parts":[[2024,6,1]]}},"URL":"https:\/\/doi.org\/10.1088\/2632-2153\/ad4a1e","relation":{},"ISSN":["2632-2153"],"issn-type":[{"value":"2632-2153","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,6,1]]},"assertion":[{"value":"Incorporating background knowledge in symbolic regression using a computer algebra system","name":"article_title","label":"Article Title"},{"value":"Machine Learning: Science and Technology","name":"journal_title","label":"Journal Title"},{"value":"paper","name":"article_type","label":"Article Type"},{"value":"\u00a9 2024 The Author(s). Published by IOP Publishing Ltd","name":"copyright_information","label":"Copyright Information"},{"value":"2023-07-10","name":"date_received","label":"Date Received","group":{"name":"publication_dates","label":"Publication dates"}},{"value":"2024-05-10","name":"date_accepted","label":"Date Accepted","group":{"name":"publication_dates","label":"Publication dates"}},{"value":"2024-06-03","name":"date_epub","label":"Online publication date","group":{"name":"publication_dates","label":"Publication dates"}}]}}