{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T01:15:27Z","timestamp":1760145327529,"version":"build-2065373602"},"reference-count":35,"publisher":"MDPI AG","issue":"3","license":[{"start":{"date-parts":[[2024,7,12]],"date-time":"2024-07-12T00:00:00Z","timestamp":1720742400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"Ministero dell\u2019Universit\u00e0 e della Ricerca (MUR)","award":["1561"],"award-info":[{"award-number":["1561"]}]},{"name":"European Union\u2014NextGenerationEU","award":["1561"],"award-info":[{"award-number":["1561"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["MAKE"],"abstract":"<jats:p>We assume that a sufficiently large database is available, where a physical property of interest and a number of associated ruling primitive variables or observables are stored. We introduce and test two machine learning approaches to discover possible groups or combinations of primitive variables, regardless of data origin, being it numerical or experimental: the first approach is based on regression models, whereas the second on classification models. The variable group (here referred to as the new effective good variable) can be considered as successfully found when the physical property of interest is characterized by the following effective invariant behavior: in the first method, invariance of the group implies invariance of the property up to a given accuracy; in the other method, upon partition of the physical property values into two or more classes, invariance of the group implies invariance of the class. For the sake of illustration, the two methods are successfully applied to two popular empirical correlations describing the convective heat transfer phenomenon and to the Newton\u2019s law of universal gravitation.<\/jats:p>","DOI":"10.3390\/make6030077","type":"journal-article","created":{"date-parts":[[2024,7,16]],"date-time":"2024-07-16T11:31:57Z","timestamp":1721129517000},"page":"1597-1618","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":2,"title":["Learning Effective Good Variables from Physical Data"],"prefix":"10.3390","volume":"6","author":[{"ORCID":"https:\/\/orcid.org\/0009-0007-9112-3186","authenticated-orcid":false,"given":"Giulio","family":"Barletta","sequence":"first","affiliation":[{"name":"Department of Energy, Politecnico di Torino, C.so Duca degli Abruzzi 24, 10129 Torino, Italy"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-0601-6292","authenticated-orcid":false,"given":"Giovanni","family":"Trezza","sequence":"additional","affiliation":[{"name":"Department of Energy, Politecnico di Torino, C.so Duca degli Abruzzi 24, 10129 Torino, Italy"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-6165-7434","authenticated-orcid":false,"given":"Eliodoro","family":"Chiavazzo","sequence":"additional","affiliation":[{"name":"Department of Energy, Politecnico di Torino, C.so Duca degli Abruzzi 24, 10129 Torino, Italy"}]}],"member":"1968","published-online":{"date-parts":[[2024,7,12]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","unstructured":"Rappaz, M., Bellet, M., Deville, M.O., and Snyder, R. (2003). Numerical Modeling in Materials Science and Engineering, Springer.","DOI":"10.1007\/978-3-642-11821-0"},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"433","DOI":"10.1038\/s43588-022-00281-6","article-title":"Automated discovery of fundamental variables hidden in experimental data","volume":"2","author":"Chen","year":"2022","journal-title":"Nat. Comput. Sci."},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"1113","DOI":"10.1038\/s42256-022-00575-4","article-title":"Data-driven discovery of intrinsic dynamics","volume":"4","author":"Floryan","year":"2022","journal-title":"Nat. Mach. Intell."},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"185","DOI":"10.1007\/s11023-022-09619-5","article-title":"How a Minimal Learning Agent can Infer the Existence of Unobserved Variables in a Complex Environment","volume":"33","author":"Eva","year":"2023","journal-title":"Minds Mach."},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"1751","DOI":"10.1016\/j.jcp.2011.11.007","article-title":"Approximation of slow and fast dynamics in multiscale dynamical systems by the linearized Relaxation Redistribution Method","volume":"231","author":"Chiavazzo","year":"2012","journal-title":"J. Comput. Phys."},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"5535","DOI":"10.1016\/j.jcp.2008.02.006","article-title":"Quasi-equilibrium grid algorithm: Geometric construction for model reduction","volume":"227","author":"Chiavazzo","year":"2008","journal-title":"J. Comput. Phys."},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"59","DOI":"10.1080\/14786449208620167","article-title":"On the question of the stability of the flow of fluids","volume":"34","author":"Rayleigh","year":"1892","journal-title":"Lond. Edinb. Dublin Philos. Mag. J. Sci."},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"345","DOI":"10.1103\/PhysRev.4.345","article-title":"On physically similar systems; illustrations of the use of dimensional equations","volume":"4","author":"Buckingham","year":"1914","journal-title":"Phys. Rev."},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"117","DOI":"10.1016\/0024-3795(82)90229-4","article-title":"Dimensional analysis and the pi theorem","volume":"47","author":"Curtis","year":"1982","journal-title":"Linear Algebra Its Appl."},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"E5494","DOI":"10.1073\/pnas.1621481114","article-title":"Intrinsic map dynamics exploration for uncharted effective free-energy landscapes","volume":"114","author":"Chiavazzo","year":"2017","journal-title":"Proc. Natl. Acad. Sci. USA"},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"112","DOI":"10.3390\/pr2010112","article-title":"Reduced models in chemical kinetics via nonlinear data-mining","volume":"2","author":"Chiavazzo","year":"2014","journal-title":"Processes"},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"109864","DOI":"10.1016\/j.jcp.2020.109864","article-title":"Data-driven model reduction, Wiener projections, and the Koopman-Mori-Zwanzig formalism","volume":"424","author":"Lin","year":"2021","journal-title":"J. Comput. Phys."},{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"McRee, R.K. (2010, January 7\u201311). Symbolic regression using nearest neighbor indexing. Proceedings of the 12th Annual Conference Companion on Genetic and Evolutionary Computation, New York, NY, USA.","DOI":"10.1145\/1830761.1830841"},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Stijven, S., Minnebo, W., and Vladislavleva, K. (2011, January 12\u201316). Separating the wheat from the chaff: On feature selection and feature importance in regression random forests and symbolic regression. Proceedings of the 13th Annual Conference Companion on Genetic and Evolutionary Computation, Dublin, Ireland.","DOI":"10.1145\/2001858.2002059"},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"McConaghy, T. (2011). FFX: Fast, scalable, deterministic symbolic regression technology. Genetic Programming Theory and Practice IX, Springer.","DOI":"10.1007\/978-1-4614-1770-5_13"},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"Arnaldo, I., O\u2019Reilly, U.M., and Veeramachaneni, K. (2015, January 11\u201315). Building predictive models via feature synthesis. Proceedings of the 2015 Annual Conference on Genetic and Evolutionary Computation, New York, NY, USA.","DOI":"10.1145\/2739480.2754693"},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"3932","DOI":"10.1073\/pnas.1517384113","article-title":"Discovering governing equations from data by sparse identification of nonlinear dynamical systems","volume":"113","author":"Brunton","year":"2016","journal-title":"Proc. Natl. Acad. Sci. USA"},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"063116","DOI":"10.1063\/1.5027470","article-title":"Sparse identification of nonlinear dynamics for rapid model recovery","volume":"28","author":"Quade","year":"2018","journal-title":"Chaos Interdiscip. J. Nonlinear Sci."},{"key":"ref_19","unstructured":"Searson, D.P., Leahy, D.E., and Willis, M.J. (2010, January 17\u201319). GPTIPS: An open source genetic programming toolbox for multigene symbolic regression. Proceedings of the International Multiconference of Engineers and Computer Scientists, Hong Kong, China. Citeseer."},{"key":"ref_20","unstructured":"Dub\u010d\u00e1kov\u00e1, R. (2024, May 05). Eureqa: Software Review. Available online: https:\/\/www.researchgate.net\/publication\/220286070_Eureqa_software_review."},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"81","DOI":"10.1126\/science.1165893","article-title":"Distilling free-form natural laws from experimental data","volume":"324","author":"Schmidt","year":"2009","journal-title":"Science"},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"eaay2631","DOI":"10.1126\/sciadv.aay2631","article-title":"AI Feynman: A physics-inspired method for symbolic regression","volume":"6","author":"Udrescu","year":"2020","journal-title":"Sci. Adv."},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"106579","DOI":"10.1016\/j.mtcomm.2023.106579","article-title":"Leveraging composition-based energy material descriptors for machine learning models","volume":"36","author":"Trezza","year":"2023","journal-title":"Mater. Today Commun."},{"key":"ref_24","doi-asserted-by":"crossref","first-page":"15648","DOI":"10.1021\/jacs.4c01305","article-title":"Multi-Variable Multi-Metric Optimization of Self-Assembled Photocatalytic CO2 Reduction Performance Using Machine Learning Algorithms","volume":"146","author":"Bonke","year":"2024","journal-title":"J. Am. Chem. Soc."},{"key":"ref_25","first-page":"4765","article-title":"A unified approach to interpreting model predictions","volume":"30","author":"Lundberg","year":"2017","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"ref_26","doi-asserted-by":"crossref","first-page":"2269","DOI":"10.1109\/TETCI.2024.3369407","article-title":"Genetic Programming for Feature Selection Based on Feature Removal Impact in High-Dimensional Symbolic Regression","volume":"8","author":"Chen","year":"2024","journal-title":"IEEE Trans. Emerg. Top. Comput. Intell."},{"key":"ref_27","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1137\/S1064827595289108","article-title":"A subspace, interior, and conjugate gradient method for large-scale bound-constrained minimization problems","volume":"21","author":"Branch","year":"1999","journal-title":"SIAM J. Sci. Comput."},{"key":"ref_28","first-page":"99","article-title":"On a measure of divergence between two statistical populations defined by their probability distribution","volume":"35","author":"Bhattacharyya","year":"1943","journal-title":"Bull. Calcutta Math. Soc."},{"key":"ref_29","first-page":"401","article-title":"On a measure of divergence between two multinomial populations","volume":"7","author":"Bhattacharyya","year":"1946","journal-title":"Sankhy\u0101 Indian J. Stat."},{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"Villani, C. (2009). Optimal Transport: Old and New, Springer.","DOI":"10.1007\/978-3-540-71050-9"},{"key":"ref_31","doi-asserted-by":"crossref","unstructured":"Lide, D.R., and Kehiaian, H.V. (2020). CRC Handbook of Thermophysical and Thermochemical Data, CRC Press.","DOI":"10.1201\/9781003067719"},{"key":"ref_32","doi-asserted-by":"crossref","first-page":"261","DOI":"10.1038\/s41592-019-0686-2","article-title":"SciPy 1.0: Fundamental Algorithms for Scientific Computing in Python","volume":"17","author":"Virtanen","year":"2020","journal-title":"Nat. Methods"},{"key":"ref_33","doi-asserted-by":"crossref","first-page":"21356","DOI":"10.1039\/D0TA00143K","article-title":"Recent progress in morphology optimization in perovskite solar cell","volume":"8","author":"Tailor","year":"2020","journal-title":"J. Mater. Chem. A"},{"key":"ref_34","doi-asserted-by":"crossref","first-page":"288","DOI":"10.1016\/j.jcp.2012.08.013","article-title":"Simulation-based optimal Bayesian experimental design for nonlinear systems","volume":"232","author":"Huan","year":"2013","journal-title":"J. Comput. Phys."},{"key":"ref_35","doi-asserted-by":"crossref","first-page":"108405","DOI":"10.1016\/j.cpc.2022.108405","article-title":"Bayesian optimization package: PHYSBO","volume":"278","author":"Motoyama","year":"2022","journal-title":"Comput. Phys. Commun."}],"container-title":["Machine Learning and Knowledge Extraction"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2504-4990\/6\/3\/77\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T15:15:56Z","timestamp":1760109356000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2504-4990\/6\/3\/77"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,7,12]]},"references-count":35,"journal-issue":{"issue":"3","published-online":{"date-parts":[[2024,9]]}},"alternative-id":["make6030077"],"URL":"https:\/\/doi.org\/10.3390\/make6030077","relation":{},"ISSN":["2504-4990"],"issn-type":[{"type":"electronic","value":"2504-4990"}],"subject":[],"published":{"date-parts":[[2024,7,12]]}}}