{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T01:29:57Z","timestamp":1760059797205,"version":"build-2065373602"},"reference-count":71,"publisher":"MDPI AG","issue":"7","license":[{"start":{"date-parts":[[2025,7,11]],"date-time":"2025-07-11T00:00:00Z","timestamp":1752192000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"Ministry of Science, Technological Development, and Innovation of the Republic of Serbia","award":["451-03-136\/2025-03\/200135"],"award-info":[{"award-number":["451-03-136\/2025-03\/200135"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Computation"],"abstract":"<jats:p>Machine learning (ML) is transforming computational chemistry by accelerating molecular simulations, property prediction, and inverse design. Central to this transformation is mathematical optimization, which underpins nearly every stage of model development, from training neural networks and tuning hyperparameters to navigating chemical space for molecular discovery. This review presents a structured overview of optimization techniques used in ML for computational chemistry, including gradient-based methods (e.g., SGD and Adam), probabilistic approaches (e.g., Monte Carlo sampling and Bayesian optimization), and spectral methods. We classify optimization targets into model parameter optimization, hyperparameter selection, and molecular optimization and analyze their application across supervised, unsupervised, and reinforcement learning frameworks. Additionally, we examine key challenges such as data scarcity, limited generalization, and computational cost, outlining how mathematical strategies like active learning, meta-learning, and hybrid physics-informed models can address these issues. By bridging optimization methodology with domain-specific challenges, this review highlights how tailored optimization strategies enhance the accuracy, efficiency, and scalability of ML models in computational chemistry.<\/jats:p>","DOI":"10.3390\/computation13070169","type":"journal-article","created":{"date-parts":[[2025,7,11]],"date-time":"2025-07-11T13:44:19Z","timestamp":1752241459000},"page":"169","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":0,"title":["Mathematical Optimization in Machine Learning for Computational Chemistry"],"prefix":"10.3390","volume":"13","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-9314-8984","authenticated-orcid":false,"given":"Ana","family":"Zeki\u0107","sequence":"first","affiliation":[{"name":"Department of Mathematical Sciences, Faculty of Technology and Metallurgy, University of Belgrade, 11000 Belgrade, Serbia"}]}],"member":"1968","published-online":{"date-parts":[[2025,7,11]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","unstructured":"Bottou, L. (2010, January 22\u201327). Large-Scale Machine Learning with Stochastic Gradient Descent. Proceedings of the COMPSTAT\u20192010, Paris, France.","DOI":"10.1007\/978-3-7908-2604-3_16"},{"key":"ref_2","unstructured":"Sutskever, I., Martens, J., Dahl, G., and Hinton, G. (2013, January 17\u201319). On the importance of initialization and momentum in deep learning. Proceedings of the 30th International Conference on Machine Learning, Atlanta, GA, USA. Available online: https:\/\/proceedings.mlr.press\/v28\/sutskever13.html."},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"058301","DOI":"10.1103\/PhysRevLett.108.058301","article-title":"Fast and Accurate Modeling of Molecular Atomization Energies with Machine Learning","volume":"108","author":"Rupp","year":"2012","journal-title":"Phys. Rev. Lett."},{"key":"ref_4","unstructured":"Ruder, S. (2016). An overview of gradient descent optimization algorithms. arXiv."},{"key":"ref_5","unstructured":"Kingma, D.P., and Ba, J. (2015, January 7\u20139). Adam: A Method for Stochastic Optimization. Proceedings of the 3rd International Conference for Learning Representations, San Diego, CA, USA."},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"13890","DOI":"10.1038\/ncomms13890","article-title":"Quantum-Chemical Insights from Deep Tensor Neural Networks","volume":"8","author":"Arbabzadah","year":"2017","journal-title":"Nat. Commun."},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"513","DOI":"10.1039\/C7SC02664A","article-title":"MoleculeNet: A Benchmark for Molecular Machine Learning","volume":"9","author":"Wu","year":"2018","journal-title":"Chem. Sci."},{"key":"ref_8","unstructured":"Reddi, S.J., Kale, S., and Kumar, S. (2019). On the Convergence of Adam and Beyond. arXiv."},{"key":"ref_9","unstructured":"Zhuang, J., Tang, T., Ding, Y., Wang, S., Liu, Z., Castro, C.D., Dvornek, N., Papademetris, X., and Duncan, J.S. (2020). AdaBelief Optimizer: Adapting Stepsizes by the Belief in Observed Gradients. arXiv."},{"key":"ref_10","unstructured":"Ma, N.Q., Yarats, D., and Kapturowski, S. (2018). Quasi-Hyperbolic Momentum and Adam for Deep Learning. arXiv."},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"Kollmannsberger, S., D\u2019Angella, D., Jokeit, M., and Herrmann, L. (2021). Deep Learning in Computational Mechanics, Springer.","DOI":"10.1007\/978-3-030-76587-3"},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Kiyani, E., Shukla, K., Urb\u00e1n, J.F., Darbon, J., and Karniadakis, G.E. (2025). Which Optimizer Works Best for Physics-Informed Neural Networks and Kolmogorov-Arnold Networks?. arXiv.","DOI":"10.2139\/ssrn.5261377"},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"533","DOI":"10.1038\/323533a0","article-title":"Learning representations by back-propagating errors","volume":"323","author":"Rumelhart","year":"1986","journal-title":"Nature"},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"253002","DOI":"10.1103\/PhysRevLett.108.253002","article-title":"Finding Density Functionals with Machine Learning","volume":"108","author":"Snyder","year":"2012","journal-title":"Phys. Rev. Lett."},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"1087","DOI":"10.1063\/1.1699114","article-title":"Equation of State Calculations by Fast Computing Machines","volume":"21","author":"Metropolis","year":"1953","journal-title":"J. Chem. Phys."},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"041124","DOI":"10.1103\/PhysRevB.102.041124","article-title":"Self-learning hybrid Monte Carlo: A first-principles approach","volume":"102","author":"Nagai","year":"2020","journal-title":"Phys. Rev. B"},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"8861","DOI":"10.1021\/acs.jctc.3c00822","article-title":"Evolutionary Monte Carlo of QM Properties in Chemical Space: Electrolyte Design","volume":"19","author":"Karandashev","year":"2023","journal-title":"J. Chem. Theory Comput."},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"148","DOI":"10.1109\/JPROC.2015.2494218","article-title":"Taking the human out of the loop: A review of Bayesian optimization","volume":"104","author":"Shahriari","year":"2016","journal-title":"Proc. IEEE"},{"key":"ref_19","unstructured":"Hern\u00e1ndez-Lobato, J.M., Requeima, J., Pyzer-Knapp, E.O., and Aspuru-Guzik, A. (2017, January 6\u201311). Parallel and Distributed Thompson Sampling for Large-scale Accelerated Exploration of Chemical Space. Proceedings of the 34th International Conference on Machine Learning, Sydney, Australia. Available online: https:\/\/arxiv.org\/abs\/1706.01825."},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"268","DOI":"10.1021\/acscentsci.7b00572","article-title":"Automatic Chemical Design Using a Data-Driven Continuous Representation of Molecules","volume":"4","year":"2018","journal-title":"ACS Cent. Sci."},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"48","DOI":"10.1186\/s13321-017-0235-x","article-title":"Molecular de-novo design through deep reinforcement learning","volume":"9","author":"Olivecrona","year":"2017","journal-title":"J. Cheminform."},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"957","DOI":"10.1021\/acscentsci.3c00050","article-title":"Accelerated Chemical Reaction Optimization Using Multi-Task Learning","volume":"9","author":"Wigh","year":"2023","journal-title":"ACS Cent. Sci."},{"key":"ref_23","unstructured":"Kipf, T.N., and Welling, M. (2017). Semi-Supervised Classification with Graph Convolutional Networks. arXiv."},{"key":"ref_24","unstructured":"Chung, F.R.K. (1997). Spectral Graph Theory, American Mathematical Society. Available online: https:\/\/bookstore.ams.org\/cbms-92."},{"key":"ref_25","unstructured":"Zhou, D., Bousquet, O., Lal, T.N., Weston, J., and Sch\u00f6lkopf, B. (2004, January 9\u201311). Learning with Local and Global Consistency. Proceedings of the 17th International Conference on Neural Information Processing Systems, Whistler, BC, Canada. Advances in Neural Information Processing Systems."},{"key":"ref_26","first-page":"7569","article-title":"Regularized by Physics: Graph Neural Network Parametrized Differentiable Force Field Models","volume":"18","author":"Riniker","year":"2022","journal-title":"J. Chem. Theory Comput."},{"key":"ref_27","doi-asserted-by":"crossref","first-page":"102026","DOI":"10.1016\/j.apr.2023.102026","article-title":"Aerosol classification by application of machine learning spectral clustering algorithm","volume":"15","author":"Ningombam","year":"2024","journal-title":"Atmos. Pollut. Res."},{"key":"ref_28","doi-asserted-by":"crossref","first-page":"108","DOI":"10.1016\/j.jmgm.2011.12.006","article-title":"Nonlinear dimensionality reduction and mapping of compound libraries for drug discovery","volume":"34","author":"Reutlinger","year":"2012","journal-title":"J. Mol. Graph. Model."},{"key":"ref_29","unstructured":"Gill, J., Chakraborty, R., Gubba, R., Liu, A., Jain, S., Iyer, C., Khwaja, O., and Kumar, S. (2023). Unsupervised Learning of Molecular Embeddings for Enhanced Clustering and Emergent Properties for Chemical Compounds. arXiv."},{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"Yu, S., Dong, H., Wang, P., Wu, C., and Guo, Y. (2018). Generative Creativity: Adversarial Learning for Bionic Design. arXiv.","DOI":"10.1007\/978-3-030-30508-6_42"},{"key":"ref_31","unstructured":"Gilmer, J., Schoenholz, S.S., Riley, P.F., Vinyals, O., and Dahl, G.E. (2017). Neural Message Passing for Quantum Chemistry. arXiv."},{"key":"ref_32","unstructured":"Gasteiger, J., Gro\u00df, J., and G\u00fcnnemann, S. (2020). Directional Message Passing for Molecular Graphs. arXiv."},{"key":"ref_33","first-page":"241722","article-title":"SchNet\u2014A continuous-filter convolutional neural network for modeling quantum interactions","volume":"148","author":"Kindermans","year":"2018","journal-title":"J. Chem. Phys."},{"key":"ref_34","unstructured":"Sch\u00fctt, K.T., Unke, O.T., and Gastegger, M. (2021). Equivariant message passing for the prediction of tensorial properties and molecular spectra. arXiv."},{"key":"ref_35","doi-asserted-by":"crossref","first-page":"2453","DOI":"10.1038\/s41467-022-29939-5","article-title":"E(3)-equivariant graph neural networks for data-efficient and accurate interatomic potentials","volume":"13","author":"Batzner","year":"2022","journal-title":"Nat. Commun."},{"key":"ref_36","doi-asserted-by":"crossref","first-page":"170","DOI":"10.1126\/science.abn3445","article-title":"The central role of density functional theory in the AI age","volume":"81","author":"Huang","year":"2023","journal-title":"Science"},{"key":"ref_37","doi-asserted-by":"crossref","unstructured":"Nandy, A., Duan, C., and Kulik, H.J. (2021). Audacity of huge: Overcoming challenges of data scarcity and data quality for machine learning in computational materials discovery. arXiv.","DOI":"10.1016\/j.coche.2021.100778"},{"key":"ref_38","doi-asserted-by":"crossref","unstructured":"Liu, Y.-Y., and Kashima, H. (2022). Chemical property prediction under experimental biases. Sci. Rep., 12.","DOI":"10.1038\/s41598-022-12116-5"},{"key":"ref_39","doi-asserted-by":"crossref","unstructured":"Demir-Kavuk, O., Kamada, M., Akutsu, T., and Knapp, E.W. (2011). Prediction using step-wise L1, L2 regularization and feature selection for small data sets with large number of features. BMC Bioinform., 12.","DOI":"10.1186\/1471-2105-12-412"},{"key":"ref_40","doi-asserted-by":"crossref","first-page":"1538","DOI":"10.1016\/j.drudis.2018.05.010","article-title":"Machine learning in chemoinformatics and drug discovery","volume":"23","author":"Lo","year":"2018","journal-title":"Drug Discov. Today"},{"key":"ref_41","doi-asserted-by":"crossref","first-page":"249","DOI":"10.1038\/s42004-023-01054-6","article-title":"Variational autoencoder-based chemical latent space for large molecular structures with 3D complexity","volume":"6","author":"Ochiai","year":"2023","journal-title":"Commun. Chem."},{"key":"ref_42","doi-asserted-by":"crossref","first-page":"3564","DOI":"10.1021\/acs.chemmater.9b01294","article-title":"Graph Networks as a Universal Machine Learning Framework for Molecules and Crystals","volume":"31","author":"Chen","year":"2019","journal-title":"Chem. Mater."},{"key":"ref_43","doi-asserted-by":"crossref","first-page":"786","DOI":"10.1038\/s43588-024-00697-2","article-title":"Traversing chemical space with active deep learning for low-data drug discovery","volume":"4","author":"Grisoni","year":"2024","journal-title":"Nat. Comput. Sci."},{"key":"ref_44","doi-asserted-by":"crossref","first-page":"6259","DOI":"10.1021\/acs.jctc.2c00752","article-title":"Chemical Space Exploration with Active Learning and Alchemical Free Energies","volume":"18","author":"Khalak","year":"2022","journal-title":"J. Chem. Theory Comput."},{"key":"ref_45","doi-asserted-by":"crossref","first-page":"1086","DOI":"10.1039\/D3DD00234A","article-title":"Race to the bottom: Bayesian optimisation for chemical problems","volume":"3","author":"Wu","year":"2024","journal-title":"Digit. Discov."},{"key":"ref_46","doi-asserted-by":"crossref","first-page":"e1701816","DOI":"10.1126\/sciadv.1701816","article-title":"Machine learning unifies the modeling of materials and molecules","volume":"3","author":"De","year":"2017","journal-title":"Sci. Adv."},{"key":"ref_47","doi-asserted-by":"crossref","first-page":"697","DOI":"10.1021\/acs.jcim.3c01774","article-title":"Real-World Molecular Out-of-Distribution: Specification and Investigation","volume":"64","author":"Tossou","year":"2024","journal-title":"J. Chem. Inf. Model."},{"key":"ref_48","doi-asserted-by":"crossref","first-page":"e1603","DOI":"10.1002\/wcms.1603","article-title":"A review of molecular representation in the age of machine learning","volume":"12","author":"Wigh","year":"2022","journal-title":"WIREs Comput. Mol. Sci."},{"key":"ref_49","doi-asserted-by":"crossref","first-page":"216","DOI":"10.1021\/acs.jctc.7b01157","article-title":"Machine Learning of Dynamic Electron Correlation Energies from Topological Atoms","volume":"14","author":"McDonagh","year":"2018","journal-title":"J. Chem. Theory Comput."},{"key":"ref_50","doi-asserted-by":"crossref","first-page":"3662","DOI":"10.1021\/acs.jpclett.1c00578","article-title":"Transfer Learning from Simulation to Experimental Data: NMR Chemical Shift Predictions","volume":"12","author":"Han","year":"2021","journal-title":"J. Phys. Chem. Lett."},{"key":"ref_51","unstructured":"Gani, Y., Ustinova, E., Ajakan, H., Germain, P., Larochelle, H., Laviolette, F., March, M., and Lempitsky, V. (2015). Domain-Adversarial Training of Neural Networks. arXiv."},{"key":"ref_52","doi-asserted-by":"crossref","first-page":"2848","DOI":"10.1093\/bioinformatics\/btaa063","article-title":"Domain-adversarial multi-task framework for novel therapeutic property prediction of compounds","volume":"36","author":"Xie","year":"2020","journal-title":"Bioinformatics"},{"key":"ref_53","doi-asserted-by":"crossref","first-page":"23940","DOI":"10.1021\/acsomega.4c02147","article-title":"Meta Learning with Attention Based FP-GNNs for Few-Shot Molecular Property Prediction","volume":"9","author":"Qian","year":"2024","journal-title":"ACS Omega"},{"key":"ref_54","doi-asserted-by":"crossref","first-page":"9816","DOI":"10.1021\/acs.chemrev.1c00107","article-title":"Combining Machine Learning and Computational Chemistry for Predictive Insights Into Chemical Systems","volume":"121","author":"Keith","year":"2021","journal-title":"Chem. Rev."},{"key":"ref_55","unstructured":"Wang, X., and Zhang, M. (2022, January 9\u201312). Graph Neural Network with Local Frame for Molecular Potential Energy Surface. Proceedings of the First Learning on Graphs Conference, Virtual Event. Available online: https:\/\/proceedings.mlr.press\/v198\/wang22d.html."},{"key":"ref_56","doi-asserted-by":"crossref","first-page":"422","DOI":"10.1038\/s42254-021-00314-5","article-title":"Physics-informed machine learning","volume":"3","author":"Karniadakis","year":"2021","journal-title":"Nat. Rev. Phys."},{"key":"ref_57","doi-asserted-by":"crossref","first-page":"5223","DOI":"10.1038\/s41467-020-19093-1","article-title":"Quantum chemical accuracy from density functional approximations via machine learning","volume":"11","author":"Bogojeski","year":"2020","journal-title":"Nat. Commun."},{"key":"ref_58","unstructured":"Snoek, J., Larochelle, H., and Adams, R.P. (2012). Practical Bayesian Optimization of Machine Learning Algorithms. arXiv."},{"key":"ref_59","doi-asserted-by":"crossref","first-page":"732","DOI":"10.1039\/D2DD00028H","article-title":"Bayesian optimization with known experimental and design constraints for chemistry applications","volume":"1","author":"Hickman","year":"2022","journal-title":"Digit. Discov."},{"key":"ref_60","doi-asserted-by":"crossref","first-page":"10975","DOI":"10.1039\/D2NR07147A","article-title":"Bayesian optimisation for efficient material discovery: A mini review","volume":"15","author":"Jin","year":"2023","journal-title":"Nanoscale"},{"key":"ref_61","doi-asserted-by":"crossref","first-page":"1963","DOI":"10.1039\/D0RE00232A","article-title":"Iterative Experimental Design Based on Active Machine Learning Reduces the Experimental Burden Associated with Reaction Screening","volume":"5","author":"Eyke","year":"2020","journal-title":"React. Chem. Eng."},{"key":"ref_62","doi-asserted-by":"crossref","first-page":"148","DOI":"10.1038\/s43588-023-00406-5","article-title":"Uncertainty-driven dynamics for active learning of interatomic potentials","volume":"3","author":"Kulichenko","year":"2023","journal-title":"Nat. Comput. Sci."},{"key":"ref_63","doi-asserted-by":"crossref","first-page":"65","DOI":"10.1038\/s41524-025-01550-4","article-title":"Data-efficient construction of high-fidelity graph deep learning interatomic potentials","volume":"11","author":"Ko","year":"2025","journal-title":"npj Comput. Mater."},{"key":"ref_64","first-page":"554","article-title":"Molecular Sets (MOSES): A Benchmarking Platform for Molecular Generation Models","volume":"2","author":"Polykovskiy","year":"2020","journal-title":"Nat. Mach. Intell."},{"key":"ref_65","doi-asserted-by":"crossref","first-page":"e1608","DOI":"10.1002\/wcms.1608","article-title":"Generative models for molecular discovery: Recent advances and challenges","volume":"12","author":"Bilodeau","year":"2022","journal-title":"Wiley Interdiscip. Rev. Comput. Mol. Sci."},{"key":"ref_66","doi-asserted-by":"crossref","unstructured":"Blanchard, A.E., Zhang, P., Bhowmik, D., Mehta, K., Gounley, J., Reeve, S.T., Irle, S., and Pasini, M.L. (2023). Computational Workflow for Accelerated Molecular Design Using Quantum Chemical Simulations and Deep Learning Models. Commun. Comput. Inf. Sci.","DOI":"10.26434\/chemrxiv-2022-gw1n3"},{"key":"ref_67","doi-asserted-by":"crossref","unstructured":"Fiedler, L., Hoffmann, N., Mohammed, P., Popoola, G.A., Yovell, T., Oles, V., Ellis, J.A., Rajamanickam, S., and Cangi, A. (2022). Training-free hyperparameter optimization of neural networks for electronic structures in matter. arXiv.","DOI":"10.1088\/2632-2153\/ac9956"},{"key":"ref_68","doi-asserted-by":"crossref","first-page":"104","DOI":"10.1038\/s41524-020-00367-7","article-title":"Machine-learned interatomic potentials by active learning: Amorphous and liquid hafnium dioxide","volume":"6","author":"Sivaraman","year":"2020","journal-title":"npj Comput. Mater."},{"key":"ref_69","doi-asserted-by":"crossref","first-page":"163","DOI":"10.3390\/e16010163","article-title":"Enhanced Sampling in Molecular Dynamics Using Metadynamics, Replica-Exchange, and Temperature-Acceleration","volume":"16","author":"Abrams","year":"2014","journal-title":"Entropy"},{"key":"ref_70","unstructured":"Ahmad, W., Simon, E., Chithrananda, S., Grand, G., and Ramsundar, B. (2022). ChemBERTa-2: Towards Chemical Foundation Models. arXiv."},{"key":"ref_71","doi-asserted-by":"crossref","first-page":"6059","DOI":"10.1021\/acscatal.0c04525","article-title":"Open Catalyst 2020 (OC20) Dataset and Community Challenges","volume":"11","author":"Chanussot","year":"2021","journal-title":"ACS Catal."}],"container-title":["Computation"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2079-3197\/13\/7\/169\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,9]],"date-time":"2025-10-09T18:08:40Z","timestamp":1760033320000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2079-3197\/13\/7\/169"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,7,11]]},"references-count":71,"journal-issue":{"issue":"7","published-online":{"date-parts":[[2025,7]]}},"alternative-id":["computation13070169"],"URL":"https:\/\/doi.org\/10.3390\/computation13070169","relation":{},"ISSN":["2079-3197"],"issn-type":[{"type":"electronic","value":"2079-3197"}],"subject":[],"published":{"date-parts":[[2025,7,11]]}}}