{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,6]],"date-time":"2026-02-06T00:40:54Z","timestamp":1770338454451,"version":"3.49.0"},"reference-count":76,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2025,2,25]],"date-time":"2025-02-25T00:00:00Z","timestamp":1740441600000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2025,2,25]],"date-time":"2025-02-25T00:00:00Z","timestamp":1740441600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["J Cheminform"],"abstract":"<jats:title>Abstract<\/jats:title>\n          <jats:p>Machine learning is quickly becoming integral to drug discovery pipelines, particularly quantitative structure-activity relationship (QSAR) and absorption, distribution, metabolism, and excretion (ADME) tasks. Graph Convolutional Network (GCN) models have proven especially promising due to their inherent ability to model molecular structures using graph-based representations. However, maximizing the potential of such models in practice is challenging, as companies prioritize data privacy and security over collaboration initiatives to improve model performance and robustness. kMoL is an open-source machine learning library with integrated federated learning capabilities developed to address such challenges. Its key features include state-of-the-art model architectures, Bayesian optimization, explainability, and federated learning mechanisms. It demonstrates extensive customization possibilities, advanced security features, straightforward implementation of user-specific models, and high adaptability to custom datasets without additional programming requirements. 
kMoL is evaluated through locally trained benchmark settings and distributed federated learning experiments using various datasets to assess the features and flexibility of the library, as well as the ability to facilitate fast and practical experimentation. Additionally, results of these experiments provide further insights into the performance trade-offs associated with federated learning strategies, presenting valuable guidance for deploying machine learning models in a privacy-preserving manner within drug discovery pipelines. kMoL is available on GitHub at <jats:ext-link xmlns:xlink=\"http:\/\/www.w3.org\/1999\/xlink\" xlink:href=\"https:\/\/github.com\/elix-tech\/kmol\" ext-link-type=\"uri\">https:\/\/github.com\/elix-tech\/kmol<\/jats:ext-link>.<\/jats:p>\n          <jats:p>\n            <jats:bold>Scientific contribution<\/jats:bold> The primary scientific contribution of this research project is the introduction and evaluation of kMoL, an open-source machine learning library with integrated federated learning capabilities. By demonstrating advanced customization and security capabilities without additional programming requirements, kMoL represents an accessible yet secure open-source platform for collaborative drug discovery projects. 
Additionally, the experiment results provide further insights into the performance trade-offs associated with federated learning strategies, presenting valuable guidance for deploying machine learning models in a privacy-preserving manner within drug discovery pipelines.<\/jats:p>","DOI":"10.1186\/s13321-025-00967-9","type":"journal-article","created":{"date-parts":[[2025,2,25]],"date-time":"2025-02-25T09:39:00Z","timestamp":1740476340000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":9,"title":["kMoL: an open-source machine and federated learning library for drug discovery"],"prefix":"10.1186","volume":"17","author":[{"given":"Romeo","family":"Cozac","sequence":"first","affiliation":[]},{"given":"Haris","family":"Hasic","sequence":"additional","affiliation":[]},{"given":"Jun Jin","family":"Choong","sequence":"additional","affiliation":[]},{"given":"Vincent","family":"Richard","sequence":"additional","affiliation":[]},{"given":"Loic","family":"Beheshti","sequence":"additional","affiliation":[]},{"given":"Cyrille","family":"Froehlich","sequence":"additional","affiliation":[]},{"given":"Takuto","family":"Koyama","sequence":"additional","affiliation":[]},{"given":"Shigeyuki","family":"Matsumoto","sequence":"additional","affiliation":[]},{"given":"Ryosuke","family":"Kojima","sequence":"additional","affiliation":[]},{"given":"Hiroaki","family":"Iwata","sequence":"additional","affiliation":[]},{"given":"Aki","family":"Hasegawa","sequence":"additional","affiliation":[]},{"given":"Takao","family":"Otsuka","sequence":"additional","affiliation":[]},{"given":"Yasushi","family":"Okuno","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2025,2,25]]},"reference":[{"issue":"3","key":"967_CR1","doi-asserted-by":"publisher","first-page":"1947","DOI":"10.1007\/s10462-021-10058-4","volume":"55","author":"S Dara","year":"2022","unstructured":"Dara S, Dhamercherla S, Jadav 
SS, Babu CM, Ahsan MJ (2022) Machine learning in drug discovery: A review. Artif Intell Rev 55(3):1947\u20131999. https:\/\/doi.org\/10.1007\/s10462-021-10058-4","journal-title":"Artif Intell Rev"},{"issue":"2","key":"967_CR2","doi-asserted-by":"publisher","first-page":"263","DOI":"10.1021\/ci500747n","volume":"55","author":"J Ma","year":"2015","unstructured":"Ma J, Sheridan RP, Liaw A, Dahl GE, Svetnik V (2015) Deep neural nets as a method for quantitative structure-activity relationships. J Chem Inf Model 55(2):263\u2013274. https:\/\/doi.org\/10.1021\/ci500747n","journal-title":"J Chem Inf Model"},{"issue":"10","key":"967_CR3","doi-asserted-by":"publisher","first-page":"2490","DOI":"10.1021\/acs.jcim.7b00087","volume":"57","author":"Y Xu","year":"2017","unstructured":"Xu Y, Ma J, Liaw A, Sheridan RP, Svetnik V (2017) Demystifying multitask deep neural networks for quantitative structure-activity relationships. J Chem Inf Model 57(10):2490\u20132504. https:\/\/doi.org\/10.1021\/acs.jcim.7b00087","journal-title":"J Chem Inf Model"},{"key":"967_CR4","unstructured":"Huang K, Fu T, Gao W, Zhao Y, Roohani Y, Leskovec J, Coley CW, Xiao C, Sun J, Zitnik M (2021) Therapeutics data commons: Machine learning datasets and tasks for drug discovery and development. Proceedings of Neural Information Processing Systems, NeurIPS Datasets and Benchmarks"},{"key":"967_CR5","doi-asserted-by":"crossref","unstructured":"Wu Z, Ramsundar B, Feinberg EN, Gomes J, Geniesse C, Pappu AS, Leswing K, Pande V (2018) MoleculeNet: A Benchmark for Molecular Machine Learning","DOI":"10.1039\/C7SC02664A"},{"issue":"1","key":"967_CR6","doi-asserted-by":"publisher","first-page":"10","DOI":"10.1186\/s13635-024-00158-3","volume":"2024","author":"A Paracha","year":"2024","unstructured":"Paracha A, Arshad J, Farah MB, Ismail K (2024) Machine learning security and privacy: a review of threats and countermeasures. EURASIP J Inf Secur 2024(1):10. 
https:\/\/doi.org\/10.1186\/s13635-024-00158-3","journal-title":"EURASIP J Inf Secur"},{"issue":"7","key":"967_CR7","doi-asserted-by":"publisher","first-page":"2331","DOI":"10.1021\/acs.jcim.3c00799","volume":"64","author":"W Heyndrickx","year":"2024","unstructured":"Heyndrickx W, Mervin L, Morawietz T, Sturm N, Friedrich L, Zalewski A, Pentina A, Humbeck L, Oldenhof M, Niwayama R, Schmidtke P, Fechner N, Simm J, Arany A, Drizard N, Jabal R, Afanasyeva A, Loeb R, Verma S, Harnqvist S, Holmes M, Pejo B, Telenczuk M, Holway N, Dieckmann A, Rieke N, Zumsande F, Clevert D-A, Krug M, Luscombe C, Green D, Ertl P, Antal P, Marcus D, Do Huu N, Fuji H, Pickett S, Acs G, Boniface E, Beck B, Sun Y, Gohier A, Rippmann F, Engkvist O, G\u00f6ller AH, Moreau Y, Galtier MN, Schuffenhauer A, Ceulemans H (2024) Melloddy: Cross-pharma federated learning at unprecedented scale unlocks benefits in qsar without compromising proprietary information. J Chem Inf Model 64(7):2331\u20132344. https:\/\/doi.org\/10.1021\/acs.jcim.3c00799","journal-title":"J Chem Inf Model"},{"key":"967_CR8","unstructured":"NVIDIA: NVIDIA Clara. Accessed: 2024-08-26. https:\/\/docs.nvidia.com\/clara\/index.html"},{"key":"967_CR9","unstructured":"Beutel DJ, Topal T, Mathur A, Qiu X, Fernandez-Marques J, Gao Y, Sani L, Kwing HL, Parcollet T, Gusm\u00e3o PPd, Lane ND (2020) Flower: A friendly federated learning research framework. arXiv preprint arXiv:2007.14390"},{"key":"967_CR10","first-page":"1","volume":"22","author":"Y Liu","year":"2021","unstructured":"Liu Y, Fan T, Chen T, Xu Q, Yang Q (2021) Fate: an industrial grade platform for collaborative learning with data protection. 
J Mach Learn Res 22:1","journal-title":"J Mach Learn Res"},{"key":"967_CR11","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-70604-3_5","volume-title":"PySyft A Library for Easy Federated Learning","author":"A Ziller","year":"2021","unstructured":"Ziller A, Trask A, Lopardo A, Szymkow B, Wagner B, Bluemke E, Nounahon J-M, Passerat-Palmbach J, Prakash K, Rose N, Ryffel T, Reza ZN, Kaissis G (2021) PySyft A Library for Easy Federated Learning. Springer, Cham"},{"issue":"22\u201323","key":"967_CR12","first-page":"5492","volume":"36","author":"S Chen","year":"2020","unstructured":"Chen S, Xue D, Chuai G, Yang Q, Liu Q (2020) Fl-qsar: a federated learning-based qsar prototype for collaborative drug discovery. Bioinformatics 36(22\u201323):5492\u20135498","journal-title":"Bioinformatics"},{"key":"967_CR13","doi-asserted-by":"publisher","DOI":"10.1016\/j.jbi.2024.104712","volume":"157","author":"Y Guo","year":"2024","unstructured":"Guo Y, Gao Y, Song J (2024) Molcfl: A personalized and privacy-preserving drug discovery framework based on generative clustered federated learning. J Biomed Inf 157:104712. 
https:\/\/doi.org\/10.1016\/j.jbi.2024.104712","journal-title":"J Biomed Inf"},{"issue":"13","key":"967_CR14","doi-asserted-by":"publisher","first-page":"15576","DOI":"10.1609\/aaai.v37i13.26847","volume":"37","author":"M Oldenhof","year":"2024","unstructured":"...Oldenhof M, \u00c1cs G, Pej\u00f3 B, Schuffenhauer A, Holway N, Sturm N, Dieckmann A, Fortmeier O, Boniface E, Mayer C, Gohier A, Schmidtke P, Niwayama R, Kopecky D, Mervin L, Rathi PC, Friedrich L, Formanek A, Antal P, Rahaman J, Zalewski A, Heyndrickx W, Oluoch E, St\u00f6\u00dfel M, Van\u010do M, Endico D, Gelus F, Boisfoss\u00e9 T, Darbier A, Nicollet A, Blotti\u00e8re M, Telenczuk M, Nguyen VT, Martinez T, Boillet C, Moutet K, Picosson A, Gasser A, Djafar I, Simon A, Arany \u00c1, Simm J, Moreau Y, Engkvist O, Ceulemans H, Marini C, Galtier M (2024) Industry-scale orchestrated federated learning for drug discovery. Proc AAAI Conf Artif Intell 37(13):15576\u201315584. https:\/\/doi.org\/10.1609\/aaai.v37i13.26847","journal-title":"Proc AAAI Conf Artif Intell"},{"key":"967_CR15","unstructured":"Landrum G RDKit: Open-source cheminformatics. http:\/\/www.rdkit.org"},{"key":"967_CR16","unstructured":"Moriwaki H, Tian Y-S, Kawashita N, Takagi T (2018) Mordred: a molecular descriptor calculator. J Cheminform 10:4"},{"issue":"5","key":"967_CR17","doi-asserted-by":"publisher","first-page":"742","DOI":"10.1021\/ci100050t","volume":"50","author":"D Rogers","year":"2010","unstructured":"Rogers D, Hahn M (2010) Extended-connectivity fingerprints. J Chem Inf Model 50(5):742\u2013754. 
https:\/\/doi.org\/10.1021\/ci100050t","journal-title":"J Chem Inf Model"},{"issue":"4","key":"967_CR18","doi-asserted-by":"publisher","first-page":"747","DOI":"10.1021\/ci9803381","volume":"39","author":"D Butina","year":"1999","unstructured":"Butina D (1999) Unsupervised data base clustering based on daylight\u2019s fingerprint and tanimoto similarity: A fast and automated way to cluster small and large data sets. J Chem Inf Comput Sci 39(4):747\u2013750. https:\/\/doi.org\/10.1021\/ci9803381","journal-title":"J Chem Inf Comput Sci"},{"key":"967_CR19","unstructured":"Sundararajan M, Taly A, Yan Q (2017) Axiomatic Attribution for Deep Networks"},{"key":"967_CR20","unstructured":"Kokhlikyan N, Miglani V, Martin M, Wang E, Alsallakh B, Reynolds J, Melnikov A, Kliushkina N, Araya C, Yan S, Reblitz-Richardson O (2020) Captum: A unified and generic model interpretability library for PyTorch"},{"key":"967_CR21","unstructured":"Paszke A, Gross S, Massa F, Lerer A, Bradbury J, Chanan G, Killeen T, Lin Z, Gimelshein N, Antiga L, Desmaison A, Kopf A, Yang E, DeVito Z, Raison M, Tejani A, Chilamkurthy S, Steiner B, Fang L, Bai J, Chintala S (2019) Pytorch: An imperative style, high-performance deep learning library. In: Wallach, H., Larochelle, H., Beygelzimer, A., Alch\u00e9-Buc, F., Fox, E., Garnett, R. (eds.) Advances in Neural Information Processing Systems 32, pp. 8024\u20138035. Curran Associates, Inc., Red Hook, NY. http:\/\/papers.neurips.cc\/paper\/9015-pytorch-an-imperative-style-high-performance-deep-learning-library.pdf"},{"key":"967_CR22","unstructured":"Fey M, Lenssen JE (2019) Fast Graph Representation Learning with PyTorch Geometric"},{"key":"967_CR23","unstructured":"Kipf TN, Welling M (2017) Semi-supervised classification with graph convolutional networks. 
Arxiv"},{"key":"967_CR24","unstructured":"Gilmer J, Schoenholz SS, Riley PF, Vinyals O, Dahl GE (2017) Neural Message Passing for Quantum Chemistry"},{"key":"967_CR25","unstructured":"Defferrard M, Bresson X, Vandergheynst P (2017) Convolutional Neural Networks on Graphs with Fast Localized Spectral Filtering"},{"key":"967_CR26","unstructured":"Hamilton WL, Ying R, Leskovec J (2018) Inductive Representation Learning on Large Graphs"},{"key":"967_CR27","doi-asserted-by":"crossref","unstructured":"Morris C, Ritzert M, Fey M, Hamilton WL, Lenssen JE, Rattan G, Grohe M (2020) Weisfeiler and Leman Go Neural: Higher-order Graph Neural Networks","DOI":"10.1609\/aaai.v33i01.33014602"},{"key":"967_CR28","unstructured":"Xu K, Hu W, Leskovec J, Jegelka S (2019) How Powerful are Graph Neural Networks?"},{"key":"967_CR29","doi-asserted-by":"crossref","unstructured":"Bianchi FM, Grattarola D, Livi L, Alippi C (2021) Graph Neural Networks with convolutional ARMA filters","DOI":"10.1109\/TPAMI.2021.3054830"},{"key":"967_CR30","doi-asserted-by":"crossref","unstructured":"Ranjan E, Sanyal S, Talukdar PP (2020) ASAP: Adaptive Structure Aware Pooling for Learning Hierarchical Graph Representations","DOI":"10.1609\/aaai.v34i04.5997"},{"key":"967_CR31","unstructured":"Li G, Xiong C, Thabet A, Ghanem B (2020) DeeperGCN: All You Need to Train Deeper GCNs"},{"key":"967_CR32","doi-asserted-by":"publisher","unstructured":"Chiang W-L, Liu X, Si S, Li Y, Bengio S, Hsieh C-J (2019) Cluster-gcn: An efficient algorithm for training deep and large graph convolutional networks. 
Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. https:\/\/doi.org\/10.1145\/3292500.3330925","DOI":"10.1145\/3292500.3330925"},{"key":"967_CR33","doi-asserted-by":"crossref","unstructured":"Verma N, Boyer E, Verbeek J (2018) FeaStNet: Feature-Steered Graph Convolutions for 3D Shape Analysis","DOI":"10.1109\/CVPR.2018.00275"},{"key":"967_CR34","unstructured":"Veli\u010dkovi\u0107 P, Cucurull G, Casanova A, Romero A, Li\u00f2 P, Bengio Y (2018) Graph Attention Networks"},{"key":"967_CR35","unstructured":"Du J, Zhang S, Wu G, Moura JMF, Kar S (2018) Topology Adaptive Graph Convolutional Networks"},{"key":"967_CR36","unstructured":"Wu F, Zhang T, Souza AH Jr, Fifty C, Yu T, Weinberger KQ (2019) Simplifying Graph Convolutional Networks"},{"key":"967_CR37","doi-asserted-by":"publisher","DOI":"10.1093\/bib\/bbaa266\/34130210\/bbaa266.pdf","author":"P Li","year":"2020","unstructured":"Li P, Li Y, Hsieh C-Y, Zhang S, Liu X, Liu H, Song S, Yao X (2020) TrimNet: learning molecular representation from triplet messages for biomedicine. Brief Bioinform. 
https:\/\/doi.org\/10.1093\/bib\/bbaa266\/34130210\/bbaa266.pdf","journal-title":"Brief Bioinform"},{"key":"967_CR38","doi-asserted-by":"crossref","unstructured":"He K, Zhang X, Ren S, Sun J (2015) Deep Residual Learning for Image Recognition","DOI":"10.1109\/CVPR.2016.90"},{"key":"967_CR39","unstructured":"Ioffe S, Szegedy C (2015) Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift"},{"key":"967_CR40","unstructured":"Ulyanov D, Vedaldi A, Lempitsky V (2017) Instance Normalization: The Missing Ingredient for Fast Stylization"},{"key":"967_CR41","unstructured":"Cai T, Luo S, Xu K, He D, Liu T-Y, Wang L (2021) GraphNorm: A Principled Approach to Accelerating Graph Neural Network Training"},{"issue":"56","key":"967_CR42","first-page":"1929","volume":"15","author":"N Srivastava","year":"2014","unstructured":"Srivastava N, Hinton G, Krizhevsky A, Sutskever I, Salakhutdinov R (2014) Dropout: A simple way to prevent neural networks from overfitting. J Mach Learn Res 15(56):1929\u20131958","journal-title":"J Mach Learn Res"},{"key":"967_CR43","unstructured":"Agarap AF (2019) Deep Learning using Rectified Linear Units (ReLU)"},{"issue":"D1","key":"967_CR44","doi-asserted-by":"publisher","first-page":"945","DOI":"10.1093\/nar\/gkw1074","volume":"45","author":"A Gaulton","year":"2016","unstructured":"Gaulton A, Hersey A, Nowotka M, Bento AP, Chambers J, Mendez D, Mutowo P, Atkinson F, Bellis LJ, Cibri\u00e1n-Uhalte E, Davies M, Dedman N, Karlsson A, Magari\u00f1os MP, Overington JP, Papadatos G, Smit I, Leach AR (2016) The chembl database in 2017. Nucleic Acids Res 45(D1):945\u2013954","journal-title":"Nucleic Acids Res"},{"key":"967_CR45","unstructured":"Amini A, Schwarting W, Soleimany A, Rus D (2020) Deep Evidential Regression. https:\/\/arxiv.org\/abs\/1910.02600"},{"key":"967_CR46","unstructured":"Sensoy M, Kaplan L, Kandemir M (2018) Evidential Deep Learning to Quantify Classification Uncertainty. 
https:\/\/arxiv.org\/abs\/1806.01768"},{"key":"967_CR47","unstructured":"Zhuang J, Tang T, Ding Y, Tatikonda S, Dvornek N, Papademetris X, Duncan JS (2020) AdaBelief Optimizer: Adapting Stepsizes by the Belief in Observed Gradients"},{"key":"967_CR48","unstructured":"Bergstra J, Bardenet R, Bengio Y, K\u00e9gl B (2011) Algorithms for hyper-parameter optimization. In: Proceedings of the 24th International Conference on Neural Information Processing Systems. NIPS\u201911, pp. 2546\u20132554. Curran Associates Inc., Red Hook, NY, USA"},{"key":"967_CR49","unstructured":"Bergstra J, Yamins D, Cox D (2013) Making a science of model search: Hyperparameter optimization in hundreds of dimensions for vision architectures. In: Dasgupta, S., McAllester, D. (eds.) Proceedings of the 30th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 28, pp. 115\u2013123. PMLR, Atlanta, Georgia, USA. http:\/\/proceedings.mlr.press\/v28\/bergstra13.html"},{"key":"967_CR50","doi-asserted-by":"crossref","unstructured":"Akiba T, Sano S, Yanase T, Ohta T, Koyama M (2019) Optuna: A Next-generation Hyperparameter Optimization Framework","DOI":"10.1145\/3292500.3330701"},{"key":"967_CR51","unstructured":"Gal Y, Ghahramani Z (2016) Dropout as a bayesian approximation: Representing model uncertainty in deep learning. In: Balcan, M.F., Weinberger, K.Q. (eds.) Proceedings of The 33rd International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 48, pp. 1050\u20131059. PMLR, New York, New York, USA. https:\/\/proceedings.mlr.press\/v48\/gal16.html"},{"key":"967_CR52","unstructured":"Sensoy M, Kaplan L, Kandemir M (2018) Evidential Deep Learning to Quantify Classification Uncertainty. https:\/\/arxiv.org\/abs\/1806.01768"},{"key":"967_CR53","unstructured":"Amini A, Schwarting W, Soleimany A, Rus D (2020) Deep Evidential Regression. 
https:\/\/arxiv.org\/abs\/1910.02600"},{"key":"967_CR54","unstructured":"gRPC: gRPC - An RPC library and framework. Accessed: 2024-08-26. https:\/\/grpc.io"},{"key":"967_CR55","unstructured":"Box I Box: The Intelligent Content Cloud. https:\/\/www.box.com"},{"issue":"3\u20134","key":"967_CR56","doi-asserted-by":"publisher","first-page":"211","DOI":"10.1561\/0400000042","volume":"9","author":"C Dwork","year":"2014","unstructured":"Dwork C, Roth A (2014) The algorithmic foundations of differential privacy. Found Trends Theor Comput Sci 9(3\u20134):211\u2013407. https:\/\/doi.org\/10.1561\/0400000042","journal-title":"Found Trends Theor Comput Sci"},{"key":"967_CR57","doi-asserted-by":"crossref","unstructured":"Dwork C, Roth A (2014) The Algorithmic Foundations of Differential Privacy. now Publishers Inc, Boston","DOI":"10.1561\/9781601988195"},{"key":"967_CR58","unstructured":"Opacus PyTorch library. Available from https:\/\/opacus.ai"},{"key":"967_CR59","unstructured":"Rigaki M, Garcia S (2020) A Survey of Privacy Attacks in Machine Learning"},{"key":"967_CR60","doi-asserted-by":"publisher","DOI":"10.3389\/fenvs.2015.00085","author":"R Huang","year":"2016","unstructured":"Huang R, Xia M, Nguyen D-T, Zhao T, Sakamuru S, Zhao J, Shahane S, Rossoshek A, Simeonov A (2016) Tox21challenge to build predictive models of nuclear receptor and stress response pathways as mediated by exposure to environmental chemicals and drugs. Front Environ Sci. https:\/\/doi.org\/10.3389\/fenvs.2015.00085","journal-title":"Front Environ Sci"},{"issue":"9","key":"967_CR61","doi-asserted-by":"publisher","first-page":"2077","DOI":"10.1021\/ci900161g","volume":"49","author":"K Hansen","year":"2009","unstructured":"Hansen K, Mika S, Schroeter T, Sutter A, Laak A, Steger-Hartmann T, Heinrich N, M\u00fcller K-R (2009) Benchmark data set for in silico prediction of ames mutagenicity. J Chem Inf Model 49(9):2077\u20132081. 
https:\/\/doi.org\/10.1021\/ci900161g","journal-title":"J Chem Inf Model"},{"key":"967_CR62","unstructured":"Ames Conclusions Data Collection. Accessed: 2020-12-02 (2020). ftp:\/\/anonftp.niehs.nih.gov\/ntp-cebs\/datatype\/NTP_Data_Collections\/Ames_Conclusions_DataCollection_2020-02-19.xlsx"},{"key":"967_CR63","unstructured":"PubChem Bioassay Record for AID 1259408, GENE-TOX mutagenicity studies. Accessed: 2024-9-25. https:\/\/pubchem.ncbi.nlm.nih.gov\/bioassay\/1259408"},{"issue":"14","key":"967_CR64","doi-asserted-by":"publisher","first-page":"9697","DOI":"10.1021\/acs.jmedchem.3c00481","volume":"66","author":"H Kawashima","year":"2023","unstructured":"Kawashima H, Watanabe R, Esaki T, Kuroda M, Nagao C, Natsume-Kitatani Y, Ohashi R, Komura H, Mizuguchi K (2023) Drumap: A novel drug metabolism and pharmacokinetics analysis platform. J Med Chem 66(14):9697\u20139709. https:\/\/doi.org\/10.1021\/acs.jmedchem.3c00481","journal-title":"J Med Chem"},{"issue":"2","key":"967_CR65","doi-asserted-by":"publisher","first-page":"127","DOI":"10.1038\/s42256-021-00438-4","volume":"4","author":"X Fang","year":"2022","unstructured":"Fang X, Liu L, Lei J, He D, Zhang S, Zhou J, Wang F, Wu H, Wang H (2022) Geometry-enhanced molecular representation learning for property prediction. Nat Mach Intell 4(2):127\u2013134. https:\/\/doi.org\/10.1038\/s42256-021-00438-4","journal-title":"Nat Mach Intell"},{"key":"967_CR66","doi-asserted-by":"crossref","unstructured":"Zhou G, Gao Z, Ding Q, Zheng H, Xu H, Wei Z, Zhang L, Ke G (2023) Uni-mol: A universal 3d molecular representation learning framework. In: The Eleventh International Conference on Learning Representations. 
https:\/\/openreview.net\/forum?id=6K2RM6wVqKu","DOI":"10.26434\/chemrxiv-2022-jjm0j-v4"},{"key":"967_CR67","doi-asserted-by":"publisher","DOI":"10.1016\/j.egyai.2022.100201","volume":"10","author":"X Han","year":"2022","unstructured":"Han X, Jia M, Chang Y, Li Y, Wu S (2022) Directed message passing neural network (d-mpnn) with graph edge attention (gea) for property prediction of biofuel-relevant species. Energy AI 10:100201. https:\/\/doi.org\/10.1016\/j.egyai.2022.100201","journal-title":"Energy AI"},{"issue":"16","key":"967_CR68","doi-asserted-by":"publisher","first-page":"8749","DOI":"10.1021\/acs.jmedchem.9b00959","volume":"63","author":"Z Xiong","year":"2020","unstructured":"Xiong Z, Wang D, Liu X, Zhong F, Wan X, Li X, Li Z, Luo X, Chen K, Jiang H, Zheng M (2020) Pushing the boundaries of molecular representation for drug discovery with the graph attention mechanism. J Med Chem 63(16):8749\u20138760. https:\/\/doi.org\/10.1021\/acs.jmedchem.9b00959","journal-title":"J Med Chem"},{"key":"967_CR69","unstructured":"Liu S, Demirel MF, Liang Y (2019) N-Gram Graph: Simple Unsupervised Representation for Graphs, with Applications to Molecules. https:\/\/arxiv.org\/abs\/1806.09206"},{"key":"967_CR70","unstructured":"Hu W, Liu B, Gomes J, Zitnik M, Liang P, Pande V, Leskovec J (2020) Strategies for Pre-training Graph Neural Networks. https:\/\/arxiv.org\/abs\/1905.12265"},{"key":"967_CR71","unstructured":"Rong Y, Bian Y, Xu T, Xie W, Wei Y, Huang W, Huang J (2020) Self-Supervised Graph Transformer on Large-Scale Molecular Data. https:\/\/arxiv.org\/abs\/2007.02835"},{"key":"967_CR72","unstructured":"Liu S, Wang H, Liu W, Lasenby J, Guo H, Tang J (2022) Pre-training Molecular Graph Representation with 3D Geometry. 
https:\/\/arxiv.org\/abs\/2110.07728"},{"issue":"3","key":"967_CR73","doi-asserted-by":"publisher","first-page":"279","DOI":"10.1038\/s42256-022-00447-x","volume":"4","author":"Y Wang","year":"2022","unstructured":"Wang Y, Wang J, Cao Z, Barati Farimani A (2022) Molecular contrastive learning of representations via graph neural networks. Nat Mach Intell 4(3):279\u2013287. https:\/\/doi.org\/10.1038\/s42256-022-00447-x","journal-title":"Nat Mach Intell"},{"key":"967_CR74","doi-asserted-by":"crossref","unstructured":"Xia J, Zhao C, Hu B, Gao Z, Tan C, Liu Y, Li S, Li SZ (2023) Mole-BERT: Rethinking pre-training graph neural networks for molecules. In: The Eleventh International Conference on Learning Representations. https:\/\/openreview.net\/forum?id=jevY-DtiZTR","DOI":"10.26434\/chemrxiv-2023-dngg4"},{"key":"967_CR75","unstructured":"Izmailov P, Podoprikhin D, Garipov T, Vetrov D, Wilson AG (2019) Averaging Weights Leads to Wider Optima and Better Generalization"},{"issue":"3","key":"967_CR76","doi-asserted-by":"publisher","first-page":"529","DOI":"10.1007\/s11427-021-1946-0","volume":"65","author":"Z Xiong","year":"2022","unstructured":"Xiong Z, Cheng Z, Lin X, Xu C, Liu X, Wang D, Luo X, Zhang Y, Jiang H, Qiao N, Zheng M (2022) Facing small and biased data dilemma in drug discovery with enhanced federated learning approaches. Sci China Life Sci 65(3):529\u2013539. 
https:\/\/doi.org\/10.1007\/s11427-021-1946-0","journal-title":"Sci China Life Sci"}],"container-title":["Journal of Cheminformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s13321-025-00967-9.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1186\/s13321-025-00967-9\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s13321-025-00967-9.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,2,25]],"date-time":"2025-02-25T09:39:21Z","timestamp":1740476361000},"score":1,"resource":{"primary":{"URL":"https:\/\/jcheminf.biomedcentral.com\/articles\/10.1186\/s13321-025-00967-9"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,2,25]]},"references-count":76,"journal-issue":{"issue":"1","published-online":{"date-parts":[[2025,12]]}},"alternative-id":["967"],"URL":"https:\/\/doi.org\/10.1186\/s13321-025-00967-9","relation":{},"ISSN":["1758-2946"],"issn-type":[{"value":"1758-2946","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,2,25]]},"assertion":[{"value":"6 November 2024","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"2 February 2025","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"25 February 2025","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"The authors declare no Conflict of interest.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Competing 
interests"}}],"article-number":"22"}}