{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,5]],"date-time":"2026-03-05T23:28:39Z","timestamp":1772753319789,"version":"3.50.1"},"reference-count":52,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2025,8,6]],"date-time":"2025-08-06T00:00:00Z","timestamp":1754438400000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2025,8,6]],"date-time":"2025-08-06T00:00:00Z","timestamp":1754438400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/501100007601","name":"Horizon 2020","doi-asserted-by":"publisher","award":["Marie Sk\u0142odowska-Curie grant agreement No 956832"],"award-info":[{"award-number":["Marie Sk\u0142odowska-Curie grant agreement No 956832"]}],"id":[{"id":"10.13039\/501100007601","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100000266","name":"Engineering and Physical Sciences Research 
Council","doi-asserted-by":"publisher","award":["EP-W002973-1"],"award-info":[{"award-number":["EP-W002973-1"]}],"id":[{"id":"10.13039\/501100000266","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["J Cheminform"],"abstract":"<jats:title>Abstract<\/jats:title>\n          <jats:p>Drug-induced liver injury (DILI) presents a significant challenge due to its complexity, small datasets, and severe class imbalance. While unsupervised pretraining is a common approach to learn molecular representations for downstream tasks, it often lacks insights into how molecules interact with biological systems. We therefore introduce VitroBERT, a bidirectional encoder representations from transformers (BERT) model pretrained on large-scale in vitro assay profiles to generate biologically informed molecular embeddings. When leveraged to predict in vivo DILI endpoints, these embeddings delivered up to a 29% improvement in biochemistry-related tasks and a 16% gain in histopathology endpoints compared to unsupervised pretraining (MolBERT). However, no significant improvement was observed in clinical tasks. Furthermore, to address the critical issue of class imbalance, we evaluated multiple loss functions (BCE, weighted BCE, Focal loss, and weighted Focal loss) and identified weighted Focal loss as the most effective. Our findings demonstrate the potential of integrating biological context into molecular models and highlight the importance of selecting appropriate loss functions in improving model performance of highly imbalanced DILI-related tasks.
<\/jats:p>","DOI":"10.1186\/s13321-025-01048-7","type":"journal-article","created":{"date-parts":[[2025,8,6]],"date-time":"2025-08-06T07:13:57Z","timestamp":1754464437000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":2,"title":["VitroBert: modeling DILI by pretraining BERT on in vitro data"],"prefix":"10.1186","volume":"17","author":[{"given":"Muhammad Arslan","family":"Masood","sequence":"first","affiliation":[]},{"given":"Anamya Ajjolli","family":"Nagaraja","sequence":"additional","affiliation":[]},{"given":"Katia","family":"Belaid","sequence":"additional","affiliation":[]},{"given":"Natalie","family":"Mesens","sequence":"additional","affiliation":[]},{"given":"Hugo","family":"Ceulemans","sequence":"additional","affiliation":[]},{"given":"Samuel","family":"Kaski","sequence":"additional","affiliation":[]},{"given":"Dorota","family":"Herman","sequence":"additional","affiliation":[]},{"given":"Markus","family":"Heinonen","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2025,8,6]]},"reference":[{"key":"1048_CR1","doi-asserted-by":"publisher","first-page":"1239","DOI":"10.1111\/j.1476-5381.2010.01127.x","volume":"162","author":"J Hughes","year":"2011","unstructured":"Hughes J, Rees S, Kalindjian S, Philpott K (2011) Principles of early drug discovery. Br J Pharmacol 162:1239\u20131249","journal-title":"Br J Pharmacol"},{"key":"1048_CR2","volume-title":"Principles of clinical pharmacology","year":"2007","unstructured":"Atkinson AJ (ed) (2007) Principles of clinical pharmacology, 2nd edn. Academic Press, Amsterdam; Boston","edition":"2"},{"key":"1048_CR3","doi-asserted-by":"crossref","unstructured":"Markey SP (2007) Principles of clinical pharmacology. 
Elsevier, pp 163\u2013178","DOI":"10.1016\/B978-012369417-1\/50052-3"},{"key":"1048_CR4","doi-asserted-by":"publisher","first-page":"615","DOI":"10.1038\/282615a0","volume":"282","author":"DP Aden","year":"1979","unstructured":"Aden DP, Fogel A, Plotkin S, Damjanov I, Knowles BB (1979) Controlled synthesis of HBsAg in a differentiated human liver carcinoma-derived cell line. Nature 282:615\u2013616","journal-title":"Nature"},{"key":"1048_CR5","doi-asserted-by":"publisher","first-page":"261","DOI":"10.1007\/978-1-60761-688-7_13","volume":"640","author":"M-J Marion","year":"2010","unstructured":"Marion M-J, Hantz O, Durantel D (2010) The HepaRG cell line: biological properties and relevance as a tool for cell biology, drug metabolism, and virology studies. Methods Mol Biol (Clifton, N.J.) 640:261\u2013272","journal-title":"Methods Mol Biol (Clifton, N.J.)"},{"key":"1048_CR6","unstructured":"Chung TDY, Terry DB, Smith LH (2024) In: Markossian S (Eds) Assay Guidance Manual. Eli Lilly & Company and the National Center for Advancing Translational Sciences: Bethesda (MD)"},{"key":"1048_CR7","doi-asserted-by":"publisher","first-page":"10512","DOI":"10.3748\/wjg.v22.i48.10512","volume":"22","author":"C-Y Zhang","year":"2016","unstructured":"Zhang C-Y, Yuan W-G, He P, Lei J-H, Wang C-X (2016) Liver fibrosis and hepatic stellate cells: etiology, pathological hallmarks and therapeutic targets. World J Gastroenterol 22:10512\u201310522","journal-title":"World J Gastroenterol"},{"key":"1048_CR8","doi-asserted-by":"publisher","DOI":"10.1016\/j.dmpk.2023.100511","volume":"52","author":"A Takemura","year":"2023","unstructured":"Takemura A, Ishii S, Ikeyama Y, Esashika K, Takahashi J, Ito K (2023) New in vitro screening system to detect drug-induced liver injury using a culture plate with low drug sorption and high oxygen permeability. 
Drug Metab Pharmacokinet 52:100511","journal-title":"Drug Metab Pharmacokinet"},{"key":"1048_CR9","doi-asserted-by":"publisher","first-page":"2903","DOI":"10.1007\/s00204-023-03572-7","volume":"97","author":"S Ramirez-Hincapie","year":"2023","unstructured":"Ramirez-Hincapie S, Birk B, Ternes P, Giri V, Zickgraf FM, Haake V, Herold M, Kamp H, Driemert P, Landsiedel R, Richling E, Funk-Weyer D, van Ravenzwaay B (2023) Application of high throughput in vitro metabolomics for hepatotoxicity mode of action characterization and mechanistic-anchored point of departure derivation: a case study with nitrofurantoin. Arch Toxicol 97:2903\u20132917","journal-title":"Arch Toxicol"},{"key":"1048_CR10","doi-asserted-by":"publisher","first-page":"114","DOI":"10.1016\/j.cotox.2020.09.001","volume":"23\u201324","author":"WB Mattes","year":"2020","unstructured":"Mattes WB (2020) In vitro to in vivo translation. Curr Opin Toxicol 23\u201324:114\u2013118","journal-title":"Curr Opin Toxicol"},{"key":"1048_CR11","doi-asserted-by":"publisher","first-page":"18","DOI":"10.1186\/s41231-019-0050-7","volume":"4","author":"AA Seyhan","year":"2019","unstructured":"Seyhan AA (2019) Lost in translation: the valley of death across preclinical and clinical divide identification of problems and overcoming obstacles. Transl Med Commun 4:18","journal-title":"Transl Med Commun"},{"key":"1048_CR12","doi-asserted-by":"publisher","first-page":"833","DOI":"10.1016\/j.ejpb.2013.04.015","volume":"85","author":"HK Batchelor","year":"2013","unstructured":"Batchelor HK, Kendall R, Desset-Brethes S, Alex R, Ernest TB (2013) Application of in vitro biopharmaceutical methods in development of immediate release oral dosage forms intended for paediatric patients. 
Eur J Pharm Biopharm 85:833\u2013842","journal-title":"Eur J Pharm Biopharm"},{"key":"1048_CR13","doi-asserted-by":"publisher","first-page":"82","DOI":"10.1016\/j.pharmthera.2008.01.006","volume":"118","author":"JC Lipscomb","year":"2008","unstructured":"Lipscomb JC, Poet TS (2008) In vitro measurements of metabolism for application in pharmacokinetic modeling. Pharmacol Therapeutics 118:82\u2013103","journal-title":"Pharmacol Therapeutics"},{"key":"1048_CR14","doi-asserted-by":"publisher","first-page":"3473","DOI":"10.3390\/ijerph20043473","volume":"20","author":"D Deepika","year":"2023","unstructured":"Deepika D, Kumar V (2023) The role of \u201cphysiologically based pharmacokinetic model (PBPK)\u201d new approach methodology (NAM) in pharmaceuticals and environmental chemical risk assessment. Int J Environ Res Public Health 20:3473","journal-title":"Int J Environ Res Public Health"},{"key":"1048_CR15","doi-asserted-by":"publisher","first-page":"2299","DOI":"10.1021\/acs.molpharmaceut.9b01294","volume":"17","author":"Y Kosugi","year":"2020","unstructured":"Kosugi Y, Hosea N (2020) Direct comparison of total clearance prediction: computational machine learning model versus bottom-up approach using in vitro assay. Mol Pharm 17:2299\u20132309","journal-title":"Mol Pharm"},{"key":"1048_CR16","doi-asserted-by":"publisher","first-page":"1071","DOI":"10.1021\/acs.molpharmaceut.0c01009","volume":"18","author":"Y Kosugi","year":"2021","unstructured":"Kosugi Y, Hosea N (2021) Prediction of oral pharmacokinetics using a combination of in silico descriptors and in vitro ADME properties. 
Mol Pharm 18:1071\u20131079","journal-title":"Mol Pharm"},{"key":"1048_CR17","doi-asserted-by":"publisher","first-page":"5616","DOI":"10.1021\/acs.molpharmaceut.3c00502","volume":"20","author":"CE Keefer","year":"2023","unstructured":"Keefer CE, Chang G, Di L, Woody NA, Tess DA, Osgood SM, Kapinos B, Racich J, Carlo AA, Balesano A, Ferguson N, Orozco C, Zueva L, Luo L (2023) The comparison of machine learning and mechanistic in vitro in vivo extrapolation models for the prediction of human intrinsic clearance. Mol Pharm 20:5616\u20135630","journal-title":"Mol Pharm"},{"key":"1048_CR18","doi-asserted-by":"publisher","first-page":"42","DOI":"10.3389\/fphar.2019.00042","volume":"10","author":"H Wang","year":"2019","unstructured":"Wang H, Liu R, Schyman P, Wallqvist A (2019) Deep neural network models for predicting chemically induced liver toxicity endpoints from transcriptomic responses. Front Pharmacol 10:42","journal-title":"Front Pharmacol"},{"key":"1048_CR19","doi-asserted-by":"publisher","first-page":"2302","DOI":"10.1124\/dmd.110.035113","volume":"38","author":"S Ekins","year":"2010","unstructured":"Ekins S, Williams AJ, Xu JJ (2010) A predictive ligand-based Bayesian model for human drug-induced liver injury. Drug Metab Dispos 38:2302\u20132308","journal-title":"Drug Metab Dispos"},{"key":"1048_CR20","doi-asserted-by":"publisher","first-page":"388","DOI":"10.1002\/hep.26208","volume":"58","author":"M Chen","year":"2013","unstructured":"Chen M, Borlak J, Tong W (2013) High lipophilicity and high daily dose of oral medications are associated with significant risk for drug-induced liver injury. Hepatology 58:388\u2013396","journal-title":"Hepatology"},{"key":"1048_CR21","doi-asserted-by":"publisher","first-page":"239","DOI":"10.1021\/acs.chemrestox.9b00264","volume":"33","author":"DP Williams","year":"2020","unstructured":"Williams DP, Lazic SE, Foster AJ, Semenova E, Morgan P (2020) Predicting drug-induced liver injury with Bayesian machine learning. 
Chem Res Toxicol 33:239\u2013248","journal-title":"Chem Res Toxicol"},{"key":"1048_CR22","doi-asserted-by":"publisher","first-page":"1290","DOI":"10.1021\/acs.chemrestox.4c00015","volume":"37","author":"S Seal","year":"2024","unstructured":"Seal S, Williams D, Hosseini-Gerami L, Mahale M, Carpenter AE, Spjuth O, Bender A (2024) Improved detection of drug-induced liver injury by integrating predicted in vivo and in vitro data. Chem Res Toxicol 37:1290\u20131305","journal-title":"Chem Res Toxicol"},{"key":"1048_CR23","doi-asserted-by":"publisher","first-page":"1238","DOI":"10.1021\/acs.chemrestox.2c00378","volume":"36","author":"M Moein","year":"2023","unstructured":"Moein M, Heinonen M, Mesens N, Chamanza R, Amuzie C, Will Y, Ceulemans H, Kaski S, Herman D (2023) Chemistry-based modeling on phenotype-based drug-induced liver injury annotation: from public to proprietary data. Chem Res Toxicol 36:1238\u20131247","journal-title":"Chem Res Toxicol"},{"key":"1048_CR24","doi-asserted-by":"publisher","first-page":"5052","DOI":"10.1039\/D4SC90043J","volume":"15","author":"Y Harnik","year":"2024","unstructured":"Harnik Y, Milo A (2024) A focus on molecular representation learning for the prediction of chemical properties. Chem Sci 15:5052\u20135055","journal-title":"Chem Sci"},{"key":"1048_CR25","doi-asserted-by":"publisher","DOI":"10.1016\/j.drudis.2022.103373","volume":"27","author":"Z Li","year":"2022","unstructured":"Li Z, Jiang M, Wang S, Zhang S (2022) Deep learning methods for molecular representation and property prediction. Drug Discov Today 27:103373","journal-title":"Drug Discov Today"},{"key":"1048_CR26","doi-asserted-by":"publisher","first-page":"1692","DOI":"10.1039\/C8SC04175J","volume":"10","author":"R Winter","year":"2019","unstructured":"Winter R, Montanari F, No\u00e9 F, Clevert D-A (2019) Learning continuous and data-driven molecular descriptors by translating equivalent chemical representations. 
Chem Sci 10:1692\u20131701","journal-title":"Chem Sci"},{"key":"1048_CR27","doi-asserted-by":"crossref","unstructured":"Yang K, Swanson K, Jin W, Coley C, Eiden P, Gao H, Guzman-Perez A, Hopper T, Kelley B, Mathea M, Palmer A, Settels V, Jaakkola T, Jensen K, Barzilay R (2019) Analyzing learned molecular representations for property prediction. arXiv:1904.01561 [cs, stat]","DOI":"10.26434\/chemrxiv.7940594.v3"},{"key":"1048_CR28","doi-asserted-by":"crossref","unstructured":"Wang S, Guo Y, Wang Y, Sun H, Huang J (2019) SMILES-BERT: large scale unsupervised pre-training for molecular property prediction. In: Proceedings of the 10th ACM international conference on bioinformatics, computational biology and health informatics. Niagara Falls NY USA, pp 429\u2013436","DOI":"10.1145\/3307339.3342186"},{"key":"1048_CR29","first-page":"27","volume":"12","author":"X Li","year":"2020","unstructured":"Li X, Fourches D (2020) Inductive transfer learning for molecular activity prediction: next-gen QSAR models with MolPMoFiT. J Chem 12:27","journal-title":"J Chem"},{"key":"1048_CR30","doi-asserted-by":"publisher","DOI":"10.1016\/j.jmgm.2022.108344","volume":"118","author":"Y Liu","year":"2023","unstructured":"Liu Y, Zhang R, Li T, Jiang J, Ma J, Wang P (2023) MolRoPE-BERT: an enhanced molecular representation with rotary position embedding for molecular property prediction. J Mol Graph Model 118:108344","journal-title":"J Mol Graph Model"},{"key":"1048_CR31","unstructured":"Chithrananda S, Grand G, Ramsundar B (2020) Chemberta: large-scale self-supervised pretraining for molecular property prediction. arXiv:2010.09885"},{"key":"1048_CR32","first-page":"1","volume":"2021","author":"J Li","year":"2021","unstructured":"Li J, Jiang X (2021) Mol-BERT: an effective molecular representation with BERT for molecular property prediction. 
Wirel Commun Mob Comput 2021:1\u20137","journal-title":"Wirel Commun Mob Comput"},{"key":"1048_CR33","unstructured":"Ahmad W, Simon E, Chithrananda S, Grand G, Ramsundar B (2022) ChemBERTa-2: towards chemical foundation models. arXiv:2209.01712 [cs, q-bio]"},{"key":"1048_CR34","doi-asserted-by":"publisher","DOI":"10.1088\/2632-2153\/ac3ffb","volume":"3","author":"R Irwin","year":"2022","unstructured":"Irwin R, Dimitriadis S, He J, Bjerrum EJ (2022) Chemformer: a pre-trained transformer for computational chemistry. Mach Learn Sci Technol 3:015022","journal-title":"Mach Learn Sci Technol"},{"key":"1048_CR35","unstructured":"Devlin J, Chang M-W, Lee K, Toutanova K (2019) BERT: pre-training of deep bidirectional transformers for language understanding, arXiv:1810.04805 [cs]"},{"key":"1048_CR36","doi-asserted-by":"publisher","first-page":"31","DOI":"10.1021\/ci00057a005","volume":"28","author":"D Weininger","year":"1988","unstructured":"Weininger D (1988) SMILES, a chemical language and information system. 1. Introduction to methodology and encoding rules. J Chem Inf Comput Sci 28:31\u201336","journal-title":"J Chem Inf Comput Sci"},{"key":"1048_CR37","unstructured":"Fabian B, Edlich T, Gaspar H, Segler M, Meyers J, Fiscato M, Ahmed M (2020) Molecular representation learning with language models and domain-relevant auxiliary tasks. arXiv:2011.13230 [cs]"},{"key":"1048_CR38","doi-asserted-by":"publisher","first-page":"56","DOI":"10.1186\/s13321-023-00723-x","volume":"15","author":"S Seal","year":"2023","unstructured":"Seal S, Yang H, Trapotsi M-A, Singh S, Carreras-Puigvert J, Spjuth O, Bender A (2023) Merging bioactivity predictions from cell morphology and chemical fingerprint models using similarity to training data. 
J Chemin 15:56","journal-title":"J Chemin"},{"key":"1048_CR39","doi-asserted-by":"publisher","first-page":"2261","DOI":"10.1021\/acs.chemrestox.9b00459","volume":"33","author":"S Chavan","year":"2020","unstructured":"Chavan S, Scherbak N, Engwall M, Repsilber D (2020) Predicting chemical-induced liver toxicity using high-content imaging phenotypes and chemical descriptors: a random forest approach. Chem Res Toxicol 33:2261\u20132275","journal-title":"Chem Res Toxicol"},{"key":"1048_CR40","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1038\/s42003-022-03763-5","volume":"5","author":"S Seal","year":"2022","unstructured":"Seal S, Carreras-Puigvert J, Trapotsi M-A, Yang H, Spjuth O, Bender A (2022) Integrating cell morphology with gene expression and chemical structure to aid mitochondrial toxicity detection. Commun Biol 5:1\u201315","journal-title":"Commun Biol"},{"key":"1048_CR41","doi-asserted-by":"publisher","first-page":"5441","DOI":"10.1039\/C8SC00148K","volume":"9","author":"A Mayr","year":"2018","unstructured":"Mayr A, Klambauer G, Unterthiner T, Steijaert M, Wegner JK, Ceulemans H, Clevert D-A, Hochreiter S (2018) Large-scale comparison of machine learning methods for drug target prediction on ChEMBL. Chem Sci 9:5441\u20135451","journal-title":"Chem Sci"},{"key":"1048_CR42","doi-asserted-by":"publisher","first-page":"D921","DOI":"10.1093\/nar\/gku955","volume":"43","author":"Y Igarashi","year":"2015","unstructured":"Igarashi Y, Nakatsu N, Yamashita T, Ono A, Ohno Y, Urushidani T, Yamada H (2015) Open TG-GATEs: a large-scale toxicogenomics database. Nucleic Acids Res 43:D921\u2013D927","journal-title":"Nucleic Acids Res"},{"key":"1048_CR43","doi-asserted-by":"publisher","first-page":"D1075","DOI":"10.1093\/nar\/gkv1075","volume":"44","author":"M Kuhn","year":"2016","unstructured":"Kuhn M, Letunic I, Jensen LJ, Bork P (2016) The SIDER database of drugs and side effects. 
Nucleic Acids Res 44:D1075\u2013D1079","journal-title":"Nucleic Acids Res"},{"key":"1048_CR44","doi-asserted-by":"publisher","first-page":"742","DOI":"10.1021\/ci100050t","volume":"50","author":"D Rogers","year":"2010","unstructured":"Rogers D, Hahn M (2010) Extended-connectivity fingerprints. J Chem Inf Model 50:742\u2013754","journal-title":"J Chem Inf Model"},{"key":"1048_CR45","doi-asserted-by":"crossref","unstructured":"Lin T-Y, Goyal P, Girshick R, He K, Doll\u00e1r P (2018) Focal loss for dense object detection, arXiv:1708.02002 [cs]","DOI":"10.1109\/ICCV.2017.324"},{"key":"1048_CR46","doi-asserted-by":"publisher","first-page":"1096","DOI":"10.1021\/acs.jcim.8b00839","volume":"59","author":"N Brown","year":"2019","unstructured":"Brown N, Fiscato M, Segler MH, Vaucher AC (2019) GuacaMol: benchmarking models for de novo molecular design. J Chem Inf Model 59:1096\u20131108","journal-title":"J Chem Inf Model"},{"key":"1048_CR47","doi-asserted-by":"publisher","first-page":"7141","DOI":"10.1038\/s41467-023-42933-9","volume":"14","author":"X Chen","year":"2023","unstructured":"Chen X, Roberts R, Liu Z, Tong W (2023) A generative adversarial network model alternative to animal studies for clinical pathology assessment. Nature Commun 14:7141","journal-title":"Nature Commun"},{"key":"1048_CR48","doi-asserted-by":"crossref","unstructured":"Davis J, Goadrich M (2006) The relationship between precision-recall and ROC curves. In: Proceedings of the 23rd international conference on machine learning. New York, NY, USA, pp 233\u2013240","DOI":"10.1145\/1143844.1143874"},{"key":"1048_CR49","doi-asserted-by":"publisher","first-page":"49","DOI":"10.1145\/1882471.1882479","volume":"12","author":"G Forman","year":"2010","unstructured":"Forman G, Scholz M (2010) Apples-to-apples in cross-validation studies: pitfalls in classifier performance measurement. 
ACM SIGKDD Expl Newsl 12:49\u201357","journal-title":"ACM SIGKDD Expl Newsl"},{"key":"1048_CR50","doi-asserted-by":"publisher","first-page":"1551","DOI":"10.1021\/acs.chemrestox.0c00131","volume":"33","author":"M Walles","year":"2020","unstructured":"Walles M, Brown AP, Zimmerlin A, End P (2020) New perspectives on drug-induced liver injury risk assessment of acyl glucuronides. Chem Res Toxicol 33:1551\u20131560","journal-title":"Chem Res Toxicol"},{"key":"1048_CR51","doi-asserted-by":"publisher","first-page":"283","DOI":"10.1021\/acscentsci.6b00367","volume":"3","author":"H Altae-Tran","year":"2017","unstructured":"Altae-Tran H, Ramsundar B, Pappu AS, Pande V (2017) Low data drug discovery with one-shot learning. ACS Central Sci 3:283\u2013293","journal-title":"ACS Central Sci"},{"key":"1048_CR52","doi-asserted-by":"publisher","first-page":"1256","DOI":"10.1038\/s42256-022-00580-7","volume":"4","author":"J Ross","year":"2022","unstructured":"Ross J, Belgodere B, Chenthamarakshan V, Padhi I, Mroueh Y, Das P (2022) Large-scale chemical language representations capture molecular structure and properties. 
Nat Mach Intell 4:1256\u20131264","journal-title":"Nat Mach Intell"}],"container-title":["Journal of Cheminformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s13321-025-01048-7.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1186\/s13321-025-01048-7\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s13321-025-01048-7.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,9,8]],"date-time":"2025-09-08T15:17:33Z","timestamp":1757344653000},"score":1,"resource":{"primary":{"URL":"https:\/\/jcheminf.biomedcentral.com\/articles\/10.1186\/s13321-025-01048-7"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,8,6]]},"references-count":52,"journal-issue":{"issue":"1","published-online":{"date-parts":[[2025,12]]}},"alternative-id":["1048"],"URL":"https:\/\/doi.org\/10.1186\/s13321-025-01048-7","relation":{},"ISSN":["1758-2946"],"issn-type":[{"value":"1758-2946","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,8,6]]},"assertion":[{"value":"1 October 2024","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"25 June 2025","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"6 August 2025","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}}],"article-number":"119"}}