{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,26]],"date-time":"2026-02-26T20:34:35Z","timestamp":1772138075583,"version":"3.50.1"},"reference-count":32,"publisher":"Oxford University Press (OUP)","issue":"8","license":[{"start":{"date-parts":[[2025,8,7]],"date-time":"2025-08-07T00:00:00Z","timestamp":1754524800000},"content-version":"vor","delay-in-days":6,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2025,8,2]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:sec>\n                    <jats:title>Motivation<\/jats:title>\n                    <jats:p>Ligands are biomolecules that bind to specific sites on target proteins, often inducing conformational changes important in the protein\u2019s function. Knowledge about ligand interactions with proteins is fundamental to understanding biological mechanisms and advancing drug discovery. Traditional protein language models focus on amino acid sequences and 3D structures, overlooking the structural and functional changes induced by protein-ligand interactions. We investigate the value of integrating ligand\u2013protein binding data in several predictive challenges and leverage findings to frame research directions and questions.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Results<\/jats:title>\n                    <jats:p>We show how the integration of protein-ligand interaction data in protein representation learning can increase predictive power. We evaluate the methodology across diverse biological tasks, demonstrating consistent improvements over state-of-the-art models. 
We further demonstrate how the study of the specific boosts in predictive capabilities coming with the introduction of the ligand modality can serve to focus attention and provide insights on biological mechanisms. By leveraging large pretrained protein language models and enriching them with interaction-specific features through a tailored learning process, we capture functional and structural nuances of proteins in their biochemical context.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Availability and implementation<\/jats:title>\n                    <jats:p>The full code and data are freely available at https:\/\/github.com\/kalifadan\/ProtLigand (DOI: https:\/\/doi.org\/10.5281\/zenodo.15808053).<\/jats:p>\n                  <\/jats:sec>","DOI":"10.1093\/bioinformatics\/btaf425","type":"journal-article","created":{"date-parts":[[2025,8,12]],"date-time":"2025-08-12T20:04:12Z","timestamp":1755029052000},"source":"Crossref","is-referenced-by-count":2,"title":["Beyond the leaderboard: leveraging predictive modeling for protein\u2013ligand insights and discovery"],"prefix":"10.1093","volume":"41","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-6459-6833","authenticated-orcid":false,"given":"Dan","family":"Kalifa","sequence":"first","affiliation":[{"name":"Department of Computer Science, Technion\u2014Israel Institute of Technology , Haifa 3200003,","place":["Israel"]}]},{"given":"Kira","family":"Radinsky","sequence":"additional","affiliation":[{"name":"Department of Computer Science, Technion\u2014Israel Institute of Technology , Haifa 3200003,","place":["Israel"]}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-8823-0614","authenticated-orcid":false,"given":"Eric","family":"Horvitz","sequence":"additional","affiliation":[{"name":"Office of the Chief Scientific Officer, Microsoft , Redmond, WA 14820,","place":["United States"]},{"name":"Department of Biomedical Informatics and Medical Education, University 
of Washington , Seattle, WA 98195,","place":["United States"]}]}],"member":"286","published-online":{"date-parts":[[2025,8,7]]},"reference":[{"key":"2025081319160761500_btaf425-B1","doi-asserted-by":"crossref","first-page":"343","DOI":"10.1007\/s11302-012-9305-8","article-title":"ATP synthesis and storage","volume":"8","author":"Bonora","year":"2012","journal-title":"Purinergic Signall"},{"key":"2025081319160761500_btaf425-B2","author":"Chithrananda","year":"2020"},{"key":"2025081319160761500_btaf425-B3","author":"Cohen","year":"969"},{"key":"2025081319160761500_btaf425-B4","author":"Corso","year":"2023"},{"key":"2025081319160761500_btaf425-B5","doi-asserted-by":"crossref","first-page":"1358","DOI":"10.1016\/j.bbamcr.2007.03.010","article-title":"p38 map-kinases pathway regulation, function and role in human diseases","volume":"1773","author":"Cuenda","year":"2007","journal-title":"Biochim Biophys Acta"},{"key":"2025081319160761500_btaf425-B6","author":"Devlin","year":"2019"},{"key":"2025081319160761500_btaf425-B7","doi-asserted-by":"crossref","first-page":"4959","DOI":"10.1093\/bioinformatics\/btac627","article-title":"Codnas-q: a database of conformational diversity of the native state of proteins with quaternary structure","volume":"38","author":"Escobedo","year":"2022","journal-title":"Bioinformatics"},{"key":"2025081319160761500_btaf425-B8","author":"Feng","year":"2024"},{"key":"2025081319160761500_btaf425-B9","doi-asserted-by":"crossref","first-page":"2709","DOI":"10.1021\/acs.biochem.5b00266","article-title":"The c-terminal heme regulatory motifs of heme oxygenase-2 are redox-regulated heme binding sites","volume":"54","author":"Fleischhacker","year":"2015","journal-title":"Biochemistry"},{"key":"2025081319160761500_btaf425-B10","doi-asserted-by":"crossref","first-page":"583","DOI":"10.1038\/s41586-021-03819-2","article-title":"Highly accurate protein structure prediction with 
alphafold","volume":"596","author":"Jumper","year":"2021","journal-title":"Nature"},{"key":"2025081319160761500_btaf425-B11","doi-asserted-by":"crossref","first-page":"373","DOI":"10.1038\/s42003-025-07791-9","article-title":"Afsample2 predicts multiple conformations and ensembles with alphafold2","volume":"8","author":"Kalakoti","year":"2025","journal-title":"Commun Biol"},{"key":"2025081319160761500_btaf425-B12","doi-asserted-by":"crossref","first-page":"e1002333","DOI":"10.1371\/journal.pcbi.1002333","article-title":"Global analysis of small molecule binding to related protein targets","volume":"8","author":"Kr\u00fcger","year":"2012","journal-title":"PLoS Comput Biol"},{"key":"2025081319160761500_btaf425-B13","doi-asserted-by":"crossref","first-page":"1123","DOI":"10.1126\/science.ade2574","article-title":"Evolutionary-scale prediction of atomic-level protein structure with a language model","volume":"379","author":"Lin","year":"2023","journal-title":"Science"},{"key":"2025081319160761500_btaf425-B14","doi-asserted-by":"crossref","first-page":"43","DOI":"10.1186\/s13321-021-00522-2","article-title":"Gnina 1.0: molecular docking with deep learning","volume":"13","author":"McNutt","year":"2021","journal-title":"J Cheminform"},{"key":"2025081319160761500_btaf425-B15","doi-asserted-by":"crossref","first-page":"e8794","DOI":"10.1371\/journal.pone.0008794","article-title":"A role for non-covalent sumo interaction motifs in pc2\/cbx4 e3 activity","volume":"5","author":"Merrill","year":"2010","journal-title":"PLoS One"},{"key":"2025081319160761500_btaf425-B16","doi-asserted-by":"crossref","first-page":"7051","DOI":"10.1021\/bi500918m","article-title":"Structural and functional assessment of perilipin 2 lipid binding domain(s)","volume":"53","author":"Najt","year":"2014","journal-title":"Biochemistry"},{"key":"2025081319160761500_btaf425-B17","doi-asserted-by":"crossref","first-page":"39517","DOI":"10.1074\/jbc.M300550200","article-title":"The mouse apg10 homologue, an 
e2-like enzyme for apg12p conjugation, facilitates map-lc3 modification","volume":"278","author":"Nemoto","year":"2003","journal-title":"J Biol Chem"},{"key":"2025081319160761500_btaf425-B18","doi-asserted-by":"crossref","first-page":"i295","DOI":"10.1093\/bioinformatics\/bty287","article-title":"A novel methodology on distributed representations of proteins using their interacting ligands","volume":"34","author":"\u00d6zt\u00fcrk","year":"2018","journal-title":"Bioinformatics"},{"key":"2025081319160761500_btaf425-B19","doi-asserted-by":"crossref","first-page":"2742","DOI":"10.1093\/bioinformatics\/btac202","article-title":"Impact of protein conformational diversity on alphafold predictions","volume":"38","author":"Salda\u00f1o","year":"2022","journal-title":"Bioinformatics"},{"key":"2025081319160761500_btaf425-B20","doi-asserted-by":"crossref","first-page":"591","DOI":"10.1093\/biomet\/52.3-4.591","article-title":"An analysis of variance test for normality (complete samples)","volume":"52","author":"Shapiro","year":"1965","journal-title":"Biometrika"},{"key":"2025081319160761500_btaf425-B21","doi-asserted-by":"crossref","first-page":"5269","DOI":"10.1093\/bioinformatics\/btaa1036","article-title":"On biases of attention in scientific discovery","volume":"36","author":"Singer","year":"2021","journal-title":"Bioinformatics"},{"key":"2025081319160761500_btaf425-B22","doi-asserted-by":"crossref","first-page":"255","DOI":"10.7555\/JBR.36.20220072","article-title":"RNA binding protein boule forms aggregates in mammalian testis","volume":"36","author":"Su","year":"2022","journal-title":"J Biomed Res"},{"key":"2025081319160761500_btaf425-B23","author":"Su","year":"2024"},{"key":"2025081319160761500_btaf425-B24","doi-asserted-by":"crossref","first-page":"88","DOI":"10.4161\/auto.8.1.18339","article-title":"The fap motif within human atg7, an autophagy-related e1-like enzyme, is essential for the e2-substrate reaction of lc3 
lipidation","volume":"8","author":"Tanida","year":"2012","journal-title":"Autophagy"},{"key":"2025081319160761500_btaf425-B25","doi-asserted-by":"crossref","first-page":"lqad088","DOI":"10.1093\/nargab\/lqad088","article-title":"Graphpart: homology partitioning for biological sequence analysis","volume":"5","author":"Teufel","year":"2023","journal-title":"NAR Genom Bioinform"},{"key":"2025081319160761500_btaf425-B26","doi-asserted-by":"crossref","first-page":"D368","DOI":"10.1093\/nar\/gkad1011","article-title":"Alphafold protein structure database in 2024: providing structure coverage for over 214 million protein sequences","volume":"52","author":"V\u00e1radi","year":"2024","journal-title":"Nucleic Acids Res"},{"key":"2025081319160761500_btaf425-B27","author":"Vaswani","year":"2017"},{"key":"2025081319160761500_btaf425-B28","doi-asserted-by":"crossref","first-page":"4111","DOI":"10.1021\/jm048957q","article-title":"The pdbbind database: methodologies and updates","volume":"48","author":"Wang","year":"2005","journal-title":"J Med Chem"},{"key":"2025081319160761500_btaf425-B29","doi-asserted-by":"crossref","first-page":"31","DOI":"10.1021\/ci00057a005","article-title":"Smiles, a chemical language and information system. 1. 
introduction to methodology and encoding rules","volume":"28","author":"Weininger","year":"1988","journal-title":"J Chem Inf Comput Sci"},{"key":"2025081319160761500_btaf425-B30","first-page":"7075","article-title":"Hypoxia-inducible expression of tumor-associated carbonic anhydrases","volume":"60","author":"Wykoff","year":"2000","journal-title":"Cancer Res"},{"key":"2025081319160761500_btaf425-B31","doi-asserted-by":"crossref","first-page":"148","DOI":"10.1126\/science.aav7942","article-title":"Structure and dynamics of the active human parathyroid hormone receptor-1","volume":"364","author":"Zhao","year":"2019","journal-title":"Science"},{"key":"2025081319160761500_btaf425-B32","author":"Zhou","year":"2023"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/41\/8\/btaf425\/63978884\/btaf425.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/41\/8\/btaf425\/63978884\/btaf425.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,8,13]],"date-time":"2025-08-13T23:16:14Z","timestamp":1755126974000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/doi\/10.1093\/bioinformatics\/btaf425\/8225721"}},"subtitle":[],"editor":[{"given":"Jianlin","family":"Cheng","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2025,8]]},"references-count":32,"journal-issue":{"issue":"8","published-print":{"date-parts":[[2025,8,2]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btaf425","relation":{"has-preprint":[{"id-type":"doi","id":"10.1101\/2025.05.12.653449","asserted-by":"object"}]},"ISSN":["1367-4811"],"issn-type":[{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2025,8]
]},"published":{"date-parts":[[2025,8]]},"article-number":"btaf425"}}