{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,6,5]],"date-time":"2026-06-05T15:46:48Z","timestamp":1780674408520,"version":"3.54.1"},"reference-count":24,"publisher":"Oxford University Press (OUP)","issue":"Supplement_1","license":[{"start":{"date-parts":[[2025,7,15]],"date-time":"2025-07-15T00:00:00Z","timestamp":1752537600000},"content-version":"vor","delay-in-days":14,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"German Federal Ministry of Education and Research","award":["031L0309A"],"award-info":[{"award-number":["031L0309A"]}]},{"DOI":"10.13039\/501100007316","name":"Klaus Tschira Stiftung","doi-asserted-by":"publisher","award":["00.003.2024"],"award-info":[{"award-number":["00.003.2024"]}],"id":[{"id":"10.13039\/501100007316","id-type":"DOI","asserted-by":"publisher"}]},{"name":"German Federal Ministry of Education and Research","award":["031L0305A"],"award-info":[{"award-number":["031L0305A"]}]},{"name":"German Federal Ministry of Education and Research","award":["DROP2AI"],"award-info":[{"award-number":["DROP2AI"]}]},{"DOI":"10.13039\/501100001659","name":"Deutsche Forschungsgemeinschaft","doi-asserted-by":"publisher","id":[{"id":"10.13039\/501100001659","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100001659","name":"German Research Foundation","doi-asserted-by":"publisher","award":["422216132"],"award-info":[{"award-number":["422216132"]}],"id":[{"id":"10.13039\/501100001659","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2025,7,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:sec>\n                  <jats:title>Motivation<\/jats:title>\n                  <jats:p>As most proteins interact with other proteins to perform their respective functions, methods to computationally predict these interactions have been developed. However, flawed evaluation schemes and data leakage in test sets have obscured the fact that sequence-based protein\u2013protein interaction (PPI) prediction is still an open problem. Recently, methods achieving better-than-random performance on leakage-reduced PPI data have been proposed.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Results<\/jats:title>\n                  <jats:p>Here, we show that the use of ESM-2 protein embeddings explains this performance gain irrespective of model architecture. We compared the performance of models with varying complexity, per-protein, and per-token embeddings, as well as the influence of self- or cross-attention, where all models plateaued at an accuracy of 0.65. Moreover, we show that the tested sequence-based models cannot implicitly learn a contact map as an intermediate layer. These results imply that other input types, such as structure, might be necessary for producing reliable PPI predictions.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Availability and implementation<\/jats:title>\n                  <jats:p>All code for models and execution of the models is available at https:\/\/github.com\/daisybio\/PPI_prediction_study. Python version 3.8.18 and PyTorch version 2.1.1 were used for this study. The environment containing the versions of all other packages used can be found in the GitHub repository. The used data are available at https:\/\/doi.org\/10.6084\/m9.figshare.21591618.v3.<\/jats:p>\n               <\/jats:sec>","DOI":"10.1093\/bioinformatics\/btaf192","type":"journal-article","created":{"date-parts":[[2025,7,15]],"date-time":"2025-07-15T13:03:26Z","timestamp":1752584606000},"page":"i590-i598","source":"Crossref","is-referenced-by-count":7,"title":["Deep learning models for unbiased sequence-based PPI prediction plateau at an accuracy of 0.65"],"prefix":"10.1093","volume":"41","author":[{"ORCID":"https:\/\/orcid.org\/0009-0002-7307-1347","authenticated-orcid":false,"given":"Timo","family":"Reim","sequence":"first","affiliation":[{"name":"Data Science in Systems Biology, TUM School of Life Sciences, Technical University of Munich , Freising, 85354,","place":["Germany"]},{"name":"Department Artificial Intelligence in Biomedical Engineering, Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg , Erlangen, 91052,","place":["Germany"]}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-9172-3137","authenticated-orcid":false,"given":"Anne","family":"Hartebrodt","sequence":"additional","affiliation":[{"name":"Department Artificial Intelligence in Biomedical Engineering, Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg , Erlangen, 91052,","place":["Germany"]}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-8651-750X","authenticated-orcid":false,"given":"David B","family":"Blumenthal","sequence":"additional","affiliation":[{"name":"Department Artificial Intelligence in Biomedical Engineering, Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg , Erlangen, 91052,","place":["Germany"]}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-5812-8013","authenticated-orcid":false,"given":"Judith","family":"Bernett","sequence":"additional","affiliation":[{"name":"Data Science in Systems Biology, TUM School of Life Sciences, Technical University of Munich , Freising, 85354,","place":["Germany"]}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-0941-4168","authenticated-orcid":false,"given":"Markus","family":"List","sequence":"additional","affiliation":[{"name":"Data Science in Systems Biology, TUM School of Life Sciences, Technical University of Munich , Freising, 85354,","place":["Germany"]},{"name":"Munich Data Science Institute (MDSI), Technical University of Munich , Garching bei M\u00fcnchen, 85748,","place":["Germany"]}],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"286","published-online":{"date-parts":[[2025,7,15]]},"reference":[{"key":"2025071509032174100_btaf192-B1","doi-asserted-by":"crossref","first-page":"D408","DOI":"10.1093\/nar\/gkw985","article-title":"HIPPIE v2.0: enhancing meaningfulness and reliability of protein-protein interaction networks","volume":"45","author":"Alanis-Lobato","year":"2017","journal-title":"Nucleic Acids Res"},{"key":"2025071509032174100_btaf192-B2","doi-asserted-by":"publisher","author":"Bepler","year":"2019","DOI":"10.48550\/arXiv.1902.08661"},{"key":"2025071509032174100_btaf192-B3","doi-asserted-by":"crossref","first-page":"235","DOI":"10.1093\/nar\/28.1.235","article-title":"The protein data bank","volume":"28","author":"Berman","year":"2000","journal-title":"Nucleic Acids Res"},{"key":"2025071509032174100_btaf192-B4","doi-asserted-by":"publisher","author":"Bernett","year":"2022","DOI":"10.6084\/m9.figshare.21591618.v3"},{"key":"2025071509032174100_btaf192-B5","doi-asserted-by":"crossref","first-page":"bbae076","DOI":"10.1093\/bib\/bbae076","article-title":"Cracking the black box of deep sequence-based protein\u2013protein interaction prediction","volume":"25","author":"Bernett","year":"2024","journal-title":"Brief Bioinform"},{"key":"2025071509032174100_btaf192-B6","author":"Biewald","year":"2020"},{"key":"2025071509032174100_btaf192-B7","doi-asserted-by":"publisher","author":"Cornman","year":"2024","DOI":"10.1101\/2024.08.14.607850"},{"key":"2025071509032174100_btaf192-B8","author":"ESM Team","year":"2024"},{"key":"2025071509032174100_btaf192-B9","doi-asserted-by":"publisher","author":"Fan","year":"2025","DOI":"10.48550\/arXiv.2501.10282"},{"key":"2025071509032174100_btaf192-B10","doi-asserted-by":"crossref","first-page":"eads0018","DOI":"10.1126\/science.ads0018","article-title":"Simulating 500 million years of evolution with a language model","volume":"387","author":"Hayes","year":"2025","journal-title":"Science"},{"key":"2025071509032174100_btaf192-B11","doi-asserted-by":"crossref","first-page":"2050","DOI":"10.1002\/pmic.200500517","article-title":"An evaluation of in vitro protein\u2013protein interaction techniques: assessing contaminating background proteins","volume":"6","author":"Howell","year":"2006","journal-title":"Proteomics"},{"key":"2025071509032174100_btaf192-B12","doi-asserted-by":"crossref","first-page":"583","DOI":"10.1038\/s41586-021-03819-2","article-title":"Highly accurate protein structure prediction with AlphaFold","volume":"596","author":"Jumper","year":"2021","journal-title":"Nature"},{"key":"2025071509032174100_btaf192-B13","doi-asserted-by":"crossref","first-page":"bbae359","DOI":"10.1093\/bib\/bbae359","article-title":"TUnA: an uncertainty-aware transformer model for sequence-based protein-protein interaction prediction","volume":"25","author":"Ko","year":"2024","journal-title":"Brief. Bioinform"},{"key":"2025071509032174100_btaf192-B14","doi-asserted-by":"publisher","author":"Ko","year":"2024","DOI":"10.1101\/2024.08.24.609531"},{"key":"2025071509032174100_btaf192-B15","first-page":"6637","volume-title":"Science","author":"Lin","year":"2023"},{"key":"2025071509032174100_btaf192-B16","first-page":"1","volume-title":"Bioinformatics Advances","author":"NaderiAlizadeh","year":"2025"},{"key":"2025071509032174100_btaf192-B17","doi-asserted-by":"crossref","first-page":"147648","DOI":"10.1155\/2014\/147648","article-title":"Protein-protein interaction detection: methods and analysis","volume":"2014","author":"Rao","year":"2014","journal-title":"Int J Proteomics"},{"key":"2025071509032174100_btaf192-B18","doi-asserted-by":"publisher","author":"Richoux","year":"2019","DOI":"10.48550\/arXiv.1901.06268"},{"key":"2025071509032174100_btaf192-B19","first-page":"164","author":"Sanders","year":"2013"},{"key":"2025071509032174100_btaf192-B20","doi-asserted-by":"crossref","first-page":"969","DOI":"10.1016\/j.cels.2021.08.010","article-title":"D-SCRIPT translates genome to phenome with sequence-based, structure-aware, genome-scale predictions of protein-protein interactions","volume":"12","author":"Sledzieski","year":"2021","journal-title":"Cell Syst"},{"key":"2025071509032174100_btaf192-B21","doi-asserted-by":"crossref","first-page":"e2405840121","DOI":"10.1073\/pnas.2405840121","article-title":"Democratizing protein language models with parameter-efficient fine-tuning","volume":"121","author":"Sledzieski","year":"2024","journal-title":"Proc Natl Acad Sci U S A"},{"key":"2025071509032174100_btaf192-B22","doi-asserted-by":"publisher","author":"Tartici","year":"2024","DOI":"10.1101\/2024.10.04.616701"},{"key":"2025071509032174100_btaf192-B23","doi-asserted-by":"publisher","author":"Wu","year":"2024","DOI":"10.1101\/2024.05.14.594226"},{"key":"2025071509032174100_btaf192-B24","doi-asserted-by":"crossref","first-page":"738","DOI":"10.1002\/cmdc.201500495","article-title":"Current experimental methods for characterizing protein\u2013protein interactions","volume":"11","author":"Zhou","year":"2016","journal-title":"ChemMedChem"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/41\/Supplement_1\/i590\/63745496\/btaf192.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/41\/Supplement_1\/i590\/63745496\/btaf192.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,7,15]],"date-time":"2025-07-15T13:03:28Z","timestamp":1752584608000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/41\/Supplement_1\/i590\/8199378"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,7,1]]},"references-count":24,"journal-issue":{"issue":"Supplement_1","published-print":{"date-parts":[[2025,7,1]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btaf192","relation":{},"ISSN":["1367-4803","1367-4811"],"issn-type":[{"value":"1367-4803","type":"print"},{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2025,7]]},"published":{"date-parts":[[2025,7,1]]}}}