{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,7,30]],"date-time":"2025-07-30T11:42:44Z","timestamp":1753875764549,"version":"3.41.2"},"reference-count":55,"publisher":"Oxford University Press (OUP)","issue":"1","license":[{"start":{"date-parts":[[2024,12,13]],"date-time":"2024-12-13T00:00:00Z","timestamp":1734048000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/100010663","name":"European Research Council","doi-asserted-by":"publisher","id":[{"id":"10.13039\/100010663","id-type":"DOI","asserted-by":"publisher"}]},{"name":"European Union\u2019s Horizon 2020 research and innovation programme","award":["851173"],"award-info":[{"award-number":["851173"]}]},{"DOI":"10.13039\/501100000289","name":"Cancer Research UK","doi-asserted-by":"publisher","id":[{"id":"10.13039\/501100000289","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2024,12,26]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:sec>\n                  <jats:title>Motivation<\/jats:title>\n                  <jats:p>Identifying interacting partners from two sets of protein sequences has important applications in computational biology. Interacting partners share similarities across species due to their common evolutionary history, and feature correlations in amino acid usage due to the need to maintain complementary interaction interfaces. Thus, the problem of finding interacting pairs can be formulated as searching for a pairing of sequences that maximizes a sequence similarity or a coevolution score. 
Several methods have been developed to address this problem, applying different approximate optimization methods to different scores.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Results<\/jats:title>\n                  <jats:p>We introduce Differentiable Pairing using Soft Scores (DiffPaSS), a differentiable framework for flexible, fast, and hyperparameter-free optimization for pairing interacting biological sequences, which can be applied to a wide variety of scores. We apply it to a benchmark prokaryotic dataset, using mutual information and neighbor graph alignment scores. DiffPaSS outperforms existing algorithms for optimizing the same scores. We demonstrate the usefulness of our paired alignments for the prediction of protein complex structure. DiffPaSS does not require sequences to be aligned, and we also apply it to nonaligned sequences from T-cell receptors.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Availability and implementation<\/jats:title>\n                  <jats:p>A PyTorch implementation and installable Python package are available at https:\/\/github.com\/Bitbol-Lab\/DiffPaSS.<\/jats:p>\n               <\/jats:sec>","DOI":"10.1093\/bioinformatics\/btae738","type":"journal-article","created":{"date-parts":[[2024,12,14]],"date-time":"2024-12-14T02:48:57Z","timestamp":1734144537000},"source":"Crossref","is-referenced-by-count":1,"title":["DiffPaSS\u2014high-performance differentiable pairing of protein sequences using soft scores"],"prefix":"10.1093","volume":"41","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-6767-493X","authenticated-orcid":false,"given":"Umberto","family":"Lupo","sequence":"first","affiliation":[{"name":"Institute of Bioengineering, School of Life Sciences, \u00c9cole Polytechnique F\u00e9d\u00e9rale de Lausanne (EPFL) , Lausanne CH-1015,","place":["Switzerland"]},{"name":"SIB Swiss Institute of Bioinformatics , Lausanne 
CH-1015,","place":["Switzerland"]}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-7878-6061","authenticated-orcid":false,"given":"Damiano","family":"Sgarbossa","sequence":"additional","affiliation":[{"name":"Institute of Bioengineering, School of Life Sciences, \u00c9cole Polytechnique F\u00e9d\u00e9rale de Lausanne (EPFL) , Lausanne CH-1015,","place":["Switzerland"]},{"name":"SIB Swiss Institute of Bioinformatics , Lausanne CH-1015,","place":["Switzerland"]}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-4170-8796","authenticated-orcid":false,"given":"Martina","family":"Milighetti","sequence":"additional","affiliation":[{"name":"Division of Infection and Immunity, University College London , London WC1E 6BT,","place":["United Kingdom"]},{"name":"Cancer Institute, University College London , London WC1E 6DD,","place":["United Kingdom"]}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-1020-494X","authenticated-orcid":false,"given":"Anne-Florence","family":"Bitbol","sequence":"additional","affiliation":[{"name":"Institute of Bioengineering, School of Life Sciences, \u00c9cole Polytechnique F\u00e9d\u00e9rale de Lausanne (EPFL) , Lausanne CH-1015,","place":["Switzerland"]},{"name":"SIB Swiss Institute of Bioinformatics , Lausanne CH-1015,","place":["Switzerland"]}]}],"member":"286","published-online":{"date-parts":[[2024,12,13]]},"reference":[{"key":"2024122721054238600_btae738-B1","doi-asserted-by":"publisher","first-page":"315","DOI":"10.1186\/1471-2164-10-315","article-title":"P2CS: a two-component system resource for prokaryotic signal transduction research","volume":"10","author":"Barakat","year":"2009","journal-title":"BMC Genomics"},{"key":"2024122721054238600_btae738-B2","doi-asserted-by":"publisher","first-page":"D771","DOI":"10.1093\/nar\/gkq1023","article-title":"P2CS: a database of prokaryotic two-component systems","volume":"39","author":"Barakat","year":"2011","journal-title":"Nucleic Acids 
Res"},{"key":"2024122721054238600_btae738-B3","doi-asserted-by":"publisher","first-page":"e0161879","DOI":"10.1371\/journal.pone.0161879","article-title":"DockQ: a quality measure for protein\u2013protein docking models","volume":"11","author":"Basu","year":"2016","journal-title":"PLoS One"},{"key":"2024122721054238600_btae738-B4","doi-asserted-by":"publisher","first-page":"e1006401","DOI":"10.1371\/journal.pcbi.1006401","article-title":"Inferring interaction partners from protein sequences using mutual information","volume":"14","author":"Bitbol","year":"2018","journal-title":"PLoS Comput Biol"},{"key":"2024122721054238600_btae738-B5","doi-asserted-by":"publisher","first-page":"12180","DOI":"10.1073\/pnas.1606762113","article-title":"Inferring interaction partners from protein sequences","volume":"113","author":"Bitbol","year":"2016","journal-title":"Proc Natl Acad Sci USA"},{"key":"2024122721054238600_btae738-B6","doi-asserted-by":"publisher","first-page":"37009","DOI":"10.1209\/0295-5075\/89\/37009","article-title":"Aligning graphs and finding substructures by a cavity approach","volume":"89","author":"Bradde","year":"2010","journal-title":"Europhys Lett"},{"key":"2024122721054238600_btae738-B7","doi-asserted-by":"publisher","first-page":"1265","DOI":"10.1038\/s41467-022-28865-w","article-title":"Improved prediction of protein\u2013protein interactions using AlphaFold2","volume":"13","author":"Bryant","year":"2022","journal-title":"Nat Commun"},{"key":"2024122721054238600_btae738-B8","doi-asserted-by":"publisher","first-page":"165","DOI":"10.1038\/msb4100203","article-title":"Accurate prediction of protein\u2013protein interactions from sequence alignments using a Bayesian method","volume":"4","author":"Burger","year":"2008","journal-title":"Mol Syst Biol"},{"key":"2024122721054238600_btae738-B9","doi-asserted-by":"publisher","first-page":"bbad221","DOI":"10.1093\/bib\/bbad221","article-title":"Improved the heterodimer protein complex prediction with protein 
language models","volume":"24","author":"Chen","year":"2023","journal-title":"Brief Bioinform"},{"key":"2024122721054238600_btae738-B10","doi-asserted-by":"publisher","first-page":"E563","DOI":"10.1073\/pnas.1323734111","article-title":"Toward rationally redesigning bacterial two-component signaling systems using coevolutionary information","volume":"111","author":"Cheng","year":"2014","journal-title":"Proc Natl Acad Sci USA"},{"key":"2024122721054238600_btae738-B11","doi-asserted-by":"publisher","first-page":"185","DOI":"10.1126\/science.aaw6718","article-title":"Protein interaction networks revealed by proteome coevolution","volume":"365","author":"Cong","year":"2019","journal-title":"Science"},{"key":"2024122721054238600_btae738-B12","doi-asserted-by":"crossref","first-page":"89","DOI":"10.1038\/nature22383","article-title":"Quantifiable predictive features define epitope specific T cell receptor repertoires","volume":"547","author":"Dash","year":"2017","journal-title":"Nature"},{"key":"2024122721054238600_btae738-B13","doi-asserted-by":"crossref","first-page":"333","DOI":"10.1093\/bioinformatics\/btm604","article-title":"Mutual information without the influence of phylogeny or entropy dramatically improves residue contact prediction","volume":"24","author":"Dunn","year":"2008","journal-title":"Bioinformatics"},{"key":"2024122721054238600_btae738-B14","doi-asserted-by":"publisher","first-page":"S18","DOI":"10.1186\/1471-2105-14-S15-S18","article-title":"Mapping proteins in the presence of paralogs using units of coevolution","volume":"14","author":"El-Kebir","year":"2013","journal-title":"BMC Bioinformatics"},{"year":"2021","author":"Evans","key":"2024122721054238600_btae738-B15","doi-asserted-by":"publisher","DOI":"10.1101\/2021.10.04.463034"},{"key":"2024122721054238600_btae738-B16","doi-asserted-by":"publisher","first-page":"032413","DOI":"10.1103\/PhysRevE.101.032413","article-title":"Statistical physics of interacting proteins: impact of dataset size and 
quality assessed in synthetic sequences","volume":"101","author":"Gandarilla-P\u00e9rez","year":"2020","journal-title":"Phys Rev E"},{"key":"2024122721054238600_btae738-B17","doi-asserted-by":"publisher","first-page":"e1011010","DOI":"10.1371\/journal.pcbi.1011010","article-title":"Combining phylogeny and coevolution improves the inference of interaction partners among paralogous proteins","volume":"19","author":"Gandarilla-P\u00e9rez","year":"2023","journal-title":"PLoS Comput Biol"},{"key":"2024122721054238600_btae738-B18","doi-asserted-by":"publisher","first-page":"e1010147","DOI":"10.1371\/journal.pcbi.1010147","article-title":"Correlations from structure and phylogeny combine constructively in the inference of protein partners from sequences","volume":"18","author":"Gerardos","year":"2022","journal-title":"PLoS Comput Biol"},{"key":"2024122721054238600_btae738-B19","doi-asserted-by":"publisher","first-page":"2039","DOI":"10.1093\/bioinformatics\/btg278","article-title":"Inferring protein interactions from phylogenetic distance matrices","volume":"19","author":"Gertz","year":"2003","journal-title":"Bioinformatics"},{"key":"2024122721054238600_btae738-B20","doi-asserted-by":"crossref","first-page":"177","DOI":"10.1016\/S0022-2836(02)01038-0","article-title":"Co-evolutionary analysis reveals insights into protein\u2013protein interactions","volume":"324","author":"Goh","year":"2002","journal-title":"J Mol Biol"},{"key":"2024122721054238600_btae738-B21","doi-asserted-by":"publisher","first-page":"1396","DOI":"10.1038\/s41467-021-21636-z","article-title":"Large-scale discovery of protein interactions at residue resolution using co-evolution calculated from genomic sequences","volume":"12","author":"Green","year":"2021","journal-title":"Nat Commun"},{"key":"2024122721054238600_btae738-B22","doi-asserted-by":"publisher","first-page":"12186","DOI":"10.1073\/pnas.1607570113","article-title":"Simultaneous identification of specifically interacting paralogs and 
interprotein contacts by direct coupling analysis","volume":"113","author":"Gueudre","year":"2016","journal-title":"Proc Natl Acad Sci USA"},{"key":"2024122721054238600_btae738-B23","doi-asserted-by":"publisher","first-page":"1202","DOI":"10.1093\/bioinformatics\/bts109","article-title":"Mirroring co-evolving trees in the light of their topologies","volume":"28","author":"Hajirasouliha","year":"2012","journal-title":"Bioinformatics"},{"key":"2024122721054238600_btae738-B24","doi-asserted-by":"publisher","first-page":"eabm4805","DOI":"10.1126\/science.abm4805","article-title":"Computed structures of core eukaryotic protein complexes","volume":"374","author":"Humphreys","year":"2021","journal-title":"Science"},{"key":"2024122721054238600_btae738-B25","doi-asserted-by":"publisher","first-page":"W315","DOI":"10.1093\/nar\/gkl112","article-title":"TSEMA: interactive prediction of protein pairings between interacting families","volume":"34","author":"Izarzugaza","year":"2006","journal-title":"Nucleic Acids Res"},{"key":"2024122721054238600_btae738-B26","doi-asserted-by":"publisher","first-page":"35","DOI":"10.1186\/1471-2105-9-35","article-title":"Enhancing the prediction of protein pairings between interacting families using orthology information","volume":"9","author":"Izarzugaza","year":"2008","journal-title":"BMC Bioinformatics"},{"key":"2024122721054238600_btae738-B27","doi-asserted-by":"publisher","first-page":"121","DOI":"10.1146\/annurev.genet.41.042007.170548","article-title":"Specificity in two-component signal transduction pathways","volume":"41","author":"Laub","year":"2007","journal-title":"Annu Rev Genet"},{"key":"2024122721054238600_btae738-B28","doi-asserted-by":"publisher","first-page":"e2311887121","DOI":"10.1073\/pnas.2311887121","article-title":"Pairing interacting protein sequences using masked language modeling","volume":"121","author":"Lupo","year":"2024","journal-title":"Proc Natl Acad Sci 
USA"},{"key":"2024122721054238600_btae738-B29","doi-asserted-by":"publisher","first-page":"4626","DOI":"10.1093\/nar\/gki775","article-title":"Ancestral paralogs and pseudoparalogs and their role in the emergence of the eukaryotic cell","volume":"33","author":"Makarova","year":"2005","journal-title":"Nucleic Acids Res"},{"key":"2024122721054238600_btae738-B30","doi-asserted-by":"crossref","first-page":"e1003776","DOI":"10.1371\/journal.pcbi.1003776","article-title":"The fitness landscape of HIV-1 gag: advanced modeling approaches and validation of model predictions by in vitro testing","volume":"10","author":"Mann","year":"2014","journal-title":"PLoS Comput Biol"},{"key":"2024122721054238600_btae738-B31","first-page":"7710","article-title":"FGOT: graph distances based on filters and optimal transport","volume":"36","author":"Maretic","year":"2022","journal-title":"Proc AAAI Conf Artif Intell"},{"key":"2024122721054238600_btae738-B32","doi-asserted-by":"crossref","first-page":"353","DOI":"10.1109\/TSIPN.2022.3169632","article-title":"Wasserstein-based graph alignment","volume":"8","author":"Maretic","year":"2022","journal-title":"IEEE Trans Signal Inf Process Over Networks"},{"key":"2024122721054238600_btae738-B33","doi-asserted-by":"publisher","first-page":"e28766","DOI":"10.1371\/journal.pone.0028766","article-title":"Protein 3D structure computed from evolutionary sequence variation","volume":"6","author":"Marks","year":"2011","journal-title":"PLoS One"},{"key":"2024122721054238600_btae738-B34","doi-asserted-by":"publisher","first-page":"e1007179","DOI":"10.1371\/journal.pcbi.1007179","article-title":"Phylogenetic correlations can suffice to infer protein partners from sequences","volume":"15","author":"Marmier","year":"2019","journal-title":"PLoS Comput 
Biol"},{"first-page":"1","year":"2018","author":"Mena","key":"2024122721054238600_btae738-B35"},{"key":"2024122721054238600_btae738-B36","doi-asserted-by":"crossref","first-page":"100024","DOI":"10.1016\/j.immuno.2023.100024","article-title":"Benchmarking solutions to the T-cell receptor epitope prediction problem: IMMREP22 workshop report","volume":"9","author":"Meysman","year":"2023","journal-title":"ImmunoInformatics"},{"key":"2024122721054238600_btae738-B37","doi-asserted-by":"publisher","DOI":"10.1101\/2024.05.24.595718","article-title":"Intra- and inter-chain contacts determine TCR specificity: applying protein co-evolution methods to TCR pairing","author":"Milighetti","year":"2024"},{"key":"2024122721054238600_btae738-B38","doi-asserted-by":"publisher","first-page":"E1293","DOI":"10.1073\/pnas.1111471108","article-title":"Direct-coupling analysis of residue coevolution captures native contacts across many protein families","volume":"108","author":"Morcos","year":"2011","journal-title":"Proc Natl Acad Sci USA"},{"key":"2024122721054238600_btae738-B39","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.2406.06397","article-title":"Contrastive learning of T cell receptor representations","author":"Nagano","year":"2024"},{"key":"2024122721054238600_btae738-B40","doi-asserted-by":"publisher","first-page":"1370","DOI":"10.1093\/bioinformatics\/btq137","article-title":"Studying the co-evolution of protein families with the Mirrortree web server","volume":"26","author":"Ochoa","year":"2010","journal-title":"Bioinformatics"},{"key":"2024122721054238600_btae738-B41","doi-asserted-by":"publisher","first-page":"2166","DOI":"10.1093\/bioinformatics\/btv102","article-title":"Detection of significant protein coevolution","volume":"31","author":"Ochoa","year":"2015","journal-title":"Bioinformatics"},{"key":"2024122721054238600_btae738-B42","doi-asserted-by":"crossref","first-page":"609","DOI":"10.1093\/protein\/14.9.609","article-title":"Similarity of phylogenetic trees 
as indicator of protein\u2013protein interaction","volume":"14","author":"Pazos","year":"2001","journal-title":"Protein Eng"},{"key":"2024122721054238600_btae738-B43","first-page":"13876","article-title":"Got: an optimal transport framework for graph comparison","volume":"32","author":"Petric Maretic","year":"2019","journal-title":"Adv Neural Inf Process Syst"},{"key":"2024122721054238600_btae738-B44","doi-asserted-by":"publisher","first-page":"273","DOI":"10.1016\/S0022-2836(03)00114-1","article-title":"Exploiting the co-evolution of interacting proteins to discover interaction specificity","volume":"327","author":"Ramani","year":"2003","journal-title":"J Mol Biol"},{"first-page":"8844","year":"2021","author":"Rao","key":"2024122721054238600_btae738-B45"},{"key":"2024122721054238600_btae738-B46","doi-asserted-by":"publisher","first-page":"22124","DOI":"10.1073\/pnas.0912100106","article-title":"High-resolution protein complexes from integrating genomic information with molecular simulation","volume":"106","author":"Schug","year":"2009","journal-title":"Proc Natl Acad Sci USA"},{"key":"2024122721054238600_btae738-B47","doi-asserted-by":"publisher","first-page":"876","DOI":"10.1214\/aoms\/1177703591","article-title":"A relationship between arbitrary positive matrices and doubly stochastic matrices","volume":"35","author":"Sinkhorn","year":"1964","journal-title":"Ann Math Statist"},{"key":"2024122721054238600_btae738-B48","doi-asserted-by":"crossref","first-page":"10340","DOI":"10.1073\/pnas.1207864109","article-title":"Genomics-aided structure prediction","volume":"109","author":"Su\u0142kowska","year":"2012","journal-title":"Proc Natl Acad Sci USA"},{"key":"2024122721054238600_btae738-B49","doi-asserted-by":"publisher","first-page":"241","DOI":"10.1186\/s12859-019-2864-8","article-title":"Balancing sensitivity and specificity in distinguishing TCR groups by CDR sequence similarity","volume":"20","author":"Thakkar","year":"2019","journal-title":"BMC 
Bioinformatics"},{"key":"2024122721054238600_btae738-B50","doi-asserted-by":"publisher","first-page":"3181","DOI":"10.1093\/bioinformatics\/btu523","article-title":"Tracking global changes induced in the CD4 T-cell receptor repertoire by immunization with a complex antigen using short stretches of CDR3 protein sequence","volume":"30","author":"Thomas","year":"2014","journal-title":"Bioinformatics"},{"key":"2024122721054238600_btae738-B51","doi-asserted-by":"publisher","first-page":"822","DOI":"10.1002\/prot.20948","article-title":"Codep: maximizing co-evolutionary interdependencies to discover interacting proteins","volume":"63","author":"Tillier","year":"2006","journal-title":"Proteins"},{"key":"2024122721054238600_btae738-B52","doi-asserted-by":"publisher","first-page":"1861","DOI":"10.1101\/gr.092452.109","article-title":"The human protein coevolution network","volume":"19","author":"Tillier","year":"2009","journal-title":"Genome Res"},{"key":"2024122721054238600_btae738-B53","doi-asserted-by":"publisher","first-page":"2166","DOI":"10.1016\/j.csbj.2020.06.041","article-title":"T cell receptor sequence clustering and antigen specificity","volume":"18","author":"Vujovic","year":"2020","journal-title":"Comput Struct Biotechnol J"},{"key":"2024122721054238600_btae738-B54","doi-asserted-by":"publisher","first-page":"67","DOI":"10.1073\/pnas.0805923106","article-title":"Identification of direct residue contacts in protein\u2013protein interaction by message passing","volume":"106","author":"Weigt","year":"2009","journal-title":"Proc Natl Acad Sci USA"},{"first-page":"194","article-title":"TCR-BERT: learning the grammar of T-cell receptors for flexible antigen-binding 
analyses","author":"Wu","key":"2024122721054238600_btae738-B55"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/advance-article-pdf\/doi\/10.1093\/bioinformatics\/btae738\/61156924\/btae738.pdf","content-type":"application\/pdf","content-version":"am","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/41\/1\/btae738\/61156924\/btae738.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/41\/1\/btae738\/61156924\/btae738.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,12,27]],"date-time":"2024-12-27T21:05:57Z","timestamp":1735333557000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/doi\/10.1093\/bioinformatics\/btae738\/7923417"}},"subtitle":[],"editor":[{"given":"Pier Luigi","family":"Martelli","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2024,12,13]]},"references-count":55,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2024,12,26]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btae738","relation":{},"ISSN":["1367-4811"],"issn-type":[{"type":"electronic","value":"1367-4811"}],"subject":[],"published-other":{"date-parts":[[2025,1]]},"published":{"date-parts":[[2024,12,13]]},"article-number":"btae738"}}