{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,7,30]],"date-time":"2025-07-30T11:44:56Z","timestamp":1753875896213,"version":"3.41.2"},"reference-count":42,"publisher":"Oxford University Press (OUP)","issue":"6","license":[{"start":{"date-parts":[[2025,5,21]],"date-time":"2025-05-21T00:00:00Z","timestamp":1747785600000},"content-version":"vor","delay-in-days":1,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"Center for Ocean Research in Hong Kong and Macau"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2025,6,2]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:sec>\n                  <jats:title>Motivation<\/jats:title>\n                  <jats:p>The three-dimensional protein tertiary structure alignment is a fundamental problem that seeks insights into functions and evolution. Previous structure alignment algorithms have adopted the sequential assumption and used dynamic programming solvers. However, many distantly related structures exhibit non-sequential similarities, and non-sequential alignment tools are less efficient and accurate than sequential ones. In this paper, we formulate the non-sequential alignment as the Entropy-regularized Partial Linear Sum Assignment Problem (epLSAP) and propose a solver based on Sinkhorn algorithms, referred to as epLSAP-Align.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Results<\/jats:title>\n                  <jats:p>Compared with existing non-sequential alignment solvers, our epLSAP-Align can explicitly model the gap penalty, efficiently achieve global optimality and balance coverage and fidelity. We show that epLSAP-Align can be easily integrated into the existing frameworks, such as TM-align and MICAN, resulting in the non-sequential alignment tool epLSAP-TM and epLSAP-MICAN, respectively. Both epLSAP-TM and epLSAP-MICAN achieve better performance than the existing non-sequential alignment tools in terms of biologically meaningful structure overlaps on two sequential alignment test sets MALIDUP and MALISAM, and four non-sequential alignment test sets MALIDUP-ns, MALISAM-ns, 64-difficult-case and RIPC datasets. Also, compared with the most recent non-sequential alignment tool USalign2, our epLSAP-TM is at least 22% faster under the same setting.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Availability and implementation<\/jats:title>\n                  <jats:p>Our source code is available at https:\/\/github.com\/xzhangem\/epLSAP-align.<\/jats:p>\n               <\/jats:sec>","DOI":"10.1093\/bioinformatics\/btaf309","type":"journal-article","created":{"date-parts":[[2025,5,21]],"date-time":"2025-05-21T04:42:12Z","timestamp":1747802532000},"source":"Crossref","is-referenced-by-count":0,"title":["epLSAP-Align: a non-sequential protein structural alignment solver with entropy-regularized partial linear sum assignment problem formulation"],"prefix":"10.1093","volume":"41","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-6294-0781","authenticated-orcid":false,"given":"Xuechen","family":"Zhang","sequence":"first","affiliation":[{"name":"Department of Electronic and Computational Engineering, The Hong Kong University of Science and Technology , Hong Kong SAR,","place":["China"]}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-5823-7698","authenticated-orcid":false,"given":"Zhuoyang","family":"Chen","sequence":"additional","affiliation":[{"name":"Data Science and Analytics Thrust, Information Hub, The Hong Kong University of Science and Technology (Guangzhou) , Guangzhou, Guangdong 511400,","place":["China"]}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-3005-3213","authenticated-orcid":false,"given":"Junyu","family":"Li","sequence":"additional","affiliation":[{"name":"Department of Ocean Science and Center for Ocean Research in Hong Kong and Macau, The Hong Kong University of Science and Technology , Hong Kong SAR,","place":["China"]}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-2861-9492","authenticated-orcid":false,"given":"Qiong","family":"Luo","sequence":"additional","affiliation":[{"name":"Data Science and Analytics Thrust, Information Hub, The Hong Kong University of Science and Technology (Guangzhou) , Guangzhou, Guangdong 511400,","place":["China"]},{"name":"Department of Computer Science and Engineering, The Hong Kong University of Science and Technology , Hong Kong SAR,","place":["China"]}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-7580-5503","authenticated-orcid":false,"given":"Longjun","family":"Wu","sequence":"additional","affiliation":[{"name":"Department of Ocean Science and Center for Ocean Research in Hong Kong and Macau, The Hong Kong University of Science and Technology , Hong Kong SAR,","place":["China"]}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-5510-6916","authenticated-orcid":false,"given":"Weichuan","family":"Yu","sequence":"additional","affiliation":[{"name":"Department of Electronic and Computational Engineering, The Hong Kong University of Science and Technology , Hong Kong SAR,","place":["China"]}]}],"member":"286","published-online":{"date-parts":[[2025,5,20]]},"reference":[{"key":"2025070408320898300_btaf309-B1","doi-asserted-by":"crossref","first-page":"78","DOI":"10.1186\/1472-6807-7-78","article-title":"A comprehensive analysis of non-sequential alignments between all protein structures","volume":"7","author":"Abyzov","year":"2007","journal-title":"BMC Struct Biol"},{"key":"2025070408320898300_btaf309-B2","first-page":"12191","article-title":"Screening sinkhorn algorithm for regularized optimal transport","volume":"32","author":"Alaya","year":"2019","journal-title":"Adv Neural Inform Process Syst"},{"key":"2025070408320898300_btaf309-B3","doi-asserted-by":"crossref","first-page":"637","DOI":"10.1038\/s41586-023-06510-w","article-title":"Clustering predicted structures at the scale of the known protein universe","volume":"622","author":"Barrio-Hernandez","year":"2023","journal-title":"Nature"},{"key":"2025070408320898300_btaf309-B4","doi-asserted-by":"crossref","first-page":"102526","DOI":"10.1016\/j.sbi.2022.102526","article-title":"Alphafold2 protein structure prediction: implications for drug discovery","volume":"78","author":"Borkakoti","year":"2023","journal-title":"Curr Opin Struct Biol"},{"key":"2025070408320898300_btaf309-B5","doi-asserted-by":"crossref","first-page":"370","DOI":"10.1093\/bioinformatics\/btv580","article-title":"Fast and accurate non-sequential protein structure alignment using a new asymmetric linear sum assignment heuristic","volume":"32","author":"Brown","year":"2016","journal-title":"Bioinformatics"},{"first-page":"3822","year":"2022","author":"Brun","key":"2025070408320898300_btaf309-B6"},{"key":"2025070408320898300_btaf309-B7","doi-asserted-by":"crossref","first-page":"75","DOI":"10.1007\/978-1-4757-3023-4_2","volume-title":"Handbook of Combinatorial Optimization: Supplement Volume A","author":"Burkard","year":"1999"},{"key":"2025070408320898300_btaf309-B8","doi-asserted-by":"crossref","first-page":"1005","DOI":"10.1093\/bioinformatics\/btw757","article-title":"Statistical inference of protein structural alignments using information and compression","volume":"33","author":"Collier","year":"2017","journal-title":"Bioinformatics"},{"key":"2025070408320898300_btaf309-B9","first-page":"2292","article-title":"Sinkhorn distances: lightspeed computation of optimal transport","volume":"26","author":"Cuturi","year":"2013","journal-title":"Adv Neural Inform Process Syst"},{"key":"2025070408320898300_btaf309-B10","doi-asserted-by":"crossref","first-page":"3168","DOI":"10.1038\/s41467-021-23303-9","article-title":"Structure-based protein function prediction using graph convolutional networks","volume":"12","author":"Gligorijevi\u0107","year":"2021","journal-title":"Nat Commun"},{"key":"2025070408320898300_btaf309-B11","doi-asserted-by":"crossref","first-page":"1374","DOI":"10.1110\/ps.035469.108","article-title":"Novel protein folds and their nonsequential structural analogs","volume":"17","author":"Guerler","year":"2008","journal-title":"Protein Sci"},{"key":"2025070408320898300_btaf309-B12","doi-asserted-by":"crossref","first-page":"341","DOI":"10.1016\/j.sbi.2009.04.003","article-title":"Advances and pitfalls of protein structural alignment","volume":"19","author":"Hasegawa","year":"2009","journal-title":"Curr Opin Struct Biol"},{"key":"2025070408320898300_btaf309-B13","doi-asserted-by":"crossref","first-page":"W210","DOI":"10.1093\/nar\/gkac387","article-title":"Dali server: structural unification of protein families","volume":"50","author":"Holm","year":"2022","journal-title":"Nucleic Acids Res"},{"key":"2025070408320898300_btaf309-B14","doi-asserted-by":"crossref","first-page":"2209","DOI":"10.1093\/bioinformatics\/bty081","article-title":"Ls-align: an atom-level, flexible ligand structural alignment algorithm for high-throughput virtual screening","volume":"34","author":"Hu","year":"2018","journal-title":"Bioinformatics"},{"key":"2025070408320898300_btaf309-B15","doi-asserted-by":"crossref","first-page":"499","DOI":"10.1002\/prot.22458","article-title":"Structure is three to ten times more conserved than sequence\u2013a study of structural response in protein cores","volume":"77","author":"Illerg\u00e5rd","year":"2009","journal-title":"Proteins"},{"key":"2025070408320898300_btaf309-B16","doi-asserted-by":"crossref","first-page":"2256","DOI":"10.1107\/S0907444904026460","article-title":"Secondary-structure matching (ssm), a new tool for fast protein structure alignment in three dimensions","volume":"60","author":"Krissinel","year":"2004","journal-title":"Acta Crystallogr D Biol Crystallogr"},{"key":"2025070408320898300_btaf309-B17","doi-asserted-by":"crossref","first-page":"117","DOI":"10.1021\/ar00041a001","article-title":"Structure-based molecular design","volume":"27","author":"Kuntz","year":"1994","journal-title":"Acc Chem Res"},{"key":"2025070408320898300_btaf309-B18","doi-asserted-by":"crossref","first-page":"121","DOI":"10.1016\/B978-0-12-800168-4.00005-6","article-title":"Algorithms, applications, and challenges of protein structure alignment","volume":"94","author":"Ma","year":"2014","journal-title":"Adv Protein Chem Struct Biol"},{"key":"2025070408320898300_btaf309-B19","doi-asserted-by":"crossref","first-page":"7305","DOI":"10.1038\/s41467-024-51669-z","article-title":"Gtalign: spatial index-driven protein structure alignment, superposition, and search","volume":"15","author":"Margelevi\u010dius","year":"2024","journal-title":"Nat Commun"},{"key":"2025070408320898300_btaf309-B20","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1186\/1471-2105-14-24","article-title":"Mican: a protein structure alignment algorithm that can handle multiple-chains, inverse alignments, c \u03b1 only models, alternative alignments, and non-sequential alignments","volume":"14","author":"Minami","year":"2013","journal-title":"BMC Bioinform"},{"key":"2025070408320898300_btaf309-B21","doi-asserted-by":"crossref","first-page":"3324","DOI":"10.1093\/bioinformatics\/bty369","article-title":"Mican-sq: a sequential protein structure alignment program that is applicable to monomers and all types of oligomers","volume":"34","author":"Minami","year":"2018","journal-title":"Bioinformatics"},{"key":"2025070408320898300_btaf309-B22","doi-asserted-by":"crossref","first-page":"536","DOI":"10.1016\/S0022-2836(05)80134-2","article-title":"SCOP: a structural classification of proteins database for the investigation of sequences and structures","volume":"247","author":"Murzin","year":"1995","journal-title":"J Mol Biol"},{"first-page":"1828","year":"2013","author":"Naiem","key":"2025070408320898300_btaf309-B23"},{"key":"2025070408320898300_btaf309-B24","first-page":"e94\u2013e94","article-title":"Biological insights from topology independent comparison of protein 3d structures","volume":"39","author":"Nguyen","year":"2011","journal-title":"Nucleic Acids Res"},{"key":"2025070408320898300_btaf309-B25","doi-asserted-by":"crossref","first-page":"3274","DOI":"10.1093\/bioinformatics\/bts618","article-title":"Fast protein structure alignment using gaussian overlap scoring of backbone peptide fragment similarity","volume":"28","author":"Ritchie","year":"2012","journal-title":"Bioinformatics"},{"key":"2025070408320898300_btaf309-B26","doi-asserted-by":"crossref","first-page":"12","DOI":"10.1186\/1748-7188-5-12","article-title":"Flexsnap: flexible non-sequential protein structure alignment","volume":"5","author":"Salem","year":"2010","journal-title":"Algorithms Mol Biol"},{"key":"2025070408320898300_btaf309-B27","doi-asserted-by":"crossref","first-page":"739","DOI":"10.1093\/protein\/11.9.739","article-title":"Protein structure alignment by incremental combinatorial extension (CE) of the optimal path","volume":"11","author":"Shindyalov","year":"1998","journal-title":"Protein Eng"},{"key":"2025070408320898300_btaf309-B28","doi-asserted-by":"crossref","first-page":"402","DOI":"10.2307\/2314570","article-title":"Diagonal equivalence to matrices with prescribed row and column sums","volume":"74","author":"Sinkhorn","year":"1967","journal-title":"Amer Math Monthly"},{"key":"2025070408320898300_btaf309-B29","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/2766963","article-title":"Convolutional wasserstein distances: efficient optimal transportation on geometric domains","volume":"34","author":"Solomon","year":"2015","journal-title":"ACM Trans Graph"},{"key":"2025070408320898300_btaf309-B30","doi-asserted-by":"crossref","first-page":"389","DOI":"10.1287\/opre.1040.0189","article-title":"Optimal protein structure alignment using maximum cliques","volume":"53","author":"Strickland","year":"2005","journal-title":"Oper Res"},{"key":"2025070408320898300_btaf309-B31","doi-asserted-by":"crossref","first-page":"243","DOI":"10.1038\/s41587-023-01773-0","article-title":"Fast and accurate protein structure search with foldseek","volume":"42","author":"Van Kempen","year":"2024","journal-title":"Nat Biotechnol"},{"key":"2025070408320898300_btaf309-B32","doi-asserted-by":"crossref","first-page":"347","DOI":"10.1142\/S0219720008003461","article-title":"Clepaps: fast pair alignment of protein structures based on conformational letters","volume":"6","author":"Wang","year":"2008","journal-title":"J Bioinform Comput Biol"},{"key":"2025070408320898300_btaf309-B33","doi-asserted-by":"crossref","first-page":"1448","DOI":"10.1038\/srep01448","article-title":"Protein structure alignment beyond spatial proximity","volume":"3","author":"Wang","year":"2013","journal-title":"Sci Rep"},{"key":"2025070408320898300_btaf309-B34","doi-asserted-by":"crossref","first-page":"478","DOI":"10.1093\/bioinformatics\/btz609","article-title":"Topology-independent and global protein structure alignment through an fft-based algorithm","volume":"36","author":"Wen","year":"2020","journal-title":"Bioinformatics"},{"key":"2025070408320898300_btaf309-B35","doi-asserted-by":"crossref","first-page":"3646","DOI":"10.1093\/nar\/gkl395","article-title":"Protein structure database search and evolutionary classification","volume":"34","author":"Yang","year":"2006","journal-title":"Nucleic Acids Res"},{"key":"2025070408320898300_btaf309-B36","doi-asserted-by":"crossref","first-page":"2080","DOI":"10.1002\/prot.24100","article-title":"A new size-independent score for pairwise protein structure alignment and its application to structure classification and nucleic-acid binding prediction","volume":"80","author":"Yang","year":"2012","journal-title":"Proteins"},{"key":"2025070408320898300_btaf309-B37","doi-asserted-by":"crossref","first-page":"1010","DOI":"10.1093\/bioinformatics\/bti128","article-title":"Non-sequential structure-based alignments reveal topology-independent core packing arrangements in proteins","volume":"21","author":"Yuan","year":"2005","journal-title":"Bioinformatics"},{"key":"2025070408320898300_btaf309-B38","doi-asserted-by":"crossref","first-page":"3370","DOI":"10.1093\/nar\/gkg571","article-title":"Lga: a method for finding 3d similarities in protein structures","volume":"31","author":"Zemla","year":"2003","journal-title":"Nucleic Acids Res"},{"key":"2025070408320898300_btaf309-B39","doi-asserted-by":"crossref","first-page":"105218","DOI":"10.1016\/j.isci.2022.105218","article-title":"A unified approach to sequential and non-sequential structure alignment of proteins, rnas, and dnas","volume":"25","author":"Zhang","year":"2022","journal-title":"iScience"},{"key":"2025070408320898300_btaf309-B40","doi-asserted-by":"crossref","first-page":"1109","DOI":"10.1038\/s41592-022-01585-1","article-title":"Us-align: universal structure alignments of proteins, nucleic acids, and macromolecular complexes","volume":"19","author":"Zhang","year":"2022","journal-title":"Nat Methods"},{"key":"2025070408320898300_btaf309-B41","doi-asserted-by":"crossref","first-page":"702","DOI":"10.1002\/prot.20264","article-title":"Scoring function for automated assessment of protein structure template quality","volume":"57","author":"Zhang","year":"2004","journal-title":"Proteins"},{"key":"2025070408320898300_btaf309-B42","doi-asserted-by":"crossref","first-page":"2302","DOI":"10.1093\/nar\/gki524","article-title":"Tm-align: a protein structure alignment algorithm based on the tm-score","volume":"33","author":"Zhang","year":"2005","journal-title":"Nucleic Acids Res"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/advance-article-pdf\/doi\/10.1093\/bioinformatics\/btaf309\/63255979\/btaf309.pdf","content-type":"application\/pdf","content-version":"am","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/41\/6\/btaf309\/63255979\/btaf309.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/41\/6\/btaf309\/63255979\/btaf309.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,7,4]],"date-time":"2025-07-04T12:32:22Z","timestamp":1751632342000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/doi\/10.1093\/bioinformatics\/btaf309\/8139639"}},"subtitle":[],"editor":[{"given":"Arne","family":"Elofsson","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2025,5,20]]},"references-count":42,"journal-issue":{"issue":"6","published-print":{"date-parts":[[2025,6,2]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btaf309","relation":{},"ISSN":["1367-4811"],"issn-type":[{"type":"electronic","value":"1367-4811"}],"subject":[],"published-other":{"date-parts":[[2025,6]]},"published":{"date-parts":[[2025,5,20]]},"article-number":"btaf309"}}