{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,6,1]],"date-time":"2026-06-01T20:24:55Z","timestamp":1780345495509,"version":"3.54.1"},"reference-count":52,"publisher":"Oxford University Press (OUP)","issue":"11","license":[{"start":{"date-parts":[[2025,10,17]],"date-time":"2025-10-17T00:00:00Z","timestamp":1760659200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"Institute of Information & communications Technology Planning & Evaluation","award":["RS-2021-II212068"],"award-info":[{"award-number":["RS-2021-II212068"]}]},{"name":"Institute of Information & communications Technology Planning & Evaluation","award":["RS-2023\u201300220628"],"award-info":[{"award-number":["RS-2023\u201300220628"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2025,11,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:sec>\n                    <jats:title>Motivation<\/jats:title>\n                    <jats:p>Protein structure prediction has been revolutionized and generalized with the advent of cutting-edge AI methods such as AlphaFold, but reliance on computationally intensive multiple sequence alignments (MSA) remains a major limitation.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Results<\/jats:title>\n                    <jats:p>We introduce DeepFold-PLM, a novel framework that integrates advanced protein language models with vector embedding databases to enhance ultra-fast MSA construction, remote homology detection, and protein structure prediction. DeepFold-PLM uses high-dimensional embeddings and contrastive learning to significantly accelerate MSA generation, running 47 times faster than standard methods while maintaining prediction accuracy comparable to AlphaFold. 
In addition, it enhances structure prediction by extending modeling capabilities to multimeric protein complexes and provides a scalable PyTorch-based implementation for efficient large-scale prediction. Our method also effectively increases sequence diversity (Neff = 8.65 versus 4.83 with JackHMMER), enriching coevolutionary information critical for accurate structure prediction. DeepFold-PLM thus represents a versatile and practical resource that enables high-throughput applications in computational structural biology.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Availability and implementation<\/jats:title>\n                    <jats:p>Source code and a user-friendly Python API for all modules of DeepFold-PLM are publicly available at https:\/\/github.com\/DeepFoldProtein\/DeepFold-PLM.<\/jats:p>\n                  <\/jats:sec>","DOI":"10.1093\/bioinformatics\/btaf579","type":"journal-article","created":{"date-parts":[[2025,10,15]],"date-time":"2025-10-15T12:26:19Z","timestamp":1760531179000},"source":"Crossref","is-referenced-by-count":1,"title":["DeepFold-PLM: accelerating protein structure prediction via efficient homology search using protein language models"],"prefix":"10.1093","volume":"41","author":[{"given":"Minsoo","family":"Kim","sequence":"first","affiliation":[{"name":"Department of Physics, Sungkyunkwan University , Suwon 16419,","place":["Korea"]}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Hanjin","family":"Bae","sequence":"additional","affiliation":[{"name":"Department of Physics, Sungkyunkwan University , Suwon 16419,","place":["Korea"]}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Gyeongpil","family":"Jo","sequence":"additional","affiliation":[{"name":"Department of Physics, Sungkyunkwan University , Suwon 
16419,","place":["Korea"]}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Kunwoo","family":"Kim","sequence":"additional","affiliation":[{"name":"Department of Physics, Sungkyunkwan University , Suwon 16419,","place":["Korea"]}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Sung Jong","family":"Lee","sequence":"additional","affiliation":[{"name":"Basic Science Research Institute, Changwon National University , Changwon 51140,","place":["Korea"]}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-7120-8464","authenticated-orcid":false,"given":"Jejoong","family":"Yoo","sequence":"additional","affiliation":[{"name":"School of Computational Sciences, Korea Institute for Advanced Study , Seoul 02455,","place":["Korea"]}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-4612-0927","authenticated-orcid":false,"given":"Keehyoung","family":"Joo","sequence":"additional","affiliation":[{"name":"Center for Advanced Computation, Korea Institute for Advanced Study , Seoul 02455,","place":["Korea"]}],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"286","published-online":{"date-parts":[[2025,10,17]]},"reference":[{"key":"2025110903495543900_btaf579-B1","doi-asserted-by":"publisher","first-page":"493","DOI":"10.1038\/s41586-024-07487-w","article-title":"Accurate structure prediction of biomolecular interactions with alphafold 3","volume":"630","author":"Abramson","year":"2024","journal-title":"Nature"},{"key":"2025110903495543900_btaf579-B2","doi-asserted-by":"publisher","first-page":"1514","DOI":"10.1038\/s41592-024-02272-z","article-title":"Openfold: retraining alphafold2 yields new insights into its learning mechanisms and capacity for generalization","volume":"21","author":"Ahdritz","year":"2024","journal-title":"Nat 
Methods"},{"key":"2025110903495543900_btaf579-B3","doi-asserted-by":"publisher","author":"Ahdritz","year":"2023","DOI":"10.48550\/arXiv.2308.05326"},{"key":"2025110903495543900_btaf579-B4","doi-asserted-by":"publisher","first-page":"1571","DOI":"10.1002\/prot.26545","article-title":"Protein target highlights in casp15: analysis of models by structure providers","volume":"91","author":"Alexander","year":"2023","journal-title":"Proteins Struct Funct Bioinf"},{"key":"2025110903495543900_btaf579-B5","doi-asserted-by":"publisher","first-page":"547","DOI":"10.1038\/s41586-021-04184-w","article-title":"De novo protein design by deep network hallucination","volume":"600","author":"Anishchenko","year":"2021","journal-title":"Nature"},{"key":"2025110903495543900_btaf579-B6","doi-asserted-by":"publisher","first-page":"871","DOI":"10.1126\/science.abj8754","article-title":"Accurate prediction of protein structures and interactions using a three-track neural network","volume":"373","author":"Baek","year":"2021","journal-title":"Science"},{"key":"2025110903495543900_btaf579-B7","doi-asserted-by":"publisher","first-page":"D523","DOI":"10.1093\/nar\/gkac1052","article-title":"Uniprot: the universal protein knowledgebase in 2023","volume":"51","author":"Bateman","year":"2022","journal-title":"Nucleic Acids Res"},{"key":"2025110903495543900_btaf579-B8","doi-asserted-by":"publisher","first-page":"1265","DOI":"10.1038\/s41467-022-28865-w","article-title":"Improved prediction of protein-protein interactions using alphafold2","volume":"13","author":"Bryant","year":"2022","journal-title":"Nat Commun"},{"key":"2025110903495543900_btaf579-B9","doi-asserted-by":"publisher","first-page":"59","DOI":"10.1038\/nmeth.3176","article-title":"Fast and sensitive protein alignment using diamond","volume":"12","author":"Buchfink","year":"2015","journal-title":"Nat 
Methods"},{"key":"2025110903495543900_btaf579-B10","doi-asserted-by":"publisher","first-page":"216","DOI":"10.1038\/s41594-022-00910-8","article-title":"Towards a structurally resolved human protein interaction network","volume":"30","author":"Burke","year":"2023","journal-title":"Nat Struct Mol Biol"},{"key":"2025110903495543900_btaf579-B11","doi-asserted-by":"publisher","first-page":"7400","DOI":"10.1038\/s41467-024-51776-x","article-title":"An end-to-end framework for the prediction of protein structure and fitness from single sequence","volume":"15","author":"Chen","year":"2024","journal-title":"Nat Commun"},{"key":"2025110903495543900_btaf579-B12","doi-asserted-by":"publisher","author":"Cheng","year":"2023","DOI":"10.48550\/arXiv.2203.00854"},{"key":"2025110903495543900_btaf579-B13","doi-asserted-by":"publisher","author":"Dao","year":"2022","DOI":"10.48550\/arXiv.2205.14135"},{"key":"2025110903495543900_btaf579-B14","doi-asserted-by":"publisher","author":"Douze","year":"2025","DOI":"10.48550\/arXiv.2401.08281"},{"key":"2025110903495543900_btaf579-B15","doi-asserted-by":"publisher","first-page":"1035","DOI":"10.1038\/nbt0804-1035","article-title":"Where did the blosum62 alignment score matrix come from?","volume":"22","author":"Eddy","year":"2004","journal-title":"Nat Biotechnol"},{"key":"2025110903495543900_btaf579-B16","doi-asserted-by":"publisher","first-page":"e1002195","DOI":"10.1371\/journal.pcbi.1002195","article-title":"Accelerated profile HMM searches","volume":"7","author":"Eddy","year":"2011","journal-title":"PLoS Comput Biol"},{"key":"2025110903495543900_btaf579-B17","doi-asserted-by":"publisher","author":"Elnaggar","year":"2023","DOI":"10.48550\/arXiv.2301.06568"},{"key":"2025110903495543900_btaf579-B18","doi-asserted-by":"publisher","author":"Elnaggar","year":"2021","DOI":"10.48550\/arXiv.2007.06225"},{"key":"2025110903495543900_btaf579-B19","doi-asserted-by":"publisher","DOI":"10.1101\/2021.10.04.463034","article-title":"Protein complex 
prediction with alphafold-multimer","author":"Evans","year":"2022"},{"key":"2025110903495543900_btaf579-B20","doi-asserted-by":"publisher","first-page":"1087","DOI":"10.1038\/s42256-023-00721-6","article-title":"A method for multiple-sequence-alignment-free protein structure prediction using a protein language model","volume":"5","author":"Fang","year":"2023","journal-title":"Nat Mach Intell"},{"key":"2025110903495543900_btaf579-B21","doi-asserted-by":"publisher","first-page":"235","DOI":"10.1186\/gb-2008-9-10-235","article-title":"Large-scale assignment of orthology: back to phylogenetics?","volume":"9","author":"Gabald\u00f3n","year":"2008","journal-title":"Genome Biol"},{"key":"2025110903495543900_btaf579-B22","doi-asserted-by":"publisher","first-page":"1744","DOI":"10.1038\/s41467-022-29394-2","article-title":"Af2complex predicts direct physical interactions in multimeric proteins with deep learning","volume":"13","author":"Gao","year":"2022","journal-title":"Nat Commun"},{"key":"2025110903495543900_btaf579-B23","doi-asserted-by":"publisher","author":"Gao","year":"2022","DOI":"10.48550\/arXiv.2104.08821"},{"key":"2025110903495543900_btaf579-B24","doi-asserted-by":"publisher","first-page":"975","DOI":"10.1038\/s41587-023-01917-2","article-title":"Protein remote homology detection and structural alignment using deep learning","volume":"42","author":"Hamamsy","year":"2024","journal-title":"Nat Biotechnol"},{"key":"2025110903495543900_btaf579-B25","doi-asserted-by":"publisher","first-page":"lqac043","DOI":"10.1093\/nargab\/lqac043","article-title":"Contrastive learning on protein embeddings enlightens midnight zone","volume":"4","author":"Heinzinger","year":"2022","journal-title":"NAR Genom Bioinform"},{"key":"2025110903495543900_btaf579-B26","doi-asserted-by":"publisher","first-page":"983","DOI":"10.1038\/s41587-024-02353-6","article-title":"Fast, sensitive detection of protein homologs using deep dense 
retrieval","volume":"43","author":"Hong","year":"2025","journal-title":"Nat Biotechnol"},{"key":"2025110903495543900_btaf579-B27","doi-asserted-by":"publisher","author":"Jiang","year":"2021","DOI":"10.48550\/arXiv.2008.02496"},{"key":"2025110903495543900_btaf579-B28","doi-asserted-by":"publisher","first-page":"583","DOI":"10.1038\/s41586-021-03819-2","article-title":"Highly accurate protein structure prediction with alphafold","volume":"596","author":"Jumper","year":"2021","journal-title":"Nature"},{"key":"2025110903495543900_btaf579-B29","doi-asserted-by":"publisher","first-page":"2024","DOI":"10.1038\/s41592-025-02819-8","article-title":"GPU-accelerated homology search with mmseqs2","volume":"22","author":"Kallenborn","year":"2025","journal-title":"Nat Methods"},{"key":"2025110903495543900_btaf579-B30","doi-asserted-by":"publisher","first-page":"btad712","DOI":"10.1093\/bioinformatics\/btad712","article-title":"Deepfold: enhancing protein structure prediction through optimized loss functions, improved template features, and re-optimized energy function","volume":"39","author":"Lee","year":"2023","journal-title":"Bioinformatics"},{"key":"2025110903495543900_btaf579-B31","doi-asserted-by":"publisher","DOI":"10.1101\/2022.07.20.500902","article-title":"Language models of protein sequences at the scale of evolution enable accurate structure prediction","author":"Lin","year":"2022"},{"key":"2025110903495543900_btaf579-B32","doi-asserted-by":"publisher","first-page":"1123","DOI":"10.1126\/science.ade2574","article-title":"Evolutionary-scale prediction of atomic-level protein structure with a language model","volume":"379","author":"Lin","year":"2023","journal-title":"Science"},{"key":"2025110903495543900_btaf579-B33","doi-asserted-by":"publisher","first-page":"2775","DOI":"10.1038\/s41467-024-46808-5","article-title":"PLMSearch: protein language model powers accurate and fast sequence search for remote 
homology","volume":"15","author":"Liu","year":"2024","journal-title":"Nat Commun"},{"key":"2025110903495543900_btaf579-B34","doi-asserted-by":"publisher","first-page":"1145","DOI":"10.1101\/gr.277675.123","article-title":"Leveraging protein language models for accurate multiple sequence alignments","volume":"33","author":"McWhite","year":"2023","journal-title":"Genome Res"},{"key":"2025110903495543900_btaf579-B35","doi-asserted-by":"publisher","first-page":"679","DOI":"10.1038\/s41592-022-01488-1","article-title":"Colabfold: making protein folding accessible to all","volume":"19","author":"Mirdita","year":"2022","journal-title":"Nat Methods"},{"key":"2025110903495543900_btaf579-B36","doi-asserted-by":"publisher","first-page":"btad786","DOI":"10.1093\/bioinformatics\/btad786","article-title":"Embedding-based alignment: combining protein language models with dynamic programming alignment to detect structural similarities in the twilight-zone","volume":"40","author":"Pantolini","year":"2024","journal-title":"Bioinformatics"},{"key":"2025110903495543900_btaf579-B37","doi-asserted-by":"publisher","first-page":"173","DOI":"10.1038\/nmeth.1818","article-title":"Hhblits: lightning-fast iterative protein sequence searching by hmm-hmm alignment","volume":"9","author":"Remmert","year":"2011","journal-title":"Nat Methods"},{"key":"2025110903495543900_btaf579-B38","doi-asserted-by":"publisher","first-page":"e2016239118","DOI":"10.1073\/pnas.2016239118","article-title":"Biological structure and function emerge from scaling unsupervised learning to 250 million protein sequences","volume":"118","author":"Rives","year":"2021","journal-title":"Proc Natl Acad Sci USA"},{"key":"2025110903495543900_btaf579-B39","doi-asserted-by":"publisher","first-page":"85","DOI":"10.1093\/protein\/12.2.85","article-title":"Twilight zone of protein sequence alignments","volume":"12","author":"Rost","year":"1999","journal-title":"Protein 
Eng"},{"key":"2025110903495543900_btaf579-B40","doi-asserted-by":"publisher","first-page":"951","DOI":"10.1093\/bioinformatics\/bti125","article-title":"Protein homology detection by hmm-hmm comparison","volume":"21","author":"S\u00f6ding","year":"2005","journal-title":"Bioinformatics"},{"key":"2025110903495543900_btaf579-B41","doi-asserted-by":"publisher","author":"Song","year":"2023","DOI":"10.48550\/arXiv.2310.04610"},{"key":"2025110903495543900_btaf579-B42","doi-asserted-by":"publisher","first-page":"473","DOI":"10.1186\/s12859-019-3019-7","article-title":"Hh-suite3 for fast remote homology detection and deep protein annotation","volume":"20","author":"Steinegger","year":"2019","journal-title":"BMC Bioinformatics"},{"key":"2025110903495543900_btaf579-B43","doi-asserted-by":"publisher","first-page":"1026","DOI":"10.1038\/nbt.3988","article-title":"Mmseqs2 enables sensitive protein sequence searching for the analysis of massive data sets","volume":"35","author":"Steinegger","year":"2017","journal-title":"Nat Biotechnol"},{"key":"2025110903495543900_btaf579-B44","doi-asserted-by":"publisher","first-page":"926","DOI":"10.1093\/bioinformatics\/btu739","article-title":"Uniref clusters: a comprehensive and scalable alternative for improving sequence similarity searches","volume":"31","author":"Suzek","year":"2015","journal-title":"Bioinformatics"},{"key":"2025110903495543900_btaf579-B45","doi-asserted-by":"publisher","first-page":"176","DOI":"10.1038\/s41467-021-27838-9","article-title":"Harnessing protein folding neural networks for peptide\u2013protein docking","volume":"13","author":"Tsaban","year":"2022","journal-title":"Nat Commun"},{"key":"2025110903495543900_btaf579-B46","doi-asserted-by":"publisher","first-page":"312","DOI":"10.1038\/s42003-022-03269-0","article-title":"Structural validation and assessment of alphafold2 predictions for centrosomal and centriolar proteins and their complexes","volume":"5","author":"van 
Breugel","year":"2022","journal-title":"Commun Biol"},{"key":"2025110903495543900_btaf579-B47","doi-asserted-by":"publisher","first-page":"1403","DOI":"10.1038\/s41467-023-37139-y","article-title":"Alphaflow: autonomous discovery and optimization of multi-step chemistry using a self-driven fluidic lab guided by reinforcement learning","volume":"14","author":"Volk","year":"2023","journal-title":"Nat Commun"},{"key":"2025110903495543900_btaf579-B48","doi-asserted-by":"publisher","first-page":"e11081","DOI":"10.15252\/msb.202211081","article-title":"Benchmarking alphafold-enabled molecular docking predictions for antibiotic discovery","volume":"18","author":"Wong","year":"2022","journal-title":"Mol Syst Biol"},{"key":"2025110903495543900_btaf579-B49","doi-asserted-by":"publisher","DOI":"10.1101\/2022.07.21.500999","article-title":"High-resolution de novo structure prediction from primary sequence","author":"Wu","year":"2022"},{"key":"2025110903495543900_btaf579-B50","doi-asserted-by":"publisher","first-page":"1109","DOI":"10.1038\/s41592-022-01585-1","article-title":"Us-align: universal structure alignments of proteins, nucleic acids, and macromolecular complexes","volume":"19","author":"Zhang","year":"2022","journal-title":"Nat Methods"},{"key":"2025110903495543900_btaf579-B51","doi-asserted-by":"publisher","DOI":"10.1101\/240754","article-title":"Deep learning reveals many more inter-protein residue-residue contacts than direct coupling analysis","author":"Zhou","year":"2018"},{"key":"2025110903495543900_btaf579-B52","doi-asserted-by":"publisher","first-page":"btad424","DOI":"10.1093\/bioinformatics\/btad424","article-title":"Evaluation of alphafold-multimer prediction on multi-chain protein 
complexes","volume":"39","author":"Zhu","year":"2023","journal-title":"Bioinformatics"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/advance-article-pdf\/doi\/10.1093\/bioinformatics\/btaf579\/64729797\/btaf579.pdf","content-type":"application\/pdf","content-version":"am","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/41\/11\/btaf579\/64729797\/btaf579.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/41\/11\/btaf579\/64729797\/btaf579.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,11,9]],"date-time":"2025-11-09T08:50:05Z","timestamp":1762678205000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/doi\/10.1093\/bioinformatics\/btaf579\/8290416"}},"subtitle":[],"editor":[{"given":"Arne","family":"Elofsson","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"editor"}]}],"short-title":[],"issued":{"date-parts":[[2025,10,17]]},"references-count":52,"journal-issue":{"issue":"11","published-print":{"date-parts":[[2025,11,1]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btaf579","relation":{},"ISSN":["1367-4803","1367-4811"],"issn-type":[{"value":"1367-4803","type":"print"},{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2025,11]]},"published":{"date-parts":[[2025,10,17]]},"article-number":"btaf579"}}