{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,12]],"date-time":"2026-03-12T11:36:48Z","timestamp":1773315408522,"version":"3.50.1"},"reference-count":37,"publisher":"Oxford University Press (OUP)","issue":"3","license":[{"start":{"date-parts":[[2024,4,12]],"date-time":"2024-04-12T00:00:00Z","timestamp":1712880000000},"content-version":"vor","delay-in-days":16,"URL":"https:\/\/creativecommons.org\/licenses\/by-nc\/4.0\/"}],"funder":[{"name":"Graduate Research Award of Computing and Software Systems Division"},{"DOI":"10.13039\/100019963","name":"University of Washington Bothell","doi-asserted-by":"publisher","id":[{"id":"10.13039\/100019963","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2024,3,27]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>Understanding the protein structures is invaluable in various biomedical applications, such as vaccine development. Protein structure model building from experimental electron density maps is a time-consuming and labor-intensive task. To address the challenge, machine learning approaches have been proposed to automate this process. Currently, the majority of the experimental maps in the database lack atomic resolution features, making it challenging for machine learning-based methods to precisely determine protein structures from cryogenic electron microscopy density maps. On the other hand, protein structure prediction methods, such as AlphaFold2, leverage evolutionary information from protein sequences and have recently achieved groundbreaking accuracy. However, these methods often require manual refinement, which is labor intensive and time consuming. In this study, we present DeepTracer-Refine, an automated method that refines AlphaFold predicted structures by aligning them to DeepTracer's modeled structure. 
Our method was evaluated on 39 multi-domain proteins and we improved the average residue coverage from 78.2 to 90.0% and average local Distance Difference Test score from 0.67 to 0.71. We also compared DeepTracer-Refine with Phenix's AlphaFold refinement and demonstrated that our method not only performs better when the initial AlphaFold model is less precise but also surpasses Phenix in run-time performance.<\/jats:p>","DOI":"10.1093\/bib\/bbae118","type":"journal-article","created":{"date-parts":[[2024,4,13]],"date-time":"2024-04-13T01:12:15Z","timestamp":1712970735000},"source":"Crossref","is-referenced-by-count":14,"title":["Enhancing cryo-EM structure prediction with DeepTracer and AlphaFold2 integration"],"prefix":"10.1093","volume":"25","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-1552-4352","authenticated-orcid":false,"given":"Jason","family":"Chen","sequence":"first","affiliation":[{"name":"Division of Computing and Software Systems, University of Washington Bothell , Bothell, WA 98011 , USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Ayisha","family":"Zia","sequence":"additional","affiliation":[{"name":"Department of Biochemistry and Molecular Genetics, University of Alabama at Birmingham , Birmingham, AL 35233 , USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Albert","family":"Luo","sequence":"additional","affiliation":[{"name":"Division of Computing and Software Systems, University of Washington Bothell , Bothell, WA 98011 , USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Hanze","family":"Meng","sequence":"additional","affiliation":[{"name":"Department of Computer Science, Duke University , Durham, NC 27708 , USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Fengbin","family":"Wang","sequence":"additional","affiliation":[{"name":"Department of Biochemistry and Molecular Genetics, University of Alabama at Birmingham , Birmingham, AL 35233 , 
USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Jie","family":"Hou","sequence":"additional","affiliation":[{"name":"Department of Computer Science, Saint Louis University , Saint Louis, MO 63103 , USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-8345-343X","authenticated-orcid":false,"given":"Renzhi","family":"Cao","sequence":"additional","affiliation":[{"name":"Department of Computer Science, Pacific Lutheran University , Tacoma, WA 98447 , USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Dong","family":"Si","sequence":"additional","affiliation":[{"name":"Division of Computing and Software Systems, University of Washington Bothell , Bothell, WA 98011 , USA"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"286","published-online":{"date-parts":[[2024,4,12]]},"reference":[{"issue":"1","key":"2024041301120691500_ref1","doi-asserted-by":"crossref","first-page":"151","DOI":"10.1042\/ETLS20200295","article-title":"An overview of the recent advances in cryo-electron microscopy for life sciences","volume":"5","author":"Assaiya","year":"2021","journal-title":"Emerg Top Life Sci"},{"key":"2024041301120691500_ref2","volume-title":"Molecular Biology of the Cell","author":"Alberts","year":"2002","edition":"4th"},{"issue":"1","key":"2024041301120691500_ref3","doi-asserted-by":"crossref","first-page":"1618","DOI":"10.1038\/s41467-018-04053-7","article-title":"De novo main-chain modeling for EM maps using MAINMAST","volume":"9","author":"Terashi","year":"2018","journal-title":"Nat Commun"},{"issue":"10","key":"2024041301120691500_ref4","doi-asserted-by":"crossref","first-page":"861","DOI":"10.1107\/S2059798319011471","article-title":"Macromolecular structure determination using X-rays, neutrons and electrons: recent developments in Phenix","volume":"75","author":"Liebschner","year":"2019","journal-title":"Acta Cryst 
D"},{"key":"2024041301120691500_ref5","doi-asserted-by":"crossref","DOI":"10.1109\/BIBE50027.2020.00028","article-title":"Sequence-guided protein structure determination using graph convolutional and recurrent networks","volume-title":"2020 IEEE 20th International Conference on Bioinformatics and Bioengineering","author":"Li"},{"issue":"2","key":"2024041301120691500_ref6","doi-asserted-by":"crossref","DOI":"10.1073\/pnas.2017525118","article-title":"DeepTracer for fast de novo cryo-EM protein structure modeling and special studies on CoV-related complexes","volume":"118","author":"Pfab","year":"2021","journal-title":"Proc Natl Acad Sci U S A"},{"issue":"2","key":"2024041301120691500_ref7","doi-asserted-by":"crossref","first-page":"195","DOI":"10.1038\/s41592-021-01389-9","article-title":"CR-I-TASSER: assemble protein structures from cryo-EM density maps using deep convolutional neural networks","volume":"19","author":"Zhang","year":"2022","journal-title":"Nat Methods"},{"issue":"1","key":"2024041301120691500_ref8","doi-asserted-by":"crossref","first-page":"4066","DOI":"10.1038\/s41467-022-31748-9","article-title":"Model building of protein complexes from intermediate-resolution cryo-EM maps with deep learning-guided automatic assembly","volume":"13","author":"He","year":"2022","journal-title":"Nat Commun"},{"key":"2024041301120691500_ref9","article-title":"A graph neural network approach to automated model building in cryo-EM maps.","volume-title":"The Eleventh International Conference on Learning Representations","author":"Jamali"},{"issue":"1","key":"2024041301120691500_ref10","doi-asserted-by":"crossref","first-page":"4288","DOI":"10.1038\/s41467-019-12279-2","article-title":"The cryo-EM structure of the acid activatable pore-forming immune effector Macrophage-expressed gene 1","volume":"10","author":"Pang","year":"2019","journal-title":"Nat 
Commun"},{"issue":"4096","key":"2024041301120691500_ref11","doi-asserted-by":"crossref","first-page":"223","DOI":"10.1126\/science.181.4096.223","article-title":"Principles that govern the folding of protein chains","volume":"181","author":"Anfinsen","year":"1973","journal-title":"Science"},{"key":"2024041301120691500_ref12","article-title":"Learning protein sequence embeddings using information from structure","volume-title":"Proceedings of ICLR","author":"Bepler","year":"2019"},{"issue":"12","key":"2024041301120691500_ref13","doi-asserted-by":"crossref","first-page":"1315","DOI":"10.1038\/s41592-019-0598-1","article-title":"Unified rational protein engineering with sequence-based deep representation learning","volume":"16","author":"Alley","year":"2019","journal-title":"Nat Methods"},{"key":"2024041301120691500_ref14","article-title":"Amino acid encoding for deep learning applications","year":"2020"},{"issue":"15","key":"2024041301120691500_ref15","doi-asserted-by":"crossref","first-page":"e2016239118","DOI":"10.1073\/pnas.2016239118","article-title":"Biological structure and function emerge from scaling unsupervised learning to 250 million protein sequences","volume":"118","author":"Rives","year":"2021","journal-title":"Proc Natl Acad Sci U S A"},{"issue":"11","key":"2024041301120691500_ref16","doi-asserted-by":"crossref","first-page":"681","DOI":"10.1038\/s41580-019-0163-x","article-title":"Advances in protein structure prediction and design","volume":"20","author":"Kuhlman","year":"2019","journal-title":"Nat Rev Mol Cell Biol"},{"key":"2024041301120691500_ref17","first-page":"e1005324","volume-title":"Accurate de novo prediction of protein contact map by ultra-deep learning model","author":"Wang","year":"2017"},{"issue":"7873","key":"2024041301120691500_ref18","doi-asserted-by":"crossref","first-page":"590","DOI":"10.1038\/s41586-021-03828-1","article-title":"Highly accurate protein structure prediction for the human 
proteome","volume":"596","author":"Tunyasuvunakool","year":"2021","journal-title":"Nature"},{"issue":"7873","key":"2024041301120691500_ref19","doi-asserted-by":"crossref","first-page":"583","DOI":"10.1038\/s41586-021-03819-2","article-title":"Highly accurate protein structure prediction with AlphaFold","volume":"596","author":"Jumper","year":"2021","journal-title":"Nature"},{"issue":"6557","key":"2024041301120691500_ref20","doi-asserted-by":"crossref","first-page":"871","DOI":"10.1126\/science.abj8754","article-title":"Accurate prediction of protein structures and interactions using a three-track neural network","volume":"373","author":"Baek","year":"2021","journal-title":"Science"},{"key":"2024041301120691500_ref21","article-title":"Attention is all you need","volume-title":"Proceedings of the 31st International Conference on Neural Information Processing Systems","author":"Vaswani","year":"2017"},{"key":"2024041301120691500_ref22"},{"issue":"1","key":"2024041301120691500_ref23","doi-asserted-by":"crossref","first-page":"45","DOI":"10.1146\/annurev.biophys.31.082901.134314","article-title":"The natural history of protein domains","volume":"31","author":"Ponting","year":"2002","journal-title":"Annu Rev Biophys Biomol Struct"},{"issue":"20","key":"2024041301120691500_ref24","doi-asserted-by":"crossref","first-page":"167208","DOI":"10.1016\/j.jmb.2021.167208","article-title":"AlphaFold and implications for intrinsically disordered proteins","volume":"433","author":"Ruff","year":"2021","journal-title":"J Mol Biol"},{"key":"2024041301120691500_ref25","doi-asserted-by":"crossref","first-page":"ii246","DOI":"10.1093\/bioinformatics\/btg1086","article-title":"Flexible structure alignment by chaining aligned fragment pairs allowing twists","volume":"19","author":"Ye","year":"2003","journal-title":"Bioinformatics"},{"issue":"9","key":"2024041301120691500_ref26","doi-asserted-by":"crossref","first-page":"739","DOI":"10.1093\/protein\/11.9.739","article-title":"Protein 
structure alignment by incremental combinatorial extension (CE) of the optimal path","volume":"11","author":"Shindyalov","year":"1998","journal-title":"Protein Eng"},{"key":"2024041301120691500_ref27","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1186\/1471-2105-7-339","article-title":"Tools for integrated sequence-structure analysis with UCSF chimera","volume":"7","author":"Meng","year":"2006","journal-title":"BMC Bioinformatics"},{"key":"2024041301120691500_ref28","doi-asserted-by":"crossref","first-page":"e06980","DOI":"10.7554\/eLife.06980","article-title":"Measuring the optimal exposure for single particle cryo-EM using a 2.6 \u00c5 reconstruction of rotavirus VP6","volume":"4","author":"Grant","year":"2015","journal-title":"Elife"},{"issue":"6","key":"2024041301120691500_ref29","doi-asserted-by":"crossref","first-page":"679","DOI":"10.1038\/s41592-022-01488-1","article-title":"Colabfold: making protein folding accessible to all","volume":"19","author":"Mirdita","year":"2022","journal-title":"Nat Methods"},{"issue":"13","key":"2024041301120691500_ref30","doi-asserted-by":"crossref","first-page":"1605","DOI":"10.1002\/jcc.20084","article-title":"UCSF chimera\u2014a visualization system for exploratory research and analysis","volume":"25","author":"Pettersen","year":"2004","journal-title":"J Comput Chem"},{"issue":"1","key":"2024041301120691500_ref31","doi-asserted-by":"crossref","first-page":"235","DOI":"10.1093\/nar\/28.1.235","article-title":"The protein data bank","volume":"28","author":"Berman","year":"2000","journal-title":"Nucleic Acids Res"},{"issue":"21","key":"2024041301120691500_ref32","doi-asserted-by":"crossref","first-page":"2722","DOI":"10.1093\/bioinformatics\/btt473","article-title":"lDDT: a local superposition-free score for comparing protein structures and models using distance difference 
tests","volume":"29","author":"Mariani","year":"2013","journal-title":"Bioinformatics"},{"issue":"1","key":"2024041301120691500_ref33","doi-asserted-by":"crossref","first-page":"6714","DOI":"10.1038\/s41467-022-34284-8","article-title":"Structural basis of organic cation transporter-3 inhibition","volume":"13","author":"Khanppnavar","year":"2022","journal-title":"Nat Commun"},{"issue":"11","key":"2024041301120691500_ref34","doi-asserted-by":"crossref","first-page":"1376","DOI":"10.1038\/s41592-022-01645-6","article-title":"Improved AlphaFold modeling with implicit experimental information","volume":"19","author":"Terwilliger","year":"2022","journal-title":"Nat Methods"},{"issue":"13","key":"2024041301120691500_ref35","doi-asserted-by":"crossref","first-page":"2279","DOI":"10.1016\/j.cell.2022.05.019","article-title":"Structure, receptor recognition, and antigenicity of the human coronavirus CCoV-HuPn-2018 spike glycoprotein","volume":"185","author":"Tortorici","year":"2022","journal-title":"Cell"},{"issue":"1","key":"2024041301120691500_ref36","doi-asserted-by":"crossref","first-page":"5166","DOI":"10.1038\/s41467-022-32883-z","article-title":"Structural basis for Gemin5 decamer-mediated mRNA binding","volume":"13","author":"Guo","year":"2022","journal-title":"Nat Commun"},{"issue":"W1","key":"2024041301120691500_ref37","doi-asserted-by":"crossref","first-page":"W732","DOI":"10.1093\/nar\/gkac370","article-title":"SWORD2: hierarchical analysis of protein 3D structures","volume":"50","author":"Cretin","year":"2022","journal-title":"Nucleic Acids Res"}],"container-title":["Briefings in 
Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bib\/article-pdf\/25\/3\/bbae118\/57221549\/bbae118.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bib\/article-pdf\/25\/3\/bbae118\/57221549\/bbae118.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,4,13]],"date-time":"2024-04-13T01:12:35Z","timestamp":1712970755000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bib\/article\/doi\/10.1093\/bib\/bbae118\/7644531"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,3,27]]},"references-count":37,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2024,3,27]]}},"URL":"https:\/\/doi.org\/10.1093\/bib\/bbae118","relation":{},"ISSN":["1467-5463","1477-4054"],"issn-type":[{"value":"1467-5463","type":"print"},{"value":"1477-4054","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2024,5]]},"published":{"date-parts":[[2024,3,27]]},"article-number":"bbae118"}}