{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,26]],"date-time":"2026-02-26T20:33:33Z","timestamp":1772138013212,"version":"3.50.1"},"reference-count":43,"publisher":"Oxford University Press (OUP)","issue":"3","license":[{"start":{"date-parts":[[2022,3,14]],"date-time":"2022-03-14T00:00:00Z","timestamp":1647216000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/academic.oup.com\/journals\/pages\/open_access\/funder_policies\/chorus\/standard_publication_model"}],"funder":[{"DOI":"10.13039\/501100001809","name":"Natural Science Foundation of China","doi-asserted-by":"publisher","award":["LZ20F030002"],"award-info":[{"award-number":["LZ20F030002"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100001809","name":"Natural Science Foundation of China","doi-asserted-by":"publisher","award":["61773346"],"award-info":[{"award-number":["61773346"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100001809","name":"Natural Science Foundation of China","doi-asserted-by":"publisher","award":["62173304"],"award-info":[{"award-number":["62173304"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]},{"name":"\u2018New Generation Artificial Intelligence\u2019 major project of Science and Technology Innovation 2030 of the Ministry of Science and Technology of the People\u2019s Republic of China","award":["2021ZD0150100"],"award-info":[{"award-number":["2021ZD0150100"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2022,5,13]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:p>Although remarkable achievements, such as AlphaFold2, have been made in end-to-end structure prediction, fragment libraries remain essential for de novo protein structure prediction, which 
can help explore and understand the protein-folding mechanism. In this work, we developed a variable-length fragment library (VFlib). In VFlib, a master structure database was first constructed from the Protein Data Bank through sequence clustering. The hidden Markov model (HMM) profile of each protein in the master structure database was generated by HHsuite, and the secondary structure of each protein was calculated by DSSP. For the query sequence, the HMM-profile was first constructed. Then, variable-length fragments were retrieved from the master structure database through dynamically variable-length profile\u2013profile comparison. A complete method for chopping the query HMM-profile during this process was proposed to obtain fragments with increased diversity. Finally, secondary structure information was used to further screen the retrieved fragments to generate the final fragment library of specific query sequence. The experimental results obtained with a set of 120 nonredundant proteins show that the global precision and coverage of the fragment library generated by VFlib were 55.04% and 94.95% at the RMSD cutoff of 1.5\u00a0\u00c5, respectively. Compared with the benchmark method of NNMake, the global precision of our fragment library had increased by 62.89% with equivalent coverage. Furthermore, the fragments generated by VFlib and NNMake were used to predict structure models through fragment assembly. 
Controlled experimental results demonstrate that the average TM-score of VFlib was 16.00% higher than that of NNMake.<\/jats:p>","DOI":"10.1093\/bib\/bbac086","type":"journal-article","created":{"date-parts":[[2022,2,21]],"date-time":"2022-02-21T07:08:34Z","timestamp":1645427314000},"source":"Crossref","is-referenced-by-count":5,"title":["Construct a variable-length fragment library for de novo protein structure prediction"],"prefix":"10.1093","volume":"23","author":[{"given":"Qiongqiong","family":"Feng","sequence":"first","affiliation":[{"name":"College of Information Engineering, Zhejiang University of Technology, Hangzhou 310023, China"}]},{"given":"Minghua","family":"Hou","sequence":"additional","affiliation":[{"name":"College of Information Engineering, Zhejiang University of Technology, Hangzhou 310023, China"}]},{"given":"Jun","family":"Liu","sequence":"additional","affiliation":[{"name":"College of Information Engineering, Zhejiang University of Technology, Hangzhou 310023, China"}]},{"given":"Kailong","family":"Zhao","sequence":"additional","affiliation":[{"name":"College of Information Engineering, Zhejiang University of Technology, Hangzhou 310023, China"}]},{"given":"Guijun","family":"Zhang","sequence":"additional","affiliation":[{"name":"College of Information Engineering, Zhejiang University of Technology, Hangzhou 310023, China"}]}],"member":"286","published-online":{"date-parts":[[2022,3,14]]},"reference":[{"key":"2022051813453545600_ref1","doi-asserted-by":"crossref","first-page":"145","DOI":"10.1016\/j.sbi.2009.02.005","article-title":"Protein structure prediction: is it useful?","volume":"19","author":"Zhang","year":"2009","journal-title":"Curr Opin Struct Biol"},{"key":"2022051813453545600_ref2","doi-asserted-by":"crossref","first-page":"16856","DOI":"10.1073\/pnas.1821309116","article-title":"Distance-based protein folding powered by deep learning","volume":"116","author":"Xu","year":"2019","journal-title":"Proc Natl Acad Sci U S 
A"},{"key":"2022051813453545600_ref3","doi-asserted-by":"crossref","first-page":"1496","DOI":"10.1073\/pnas.1914677117","article-title":"Improved protein structure prediction using predicted interresidue orientations","volume":"117","author":"Yang","year":"2020","journal-title":"Proc Natl Acad Sci U S A"},{"key":"2022051813453545600_ref4","doi-asserted-by":"crossref","first-page":"706","DOI":"10.1038\/s41586-019-1923-7","article-title":"Improved protein structure prediction using potentials from deep learning","volume":"577","author":"Senior","year":"2020","journal-title":"Nature"},{"key":"2022051813453545600_ref5","doi-asserted-by":"crossref","first-page":"292","DOI":"10.1016\/j.cels.2019.03.006","article-title":"End-to-end differentiable learning of protein structure","volume":"8","author":"AlQuraishi","year":"2019","journal-title":"Cell Syst"},{"key":"2022051813453545600_ref6","doi-asserted-by":"crossref","first-page":"583","DOI":"10.1038\/s41586-021-03819-2","article-title":"Highly accurate protein structure prediction with AlphaFold","volume":"596","author":"Jumper","year":"2021","journal-title":"Nature"},{"key":"2022051813453545600_ref7","doi-asserted-by":"crossref","first-page":"611","DOI":"10.1016\/S0968-0004(00)01707-2","article-title":"Protein folding: progress made and promises ahead","volume":"25","author":"Radford","year":"2000","journal-title":"Trends Biochem Sci"},{"key":"2022051813453545600_ref8","doi-asserted-by":"crossref","first-page":"21","DOI":"10.1146\/annurev-biochem-061516-044518","article-title":"Protein Misfolding diseases","volume":"86","author":"Hartl","year":"2017","journal-title":"Annu Rev Biochem"},{"key":"2022051813453545600_ref9","doi-asserted-by":"crossref","first-page":"203","DOI":"10.1186\/s12859-020-3504-z","article-title":"DeepFrag-k: a fragment-based deep learning approach for protein fold recognition","volume":"21","author":"Elhefnawy","year":"2020","journal-title":"BMC 
Bioinformatics"},{"key":"2022051813453545600_ref10","doi-asserted-by":"crossref","first-page":"66","DOI":"10.1016\/S0076-6879(04)83004-0","article-title":"Protein structure prediction using Rosetta","volume":"383","author":"Rohl","year":"2004","journal-title":"Methods Enzymol"},{"key":"2022051813453545600_ref11","first-page":"1715","article-title":"Ab initio protein structure assembly using continuous structure fragments and optimized knowledge-based force field: QUARK ab initio prediction method","author":"Xu","year":"2012"},{"key":"2022051813453545600_ref12","doi-asserted-by":"crossref","first-page":"229","DOI":"10.1002\/prot.24179","article-title":"Toward optimal fragment generations for ab initio protein structure assembly","volume":"81","author":"Xu","year":"2013","journal-title":"Proteins"},{"key":"2022051813453545600_ref13","doi-asserted-by":"crossref","first-page":"2443","DOI":"10.1093\/bioinformatics\/btz943","article-title":"CGLFold: a contact-assisted de novo protein structure prediction using global exploration and loop perturbation sampling algorithm","volume":"36","author":"Liu","year":"2020","journal-title":"Bioinformatics"},{"key":"2022051813453545600_ref14","doi-asserted-by":"crossref","first-page":"677","DOI":"10.1093\/bioinformatics\/btw668","article-title":"LRFragLib: an effective algorithm to identify fragments for de novo protein structure prediction","volume":"33","author":"Wang","year":"2017","journal-title":"Bioinformatics"},{"key":"2022051813453545600_ref15","doi-asserted-by":"crossref","first-page":"e23294","DOI":"10.1371\/journal.pone.0023294","article-title":"Generalized fragment picking in Rosetta: design, protocols and applications","volume":"6","author":"Gront","year":"2011","journal-title":"PLoS One"},{"key":"2022051813453545600_ref16","doi-asserted-by":"crossref","first-page":"3110","DOI":"10.1093\/bioinformatics\/btr541","article-title":"HHfrag: HMM-based fragment detection using 
HHpred","volume":"27","author":"Kalev","year":"2011","journal-title":"Bioinformatics"},{"key":"2022051813453545600_ref17","doi-asserted-by":"crossref","first-page":"W244","DOI":"10.1093\/nar\/gki408","article-title":"The HHpred interactive server for protein homology detection and structure prediction","volume":"33","author":"S\u00f6ding","year":"2005","journal-title":"Nucleic Acids Res"},{"key":"2022051813453545600_ref18","doi-asserted-by":"crossref","first-page":"D318","DOI":"10.1093\/nar\/gkp786","article-title":"PDBselect 1992-2009 and PDBfilter-select","volume":"38","author":"Griep","year":"2010","journal-title":"Nucleic Acids Res"},{"key":"2022051813453545600_ref19","doi-asserted-by":"crossref","first-page":"e0123998","DOI":"10.1371\/journal.pone.0123998","article-title":"Building a better fragment library for de novo protein structure prediction","volume":"10","author":"De Oliveira","year":"2015","journal-title":"PLoS One"},{"key":"2022051813453545600_ref20","doi-asserted-by":"crossref","first-page":"725","DOI":"10.1038\/nprot.2010.5","article-title":"I-TASSER: a unified platform for automated protein structure and function prediction","volume":"5","author":"Roy","year":"2010","journal-title":"Nat Protoc"},{"key":"2022051813453545600_ref21","doi-asserted-by":"crossref","first-page":"bbab296","DOI":"10.1093\/bib\/bbab296","article-title":"Distance-guided protein folding based on generalized descent direction","volume":"22","author":"Wang","year":"2021","journal-title":"Brief Bioinform"},{"key":"2022051813453545600_ref22","doi-asserted-by":"crossref","first-page":"99","DOI":"10.1093\/bioinformatics\/btab620","article-title":"A de novo protein structure prediction by iterative partition sampling, topology adjustment and residue-level distance deviation 
optimization","volume":"38","author":"Liu","year":"2021","journal-title":"Bioinformatics"},{"key":"2022051813453545600_ref23","doi-asserted-by":"crossref","first-page":"1026","DOI":"10.1038\/nbt.3988","article-title":"MMseqs2 enables sensitive protein sequence searching for the analysis of massive data sets","volume":"35","author":"Steinegger","year":"2017","journal-title":"Nat Biotechnol"},{"key":"2022051813453545600_ref24","doi-asserted-by":"crossref","first-page":"2856","DOI":"10.1093\/bioinformatics\/bty1057","article-title":"MMseqs2 desktop and local web server app for fast, interactive sequence searches","volume":"35","author":"Mirdita","year":"2019","journal-title":"Bioinformatics"},{"key":"2022051813453545600_ref25","doi-asserted-by":"crossref","first-page":"473","DOI":"10.1186\/s12859-019-3019-7","article-title":"HH-suite3 for fast remote homology detection and deep protein annotation","volume":"20","author":"Steinegger","year":"2019","journal-title":"BMC Bioinformatics"},{"key":"2022051813453545600_ref26","doi-asserted-by":"crossref","first-page":"D364","DOI":"10.1093\/nar\/gku1028","article-title":"A series of PDB-related databanks for everyday needs","volume":"43","author":"Touw","year":"2015","journal-title":"Nucleic Acids Res"},{"key":"2022051813453545600_ref27","doi-asserted-by":"crossref","first-page":"2577","DOI":"10.1002\/bip.360221211","article-title":"Dictionary of protein secondary structure: pattern recognition of hydrogen-bonded and geometrical features","volume":"22","author":"Kabsch","year":"1983","journal-title":"Biopolymers"},{"key":"2022051813453545600_ref28","doi-asserted-by":"crossref","first-page":"368","DOI":"10.1016\/j.sbi.2006.04.004","article-title":"Multiple sequence alignment","volume":"16","author":"Edgar","year":"2006","journal-title":"Curr Opin Struct Biol"},{"key":"2022051813453545600_ref29","doi-asserted-by":"crossref","first-page":"173","DOI":"10.1038\/nmeth.1818","article-title":"HHblits: lightning-fast iterative protein 
sequence searching by HMM-HMM alignment","volume":"9","author":"Remmert","year":"2012","journal-title":"Nat Methods"},{"key":"2022051813453545600_ref30","doi-asserted-by":"crossref","first-page":"D170","DOI":"10.1093\/nar\/gkw1081","article-title":"Uniclust databases of clustered and deeply annotated protein sequences and alignments","volume":"45","author":"Mirdita","year":"2017","journal-title":"Nucleic Acids Res"},{"key":"2022051813453545600_ref31","doi-asserted-by":"crossref","first-page":"3150","DOI":"10.1093\/bioinformatics\/bts565","article-title":"CD-HIT: accelerated for clustering the next-generation sequencing data","volume":"28","author":"Fu","year":"2012","journal-title":"Bioinformatics"},{"key":"2022051813453545600_ref32","doi-asserted-by":"crossref","first-page":"490","DOI":"10.1002\/prot.23215","article-title":"The dual role of fragments in fragment-assembly methods for de novo protein structure prediction","volume":"80","author":"Handl","year":"2012","journal-title":"Proteins"},{"key":"2022051813453545600_ref33","doi-asserted-by":"crossref","first-page":"951","DOI":"10.1093\/bioinformatics\/bti125","article-title":"Protein homology detection by HMM\u2013HMM comparison","volume":"21","author":"Soding","year":"2005","journal-title":"Bioinformatics"},{"key":"2022051813453545600_ref34","doi-asserted-by":"crossref","first-page":"404","DOI":"10.1093\/bioinformatics\/16.4.404","article-title":"The PSIPRED protein structure prediction server","volume":"16","author":"McGuffin","year":"2000","journal-title":"Bioinformatics"},{"key":"2022051813453545600_ref35","doi-asserted-by":"crossref","first-page":"4350","DOI":"10.1093\/bioinformatics\/btab484","article-title":"MMpred: a distance-assisted multimodal conformation sampling for de novo protein structure 
prediction","volume":"37","author":"Zhao","year":"2021","journal-title":"Bioinformatics"},{"key":"2022051813453545600_ref36","doi-asserted-by":"crossref","first-page":"15930","DOI":"10.1073\/pnas.1905068116","article-title":"Assembling multidomain protein structures through analogous global structural alignments","volume":"116","author":"Zhou","year":"2019","journal-title":"Proc Natl Acad Sci U S A"},{"key":"2022051813453545600_ref37","doi-asserted-by":"crossref","first-page":"2302","DOI":"10.1093\/nar\/gki524","article-title":"TM-align: a protein structure alignment algorithm based on the TM-score","volume":"33","author":"Zhang","year":"2005","journal-title":"Nucleic Acids Res"},{"key":"2022051813453545600_ref38","doi-asserted-by":"crossref","first-page":"297","DOI":"10.1016\/S0022-2836(02)00942-7","article-title":"Small libraries of protein fragments model native protein structures accurately","volume":"323","author":"Kolodny","year":"2002","journal-title":"J Mol Biol"},{"key":"2022051813453545600_ref39","doi-asserted-by":"crossref","first-page":"i182","DOI":"10.1093\/bioinformatics\/btn165","article-title":"Designing succinct structural alphabets","volume":"24","author":"Li","year":"2008","journal-title":"Bioinformatics"},{"key":"2022051813453545600_ref40","doi-asserted-by":"crossref","first-page":"1470","DOI":"10.1110\/ps.690101","article-title":"A normalized root-mean-square distance for comparing protein three-dimensional structures","volume":"10","author":"Carugo","year":"2001","journal-title":"Protein Sci"},{"key":"2022051813453545600_ref41","doi-asserted-by":"crossref","first-page":"452","DOI":"10.1016\/j.sbi.2011.05.002","article-title":"Protein design with fragment databases","volume":"21","author":"Verschueren","year":"2011","journal-title":"Curr Opin Struct Biol"},{"key":"2022051813453545600_ref42","doi-asserted-by":"crossref","first-page":"3898","DOI":"10.1016\/j.jmb.2020.04.013","article-title":"Identification and analysis of natural building blocks 
for evolution-guided fragment-based protein design","volume":"432","author":"Ferruz","year":"2020","journal-title":"J Mol Biol"},{"key":"2022051813453545600_ref43","doi-asserted-by":"crossref","first-page":"2565","DOI":"10.1002\/prot.24620","article-title":"Direct prediction of profiles of sequences compatible with a protein structure by neural networks with fragment-based local and energy-based nonlocal profiles","volume":"82","author":"Li","year":"2014","journal-title":"Proteins"}],"container-title":["Briefings in Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bib\/article-pdf\/23\/3\/bbac086\/43745829\/bbac086.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bib\/article-pdf\/23\/3\/bbac086\/43745829\/bbac086.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,11,17]],"date-time":"2023-11-17T16:06:13Z","timestamp":1700237173000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bib\/article\/doi\/10.1093\/bib\/bbac086\/6547572"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,3,14]]},"references-count":43,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2022,5,13]]}},"URL":"https:\/\/doi.org\/10.1093\/bib\/bbac086","relation":{"has-preprint":[{"id-type":"doi","id":"10.1101\/2022.01.03.474755","asserted-by":"object"}]},"ISSN":["1467-5463","1477-4054"],"issn-type":[{"value":"1467-5463","type":"print"},{"value":"1477-4054","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2022,5]]},"published":{"date-parts":[[2022,3,14]]},"article-number":"bbac086"}}