{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,31]],"date-time":"2026-03-31T10:32:05Z","timestamp":1774953125353,"version":"3.50.1"},"reference-count":73,"publisher":"Oxford University Press (OUP)","issue":"4","license":[{"start":{"date-parts":[[2023,6,15]],"date-time":"2023-06-15T00:00:00Z","timestamp":1686787200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/academic.oup.com\/pages\/standard-publication-reuse-rights"}],"funder":[{"DOI":"10.13039\/501100012166","name":"National Key Research and Development Program of China","doi-asserted-by":"publisher","award":["2020YFA0908400"],"award-info":[{"award-number":["2020YFA0908400"]}],"id":[{"id":"10.13039\/501100012166","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["NSFC 62172296"],"award-info":[{"award-number":["NSFC 62172296"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]},{"name":"Excellent Young Scientists Fund in Hunan Province","award":["2022JJ20077"],"award-info":[{"award-number":["2022JJ20077"]}]},{"name":"Scientific Research Fund of Hunan Provincial Education Department","award":["22A0007"],"award-info":[{"award-number":["22A0007"]}]},{"name":"Zhejiang Lab Open Research Project","award":["K2022PE0AB07"],"award-info":[{"award-number":["K2022PE0AB07"]}]},{"name":"Shenzhen Science and Technology Program","award":["KQTD20200820113106007"],"award-info":[{"award-number":["KQTD20200820113106007"]}]},{"name":"High Performance Computing Center of Central South University"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2023,7,20]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:sec>\n                  <jats:title>\u00a0<\/jats:title>\n                  <jats:p>In recent years, protein 
structure problems have become a hotspot for understanding protein folding and function mechanisms. It has been observed that most of the protein structure works rely on and benefit from co-evolutionary information obtained by multiple sequence alignment (MSA). As an example, AlphaFold2 (AF2) is a typical MSA-based protein structure tool which is famous for its high accuracy. As a consequence, these MSA-based methods are limited by the quality of the MSAs. Especially for orphan proteins that have no homologous sequence, AlphaFold2 performs unsatisfactorily as MSA depth decreases, which may pose a barrier to its widespread application in protein mutation and design problems in which there are no rich homologous sequences and rapid prediction is needed. In this paper, we constructed two standard datasets for orphan and de novo proteins which have insufficient\/none homology information, called Orphan62 and Design204, respectively, to fairly evaluate the performance of the various methods in this case. Then, depending on whether or not utilizing scarce MSA information, we summarized two approaches, MSA-enhanced and MSA-free methods, to effectively solve the issue without sufficient MSAs. MSA-enhanced model aims to improve poor MSA quality from the data source by knowledge distillation and generation models. MSA-free model directly learns the relationship between residues on enormous protein sequences from pre-trained models, bypassing the step of extracting the residue pair representation from MSA. Next, we evaluated the performance of four MSA-free methods (trRosettaX-Single, TRFold, ESMFold and ProtT5) and MSA-enhanced (Bagging MSA) method compared with a traditional MSA-based method AlphaFold2, in two protein structure-related prediction tasks, respectively. Comparison analyses show that trRosettaX-Single and ESMFold which belong to MSA-free method can achieve fast prediction ($\\sim\\! 
40$s) and comparable performance compared with AF2 in tertiary structure prediction, especially for short peptides, $\\alpha $-helical segments and targets with few homologous sequences. Bagging MSA utilizing MSA enhancement improves the accuracy of our trained base model which is an MSA-based method when poor homology information exists in secondary structure prediction. Our study provides biologists an insight of how to select rapid and appropriate prediction tools for enzyme engineering and peptide drug development.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Contact<\/jats:title>\n                  <jats:p>guofei@csu.edu.cn, jj.tang@siat.ac.cn<\/jats:p>\n               <\/jats:sec>","DOI":"10.1093\/bib\/bbad217","type":"journal-article","created":{"date-parts":[[2023,6,16]],"date-time":"2023-06-16T02:11:26Z","timestamp":1686881486000},"source":"Crossref","is-referenced-by-count":33,"title":["Improved structure-related prediction for insufficient homologous proteins using MSA enhancement and pre-trained language model"],"prefix":"10.1093","volume":"24","author":[{"given":"Qiaozhen","family":"Meng","sequence":"first","affiliation":[{"name":"School of Computer Science and Technology, College of Intelligence and Computing, Tianjin University , Tianjin , China"}]},{"given":"Fei","family":"Guo","sequence":"additional","affiliation":[{"name":"School of Computer Science and Engineering, Central South University , Changsha 410083 , China"}]},{"given":"Jijun","family":"Tang","sequence":"additional","affiliation":[{"name":"Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences , Shenzhen 518000 , China"}]}],"member":"286","published-online":{"date-parts":[[2023,6,15]]},"reference":[{"issue":"2","key":"2023072020155396600_ref1","doi-asserted-by":"crossref","first-page":"161","DOI":"10.1016\/S0959-440X(02)00304-4","article-title":"Protein folding: the free energy 
surface","volume":"12","author":"Gruebele","year":"2002","journal-title":"Curr Opin Struct Biol"},{"issue":"W1","key":"2023072020155396600_ref2","doi-asserted-by":"crossref","first-page":"W174","DOI":"10.1093\/nar\/gkv342","article-title":"I-TASSER server: new development for protein structure and function predictions","volume":"43","author":"Yang","year":"2015","journal-title":"Nucleic Acids Res"},{"issue":"2","key":"2023072020155396600_ref3","doi-asserted-by":"crossref","first-page":"247","DOI":"10.1016\/j.jmgm.2005.12.005","article-title":"Automatic atom type and bond type perception in molecular mechanical calculations","volume":"25","author":"Wang","year":"2006","journal-title":"J Mol Graph Model"},{"issue":"3","key":"2023072020155396600_ref4","doi-asserted-by":"crossref","first-page":"435","DOI":"10.1021\/ct700301q","article-title":"GROMACS 4: algorithms for highly efficient, load-balanced, and scalable molecular simulation","volume":"4","author":"Hess","year":"2008","journal-title":"J Chem Theory Comput"},{"issue":"6","key":"2023072020155396600_ref5","doi-asserted-by":"crossref","first-page":"3031","DOI":"10.1021\/acs.jctc.7b00125","article-title":"The Rosetta all-atom energy function for macromolecular modeling and design","volume":"13","author":"Alford","year":"2017","journal-title":"J Chem Theory Comput"},{"issue":"7","key":"2023072020155396600_ref6","doi-asserted-by":"crossref","first-page":"1715","DOI":"10.1002\/prot.24065","article-title":"Ab initio protein structure assembly using continuous structure fragments and optimized knowledge-based force field","volume":"80","author":"Xu","year":"2012","journal-title":"Proteins"},{"key":"2023072020155396600_ref7","volume-title":"Proceedings of the IEEE conference on computer vision and pattern recognition"},{"key":"2023072020155396600_ref8","article-title":"Attention is all you need","volume":"30","author":"Vaswani","year":"2017","journal-title":"Adv Neural Inf Process 
Syst"},{"issue":"1","key":"2023072020155396600_ref9","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1186\/1471-2105-12-206","article-title":"New methods to measure residues coevolution in proteins","volume":"12","author":"Gao","year":"2011","journal-title":"BMC Bioinformatics."},{"issue":"1","key":"2023072020155396600_ref10","doi-asserted-by":"crossref","first-page":"e1005324","DOI":"10.1371\/journal.pcbi.1005324","article-title":"Accurate de novo prediction of protein contact map by ultra-deep learning model","volume":"13","author":"Wang","year":"2017","journal-title":"PLoS Comput Biol"},{"issue":"12","key":"2023072020155396600_ref11","doi-asserted-by":"crossref","first-page":"1069","DOI":"10.1002\/prot.25810","article-title":"Analysis of distance-based protein structure prediction by deep learning in CASP13","volume":"87","author":"Xu","year":"2019","journal-title":"Proteins"},{"issue":"7792","key":"2023072020155396600_ref12","doi-asserted-by":"crossref","first-page":"706","DOI":"10.1038\/s41586-019-1923-7","article-title":"Improved protein structure prediction using potentials from deep learning","volume":"577","author":"Senior","year":"2020","journal-title":"Nature"},{"issue":"3","key":"2023072020155396600_ref13","doi-asserted-by":"crossref","first-page":"1496","DOI":"10.1073\/pnas.1914677117","article-title":"Improved protein structure prediction using predicted interresidue orientations","volume":"117","author":"Yang","year":"2020","journal-title":"Proc Natl Acad Sci"},{"issue":"7873","key":"2023072020155396600_ref14","doi-asserted-by":"crossref","first-page":"583","DOI":"10.1038\/s41586-021-03819-2","article-title":"Highly accurate protein structure prediction with AlphaFold","volume":"596","author":"Jumper","year":"2021","journal-title":"Nature"},{"issue":"6322","key":"2023072020155396600_ref15","doi-asserted-by":"crossref","first-page":"294","DOI":"10.1126\/science.aah4043","article-title":"Protein structure determination using metagenome sequence 
data","volume":"355","author":"Ovchinnikov","year":"2017","journal-title":"Science"},{"issue":"4","key":"2023072020155396600_ref16","doi-asserted-by":"crossref","first-page":"337","DOI":"10.1089\/cmb.1994.1.337","article-title":"On the complexity of multiple sequence alignment","volume":"1","author":"Wang","year":"1994","journal-title":"J Comput Biol"},{"issue":"8","key":"2023072020155396600_ref17","doi-asserted-by":"crossref","first-page":"1169","DOI":"10.1016\/j.str.2022.05.001","article-title":"Protein language-model embeddings for fast, accurate, and alignment-free protein structure prediction","volume":"30","author":"Wei\u00dfenow","year":"2022","journal-title":"Structure"},{"issue":"7","key":"2023072020155396600_ref18","doi-asserted-by":"crossref","first-page":"2105","DOI":"10.1093\/bioinformatics\/btz863","article-title":"DeepMSA: constructing deep multiple sequence alignment to improve contact prediction and fold-recognition for distant-homology proteins","volume":"36","author":"Zhang","year":"2020","journal-title":"Bioinformatics"},{"issue":"52","key":"2023072020155396600_ref19","doi-asserted-by":"crossref","first-page":"15898","DOI":"10.1073\/pnas.1508380112","article-title":"Unexpected features of the dark proteome","volume":"112","author":"Perdig\u00e3o","year":"2015","journal-title":"Proc Natl Acad Sci"},{"issue":"D1","key":"2023072020155396600_ref20","doi-asserted-by":"crossref","first-page":"D480","DOI":"10.1093\/nar\/gkaa1100","article-title":"UniProt: the universal protein knowledgebase in 2021","volume":"49","year":"2021","journal-title":"Nucleic Acids Res"},{"issue":"D1","key":"2023072020155396600_ref21","doi-asserted-by":"crossref","first-page":"D523","DOI":"10.1093\/nar\/gkac1052","article-title":"UniProt: the universal protein knowledgebase in 2023","volume":"51","year":"2023","journal-title":"Nucleic Acids 
Res"},{"issue":"6","key":"2023072020155396600_ref22","doi-asserted-by":"crossref","first-page":"926","DOI":"10.1093\/bioinformatics\/btu739","article-title":"UniRef clusters: a comprehensive and scalable alternative for improving sequence similarity searches","volume":"31","author":"Suzek","year":"2015","journal-title":"Bioinformatics"},{"issue":"D1","key":"2023072020155396600_ref23","first-page":"D570","article-title":"MGnify: the microbiome analysis resource in 2020","volume":"48","author":"Mitchell","year":"2020","journal-title":"Nucleic Acids Res"},{"issue":"11","key":"2023072020155396600_ref24","doi-asserted-by":"crossref","first-page":"1026","DOI":"10.1038\/nbt.3988","article-title":"MMseqs2 enables sensitive protein sequence searching for the analysis of massive data sets","volume":"35","author":"Steinegger","year":"2017","journal-title":"Nat Biotechnol"},{"issue":"10","key":"2023072020155396600_ref25","doi-asserted-by":"crossref","first-page":"692","DOI":"10.1038\/nrg3053","article-title":"The evolutionary origin of orphan genes","volume":"12","author":"Tautz","year":"2011","journal-title":"Nat Rev Genet"},{"issue":"13","key":"2023072020155396600_ref26","doi-asserted-by":"crossref","first-page":"4355","DOI":"10.1073\/pnas.84.13.4355","article-title":"Profile analysis: detection of distantly related proteins","volume":"84","author":"Gribskov","year":"1987","journal-title":"Proc Natl Acad Sci"},{"issue":"9","key":"2023072020155396600_ref27","doi-asserted-by":"crossref","first-page":"755","DOI":"10.1093\/bioinformatics\/14.9.755","article-title":"Profile hidden Markov models","volume":"14","author":"Eddy","year":"1998","journal-title":"Bioinformatics"},{"issue":"7","key":"2023072020155396600_ref28","doi-asserted-by":"crossref","first-page":"1336","DOI":"10.1039\/C7MB00188F","article-title":"Predicting protein\u2013protein interactions from protein sequences by a stacked sparse autoencoder deep neural 
network","volume":"13","author":"Wang","year":"2017","journal-title":"Mol Biosyst"},{"issue":"1","key":"2023072020155396600_ref29","first-page":"463","article-title":"Identification of DNA-binding proteins using support vector machines and evolutionary profiles","volume":"8","author":"Manish","year":"2007","journal-title":"J Eur Psychol Stud"},{"key":"2023072020155396600_ref30","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1186\/1752-0509-4-S2-S1","article-title":"BindN+ for accurate prediction of DNA and RNA-binding residues from protein sequence features","volume":"4","author":"Wang","year":"2010","journal-title":"BMC Syst Biol"},{"key":"2023072020155396600_ref31","doi-asserted-by":"crossref","first-page":"33","DOI":"10.1186\/1471-2105-6-33","article-title":"PSSM-based prediction of DNA binding sites in proteins","volume":"6","author":"Ahmad","year":"2005","journal-title":"BMC Bioinformatics"},{"issue":"2","key":"2023072020155396600_ref32","doi-asserted-by":"crossref","first-page":"195","DOI":"10.1006\/jmbi.1999.3091","article-title":"Protein secondary structure prediction based on position-specific scoring matrices","volume":"292","author":"Jones","year":"1999","journal-title":"J Mol Biol"},{"key":"2023072020155396600_ref33","doi-asserted-by":"crossref","first-page":"107456","DOI":"10.1016\/j.compbiolchem.2021.107456","article-title":"Structural protein fold recognition based on secondary structure and evolutionary information using machine learning algorithms","volume":"91","author":"Qin","year":"2021","journal-title":"Comput Biol Chem"},{"issue":"2","key":"2023072020155396600_ref34","doi-asserted-by":"crossref","first-page":"90","DOI":"10.2174\/1574893614666191017104639","article-title":"Protein secondary structure prediction: a review of progress and directions","volume":"15","author":"Smolarczyk","year":"2020","journal-title":"Current 
Bioinformatics"},{"issue":"7","key":"2023072020155396600_ref35","doi-asserted-by":"crossref","first-page":"e0255076","DOI":"10.1371\/journal.pone.0255076","article-title":"A secondary structure-based position-specific scoring matrix applied to the improvement in protein secondary structure prediction","volume":"16","author":"Chen","year":"2021","journal-title":"PloS One"},{"issue":"7889","key":"2023072020155396600_ref36","doi-asserted-by":"crossref","first-page":"547","DOI":"10.1038\/s41586-021-04184-w","article-title":"De novo protein design by deep network hallucination","volume":"600","author":"Anishchenko","year":"2021","journal-title":"Nature"},{"key":"2023072020155396600_ref37","first-page":"2020","article-title":"Protein sequence design by explicit energy landscape optimization","author":"Norn","year":"2020","journal-title":"BioRxiv"},{"issue":"4","key":"2023072020155396600_ref38","doi-asserted-by":"crossref","first-page":"292","DOI":"10.1016\/j.cels.2019.03.006","article-title":"End-to-end differentiable learning of protein structure","volume":"8","author":"AlQuraishi","year":"2019","journal-title":"Cell Systems"},{"issue":"6","key":"2023072020155396600_ref39","doi-asserted-by":"crossref","first-page":"520","DOI":"10.1002\/prot.25674","article-title":"NetSurfP-2.0: improved prediction of protein structural features by integrated deep learning","volume":"87","author":"Klausen","year":"2019","journal-title":"Proteins"},{"issue":"2","key":"2023072020155396600_ref40","doi-asserted-by":"crossref","first-page":"173","DOI":"10.1038\/nmeth.1818","article-title":"HHblits: lightning-fast iterative protein sequence searching by HMM-HMM alignment","volume":"9","author":"Remmert","year":"2012","journal-title":"Nat Methods"},{"issue":"2","key":"2023072020155396600_ref41","doi-asserted-by":"crossref","first-page":"184","DOI":"10.1093\/bioinformatics\/btr638","article-title":"PSICOV: precise structural contact prediction using sparse inverse covariance estimation on large 
multiple sequence alignments","volume":"28","author":"Jones","year":"2012","journal-title":"Bioinformatics"},{"issue":"49","key":"2023072020155396600_ref42","doi-asserted-by":"crossref","first-page":"E1293","DOI":"10.1073\/pnas.1111471108","article-title":"Direct-coupling analysis of residue coevolution captures native contacts across many protein families","volume":"108","author":"Morcos","year":"2011","journal-title":"Proc Natl Acad Sci"},{"issue":"4","key":"2023072020155396600_ref43","doi-asserted-by":"crossref","first-page":"1061","DOI":"10.1002\/prot.22934","article-title":"Learning generative models for protein fold families","volume":"79","author":"Balakrishnan","year":"2011","journal-title":"Proteins"},{"issue":"21","key":"2023072020155396600_ref44","doi-asserted-by":"crossref","first-page":"3128","DOI":"10.1093\/bioinformatics\/btu500","article-title":"CCMpred\u2014fast and precise prediction of protein residue\u2013residue contacts from correlated mutations","volume":"30","author":"Seemayer","year":"2014","journal-title":"Bioinformatics"},{"issue":"3","key":"2023072020155396600_ref45","doi-asserted-by":"crossref","first-page":"e1008865","DOI":"10.1371\/journal.pcbi.1008865","article-title":"Deducing high-accuracy protein contact-maps from a triplet of coevolutionary matrices through deep residual convolutional networks","volume":"17","author":"Li","year":"2021","journal-title":"PLoS Comput Biol"},{"issue":"1","key":"2023072020155396600_ref46","doi-asserted-by":"crossref","first-page":"65","DOI":"10.1016\/j.cels.2017.11.014","article-title":"Enhancing evolutionary couplings with deep convolutional neural networks","volume":"6","author":"Liu","year":"2018","journal-title":"Cell Systems"},{"issue":"1","key":"2023072020155396600_ref47","doi-asserted-by":"crossref","first-page":"2535","DOI":"10.1038\/s41467-021-22869-8","article-title":"CopulaNet: learning residue co-evolution directly from multiple sequence alignment for protein structure 
prediction","volume":"12","author":"Ju","year":"2021","journal-title":"Nat Commun"},{"issue":"4","key":"2023072020155396600_ref48","doi-asserted-by":"crossref","first-page":"e2113348119","DOI":"10.1073\/pnas.2113348119","article-title":"Ultrafast end-to-end protein structure prediction enables high-throughput exploration of uncharacterized proteins","volume":"119","author":"Kandathil","year":"2022","journal-title":"Proc Natl Acad Sci"},{"key":"2023072020155396600_ref49","volume":"14","journal-title":"PloS one"},{"issue":"15","key":"2023072020155396600_ref50","doi-asserted-by":"crossref","first-page":"e2016239118","DOI":"10.1073\/pnas.2016239118","article-title":"Biological structure and function emerge from scaling unsupervised learning to 250 million protein sequences","volume":"118","author":"Rives","year":"2021","journal-title":"Proc Natl Acad Sci"},{"key":"2023072020155396600_ref51","volume-title":"Proceedings of the 38th International Conference on Machine Learning. vol. 139 of Proceedings of Machine Learning Research. 
PMLR"},{"issue":"14","key":"2023072020155396600_ref52","doi-asserted-by":"crossref","first-page":"3574","DOI":"10.1093\/bioinformatics\/btac351","article-title":"Prior knowledge facilitates low homologous protein secondary structure prediction with DSM distillation","volume":"38","author":"Wang","year":"2022","journal-title":"Bioinformatics"},{"key":"2023072020155396600_ref53","first-page":"4620","article-title":"Contact-Distil: Boosting Low Homologous Protein Contact Map Prediction by Self-Supervised Distillation","volume-title":"Proceedings of the AAAI Conference on Artificial Intelligence","author":"Wang","year":"2022"},{"key":"2023072020155396600_ref54","first-page":"617","article-title":"PSSM-distil: Protein secondary structure prediction (PSSP) on low-quality PSSM by knowledge distillation with contrastive learning","volume-title":"Proceedings of the AAAI Conference on Artificial Intelligence","author":"Wang","year":"2021"},{"key":"2023072020155396600_ref55","doi-asserted-by":"crossref","first-page":"88","DOI":"10.1007\/978-3-030-45257-5_6","article-title":"Bagging msa learning: Enhancing low-quality pssm with deep learning for accurate protein structure property prediction","volume-title":"Research in Computational Molecular Biology: 24th Annual International Conference, RECOMB 2020","author":"Guo","year":"2020"},{"issue":"4","key":"2023072020155396600_ref56","doi-asserted-by":"crossref","first-page":"362","DOI":"10.1089\/cmb.2020.0417","article-title":"EPTool: a new enhancing PSSM tool for protein secondary structure prediction","volume":"28","author":"Guo","year":"2021","journal-title":"J Comput Biol"},{"issue":"11","key":"2023072020155396600_ref57","doi-asserted-by":"crossref","first-page":"1617","DOI":"10.1038\/s41587-022-01432-w","article-title":"Single-sequence protein structure prediction using a language model and deep learning","volume":"40","author":"Chowdhury","year":"2022","journal-title":"Nat 
Biotechnol"},{"key":"2023072020155396600_ref58","volume":"379","journal-title":"Science"},{"key":"2023072020155396600_ref59","doi-asserted-by":"crossref","DOI":"10.21203\/rs.3.rs-1969991\/v1","article-title":"Helixfold-single: Msa-free protein structure prediction by using protein language model as an alternative","author":"Fang","year":"2022"},{"key":"2023072020155396600_ref60","first-page":"2022","article-title":"High-resolution de novo structure prediction from primary sequence","author":"Wu","year":"2022","journal-title":"BioRxiv"},{"issue":"12","key":"2023072020155396600_ref61","doi-asserted-by":"crossref","first-page":"804","DOI":"10.1038\/s43588-022-00373-3","article-title":"Single-sequence protein structure prediction using supervised transformer protein language models","volume":"2","author":"Wang","year":"2022","journal-title":"Nature Computational Science"},{"key":"2023072020155396600_ref62","first-page":"2022","article-title":"tFold-ab:fast and accurate antibody structure prediction without sequence homologs","author":"Wu","year":"2022","journal-title":"bioRxiv"},{"issue":"1","key":"2023072020155396600_ref63","doi-asserted-by":"crossref","first-page":"235","DOI":"10.1093\/nar\/28.1.235","article-title":"The protein data bank","volume":"28","author":"Berman","year":"2000","journal-title":"Nucleic Acids Res"},{"issue":"1","key":"2023072020155396600_ref64","doi-asserted-by":"crossref","first-page":"268","DOI":"10.1039\/C1MB05231D","article-title":"Attributes of short linear motifs","volume":"8","author":"Davey","year":"2012","journal-title":"Mol Biosyst"},{"issue":"2","key":"2023072020155396600_ref65","doi-asserted-by":"crossref","first-page":"161","DOI":"10.1016\/j.molcel.2014.05.032","article-title":"A million peptide motifs for the molecular biologist","volume":"55","author":"Tompa","year":"2014","journal-title":"Mol 
Cell"},{"key":"2023072020155396600_ref66","doi-asserted-by":"crossref","first-page":"e10034","DOI":"10.7554\/eLife.10034","article-title":"Structural determinants of nuclear export signal orientation in binding to exportin CRM1","volume":"4","author":"Fung","year":"2015","journal-title":"Elife"},{"issue":"1","key":"2023072020155396600_ref67","first-page":"1","article-title":"Protein secondary structure prediction using deep convolutional neural fields","volume":"6","author":"Wang","year":"2016","journal-title":"Sci Rep"},{"issue":"7","key":"2023072020155396600_ref68","doi-asserted-by":"crossref","first-page":"601","DOI":"10.1038\/s42256-021-00348-5","article-title":"Improved protein structure prediction by deep learning irrespective of co-evolution information","volume":"3","author":"Xu","year":"2021","journal-title":"Nature Machine Intelligence"},{"issue":"12","key":"2023072020155396600_ref69","doi-asserted-by":"crossref","first-page":"1589","DOI":"10.1093\/bioinformatics\/btg224","article-title":"PISCES: a protein sequence culling server","volume":"19","author":"Wang","year":"2003","journal-title":"Bioinformatics"},{"key":"2023072020155396600_ref70","volume-title":"International conference on machine learning. 
Journal Machine Learning Research"},{"key":"2023072020155396600_ref71","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1186\/1471-2105-11-431","article-title":"Hidden Markov model speed heuristic and iterative HMM search procedure","volume":"11","author":"Johnson","year":"2010","journal-title":"BMC Bioinformatics"},{"key":"2023072020155396600_ref72","volume-title":"International Conference on Learning Representations"},{"issue":"6628","key":"2023072020155396600_ref73","doi-asserted-by":"crossref","first-page":"195","DOI":"10.1126\/science.ade9434","article-title":"Combinatorial assembly and design of enzymes","volume":"379","author":"Lipsh-Sokolik","year":"2023","journal-title":"Science"}],"container-title":["Briefings in Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bib\/article-pdf\/24\/4\/bbad217\/50917116\/bbad217.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bib\/article-pdf\/24\/4\/bbad217\/50917116\/bbad217.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,7,20]],"date-time":"2023-07-20T20:18:07Z","timestamp":1689884287000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bib\/article\/doi\/10.1093\/bib\/bbad217\/7198547"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,6,15]]},"references-count":73,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2023,7,20]]}},"URL":"https:\/\/doi.org\/10.1093\/bib\/bbad217","relation":{},"ISSN":["1467-5463","1477-4054"],"issn-type":[{"value":"1467-5463","type":"print"},{"value":"1477-4054","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2023,7]]},"published":{"date-parts":[[2023,6,15]]},"article-number":"bbad217"}}