{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,8]],"date-time":"2026-02-08T05:25:01Z","timestamp":1770528301007,"version":"3.49.0"},"reference-count":56,"publisher":"Oxford University Press (OUP)","issue":"12","license":[{"start":{"date-parts":[[2016,10,2]],"date-time":"2016-10-02T00:00:00Z","timestamp":1475366400000},"content-version":"vor","delay-in-days":1576,"URL":"http:\/\/creativecommons.org\/licenses\/by-nc\/3.0"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2012,6,15]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>Motivation: Alignment errors are still the main bottleneck for current template-based protein modeling (TM) methods, including protein threading and homology modeling, especially when the sequence identity between two proteins under consideration is low (&lt;30%).<\/jats:p>\n               <jats:p>Results: We present a novel protein threading method, CNFpred, which achieves much more accurate sequence\u2013template alignment by employing a probabilistic graphical model called a Conditional Neural Field (CNF), which aligns one protein sequence to its remote template using a non-linear scoring function. This scoring function accounts for correlation among a variety of protein sequence and structure features, makes use of information in the neighborhood of two residues to be aligned, and is thus much more sensitive than the widely used linear or profile-based scoring function. To train this CNF threading model, we employ a novel quality-sensitive method, instead of the standard maximum-likelihood method, to maximize directly the expected quality of the training set. Experimental results show that CNFpred generates significantly better alignments than the best profile-based and threading methods on several public (but small) benchmarks as well as our own large dataset. 
CNFpred outperforms others regardless of the lengths or classes of proteins, and works particularly well for proteins with sparse sequence profiles due to the effective utilization of structure information. Our methodology can also be adapted to protein sequence alignment.<\/jats:p>\n               <jats:p>Contact: \u00a0j3xu@ttic.edu<\/jats:p>\n               <jats:p>Supplementary information: \u00a0Supplementary data are available at Bioinformatics online.<\/jats:p>","DOI":"10.1093\/bioinformatics\/bts213","type":"journal-article","created":{"date-parts":[[2012,6,11]],"date-time":"2012-06-11T14:09:18Z","timestamp":1339423758000},"page":"i59-i66","source":"Crossref","is-referenced-by-count":81,"title":["A conditional neural fields model for protein threading"],"prefix":"10.1093","volume":"28","author":[{"given":"Jianzhu","family":"Ma","sequence":"first","affiliation":[{"name":"Toyota Technological Institute at Chicago IL 60637, USA"}]},{"given":"Jian","family":"Peng","sequence":"additional","affiliation":[{"name":"Toyota Technological Institute at Chicago IL 60637, USA"}]},{"given":"Sheng","family":"Wang","sequence":"additional","affiliation":[{"name":"Toyota Technological Institute at Chicago IL 60637, USA"}]},{"given":"Jinbo","family":"Xu","sequence":"additional","affiliation":[{"name":"Toyota Technological Institute at Chicago IL 60637, USA"}]}],"member":"286","published-online":{"date-parts":[[2012,6,9]]},"reference":[{"key":"2023012512392433300_B1","first-page":"514","article-title":"Hardness results on local multiple alignment of biological sequences","volume":"2","author":"Akutsu","year":"2007","journal-title":"Inform. 
Media Technol."},{"key":"2023012512392433300_B2","doi-asserted-by":"crossref","first-page":"3389","DOI":"10.1093\/nar\/25.17.3389","article-title":"Gapped BLAST and PSI-BLAST: a new generation of protein database search programs","volume":"25","author":"Altschul","year":"1997","journal-title":"Nucleic Acids Res."},{"key":"2023012512392433300_B3","doi-asserted-by":"crossref","first-page":"D154","DOI":"10.1093\/nar\/gki070","article-title":"The universal protein resource (UniProt)","volume":"33","author":"Bairoch","year":"2005","journal-title":"Nucleic Acids Res."},{"key":"2023012512392433300_B4","doi-asserted-by":"crossref","first-page":"D138","DOI":"10.1093\/nar\/gkh121","article-title":"The Pfam protein families database","volume":"32","author":"Bateman","year":"2004","journal-title":"Nucleic Acids Res."},{"key":"2023012512392433300_B5","doi-asserted-by":"crossref","first-page":"807","DOI":"10.1093\/bioinformatics\/btn039","article-title":"De novo identification of highly diverged protein repeats by probabilistic consistency","volume":"24","author":"Biegert","year":"2008","journal-title":"Bioinformatics"},{"key":"2023012512392433300_B6","doi-asserted-by":"crossref","first-page":"3770","DOI":"10.1073\/pnas.0810767106","article-title":"Sequence context-specific profiles for homology searching","volume":"106","author":"Biegert","year":"2009","journal-title":"Proc. Natl Acad. Sci. USA"},{"key":"2023012512392433300_B7","doi-asserted-by":"crossref","first-page":"151","DOI":"10.1002\/prot.20284","article-title":"Relationship between multiple sequence alignments and quality of protein comparative models","volume":"58","author":"Cozzetto","year":"2005","journal-title":"Prot. Struct. Funct. 
Bioinformatics"},{"key":"2023012512392433300_B8","first-page":"703","volume-title":"Prob Cons: Probabilistic Consistency-Based Multiple Alignment of Amino Acid Sequences.","author":"Do","year":"2004"},{"key":"2023012512392433300_B9","doi-asserted-by":"crossref","first-page":"717","DOI":"10.1142\/S0219720007002734","article-title":"Incorporating homologues into sequence embeddings for protein analysis","volume":"5","author":"Eskin","year":"2007","journal-title":"J. Bioinformatics Comput. Biol."},{"key":"2023012512392433300_B10","doi-asserted-by":"crossref","first-page":"1443","DOI":"10.1126\/science.1604319","article-title":"Exhaustive matching of the entire protein sequence database","volume":"256","author":"Gonnet","year":"1992","journal-title":"Science"},{"key":"2023012512392433300_B11","volume-title":"Neural Networks: A Comprehensive Foundation.","author":"Haykin","year":"1999"},{"key":"2023012512392433300_B12","doi-asserted-by":"crossref","first-page":"10915","DOI":"10.1073\/pnas.89.22.10915","article-title":"Amino acid substitution matrices from protein blocks","volume":"89","author":"Henikoff","year":"1992","journal-title":"Proc. Natl Acad. Sci. USA"},{"key":"2023012512392433300_B13","doi-asserted-by":"crossref","first-page":"128","DOI":"10.1002\/prot.22499","article-title":"Fast and accurate automatic structure prediction with HHpred","volume":"77","author":"Hildebrand","year":"2009","journal-title":"Prot. Struct. Funct. Bioinformatics"},{"key":"2023012512392433300_B14","doi-asserted-by":"crossref","first-page":"123","DOI":"10.1006\/jmbi.1993.1489","article-title":"Protein structure comparison by alignment of distance matrices","volume":"233","author":"Holm","year":"1993","journal-title":"J. Mol. 
Biol."},{"key":"2023012512392433300_B15","first-page":"93","article-title":"Clustering of database sequences for fast homology search using upper bounds on alignment score","volume":"15","author":"Itoh","year":"2004","journal-title":"Genome Inform."},{"key":"2023012512392433300_B16","doi-asserted-by":"crossref","first-page":"W284","DOI":"10.1093\/nar\/gki418","article-title":"FFAS03: a server for profile\u2013profile sequence alignments","volume":"33","author":"Jaroszewski","year":"2005","journal-title":"Nucleic Acids Res."},{"key":"2023012512392433300_B17","doi-asserted-by":"crossref","first-page":"797","DOI":"10.1006\/jmbi.1999.2583","article-title":"GenTHREADER: an efficient and reliable protein fold recognition method for genomic sequences1","volume":"287","author":"Jones","year":"1999","journal-title":"J. Mol. Biol."},{"key":"2023012512392433300_B18","doi-asserted-by":"crossref","first-page":"2577","DOI":"10.1002\/bip.360221211","article-title":"Dictionary of protein secondary structure: pattern recognition of hydrogen-bonded and geometrical features","volume":"22","author":"Kabsch","year":"1983","journal-title":"Biopolymers"},{"key":"2023012512392433300_B19","doi-asserted-by":"crossref","first-page":"846","DOI":"10.1093\/bioinformatics\/14.10.846","article-title":"Hidden Markov models for detecting remote protein homologies","volume":"14","author":"Karplus","year":"1998","journal-title":"Bioinformatics"},{"key":"2023012512392433300_B20","doi-asserted-by":"crossref","first-page":"501","DOI":"10.1006\/jmbi.2000.3741","article-title":"Enhanced genome annotation using structural profiles in the program 3D-PSSM1","volume":"299","author":"Kelley","year":"2000","journal-title":"J. Mol. 
Biol."},{"key":"2023012512392433300_B21","doi-asserted-by":"crossref","first-page":"1602","DOI":"10.1093\/bioinformatics\/btp265","article-title":"Augmented training of hidden Markov models to recognize remote homologs via simulated evolution","volume":"25","author":"Kumar","year":"2009","journal-title":"Bioinformatics"},{"key":"2023012512392433300_B22","doi-asserted-by":"crossref","first-page":"745","DOI":"10.1093\/protein\/13.11.745","article-title":"ProSup: a refined tool for protein structure alignment","volume":"13","author":"Lackner","year":"2000","journal-title":"Prot. Engineer."},{"key":"2023012512392433300_B23","first-page":"282","volume-title":"Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data.","author":"Lafferty","year":"2001"},{"key":"2023012512392433300_B24","doi-asserted-by":"crossref","first-page":"503","DOI":"10.1007\/BF01589116","article-title":"On the limited memory BFGS method for large scale optimization","volume":"45","author":"Liu","year":"1989","journal-title":"Math. Program."},{"key":"2023012512392433300_B25","author":"Marcin","year":"2011","journal-title":"In-silico prediction of disorder content using hybrid sequence representation."},{"key":"2023012512392433300_B26","doi-asserted-by":"crossref","first-page":"1071","DOI":"10.1110\/ps.03379804","article-title":"Alignment of protein sequences by their profiles","volume":"13","author":"Marti Renom","year":"2004","journal-title":"Protein Sci."},{"key":"2023012512392433300_B27","doi-asserted-by":"crossref","first-page":"404","DOI":"10.1093\/bioinformatics\/16.4.404","article-title":"The PSIPRED protein structure prediction server","volume":"16","author":"McGuffin","year":"2000","journal-title":"Bioinformatics"},{"key":"2023012512392433300_B28","doi-asserted-by":"crossref","first-page":"677","DOI":"10.1089\/cmb.2010.0328","article-title":"Sequence alignment as hypothesis testing","volume":"18","author":"Meng","year":"2011","journal-title":"J. Comput. 
Biol."},{"key":"2023012512392433300_B29","doi-asserted-by":"crossref","first-page":"e10","DOI":"10.1371\/journal.pcbi.0040010","article-title":"Matt: local flexibility aids protein multiple structure alignment","volume":"4","author":"Menke","year":"2008","journal-title":"PLoS Comput. Biol."},{"key":"2023012512392433300_B30","author":"Mott","year":"2005","journal-title":"Smith\u2013Waterman Algorithm."},{"key":"2023012512392433300_B31","first-page":"1009","article-title":"Discrete profile alignment via constrained information bottleneck","volume":"17","author":"O'Rourke","year":"2005","journal-title":"Adv. Neural Inform. Processing Sys."},{"key":"2023012512392433300_B32","doi-asserted-by":"crossref","first-page":"700","DOI":"10.1093\/bioinformatics\/17.8.700","article-title":"AL2CO: calculation of positional conservation in a protein sequence alignment","volume":"17","author":"Pei","year":"2001","journal-title":"Bioinformatics"},{"key":"2023012512392433300_B33","first-page":"1419","article-title":"Conditional neural fields","volume":"22","author":"Peng","year":"2009","journal-title":"Adv. Neural Informat. Process. Syst."},{"key":"2023012512392433300_B34","first-page":"31","article-title":"Boosting Protein Threading Accuracy","volume-title":"Proceedings of the 13th Annual International Conference on Research in Computational Molecular Biology","author":"Peng","year":"2009"},{"key":"2023012512392433300_B35","doi-asserted-by":"crossref","first-page":"545","DOI":"10.1093\/protein\/13.8.545","article-title":"Structure-derived substitution matrices for alignment of distantly related sequences","volume":"13","author":"Prli","year":"2000","journal-title":"Prot. 
Engineer."},{"key":"2023012512392433300_B36","doi-asserted-by":"crossref","first-page":"881","DOI":"10.1002\/prot.20854","article-title":"SSALN: an alignment algorithm using structure dependent substitution matrices and gap penalties learned from structurally aligned protein pairs","volume":"62","author":"Qiu","year":"2006","journal-title":"Prot. Struct. Funct. Bioinformatics"},{"key":"2023012512392433300_B37","doi-asserted-by":"crossref","first-page":"W244","DOI":"10.1093\/nar\/gki408","article-title":"The HHpred interactive server for protein homology detection and structure prediction","volume":"33","author":"S\u00f6ding","year":"2005","journal-title":"Nucleic Acids Res."},{"key":"2023012512392433300_B38","doi-asserted-by":"crossref","first-page":"318","DOI":"10.1002\/prot.340230306","article-title":"Evaluation of comparative protein modeling by MODELLER","volume":"23","author":"\u0160ali","year":"1995","journal-title":"Prot. Struct. Funct. Bioinformatics"},{"key":"2023012512392433300_B39","first-page":"350","article-title":"Pair HMM based gap statistics for re-evaluation of indels in alignments with affine gap penalties","author":"Sch\u00f6nhuth","year":"2010","journal-title":"Proceedings of the WABI2010"},{"key":"2023012512392433300_B40","doi-asserted-by":"crossref","first-page":"243","DOI":"10.1006\/jmbi.2001.4762","article-title":"FUGUE: sequence-structure homology recognition using environment-specific substitution tables and structure-dependent gap penalties1","volume":"310","author":"Shi","year":"2001","journal-title":"J. Mol. 
Biol."},{"key":"2023012512392433300_B41","doi-asserted-by":"crossref","first-page":"364","DOI":"10.1186\/1471-2105-7-364","article-title":"Improving the quality of protein structure models by selecting from alignment alternatives","volume":"7","author":"Sommer","year":"2006","journal-title":"BMC Bioinformatics"},{"key":"2023012512392433300_B42","doi-asserted-by":"crossref","first-page":"587","DOI":"10.1002\/prot.21020","article-title":"Statistical potential based amino acid similarity matrices for aligning distantly related protein sequences","volume":"64","author":"Tan","year":"2006","journal-title":"Prot. Struct. Funct. Bioinformatics"},{"key":"2023012512392433300_B43","doi-asserted-by":"crossref","first-page":"260","DOI":"10.1109\/TIT.1967.1054010","article-title":"Error bounds for convolutional codes and an asymptotically optimum decoding algorithm","volume":"13","author":"Viterbi","year":"1967","journal-title":"Inform. Theory IEEE Transact."},{"key":"2023012512392433300_B44","doi-asserted-by":"crossref","first-page":"1089","DOI":"10.1145\/1553374.1553513","article-title":"BoltzRank: Learning to Maximize Expected Ranking Gain","volume-title":"Proceedings of the 26th Annual International Conference on Machine Learning","author":"Volkovs","year":"2009"},{"key":"2023012512392433300_B45","first-page":"339","article-title":"Simultaneous alignment and folding of protein sequences","volume-title":"Proceedings of the 13th Annual International Conference on Research in Computational Molecular Biology","author":"Waldisp\u00fchl","year":"2009"},{"key":"2023012512392433300_B46","doi-asserted-by":"crossref","first-page":"1589","DOI":"10.1093\/bioinformatics\/btg224","article-title":"PISCES: a protein sequence culling server","volume":"19","author":"Wang","year":"2003","journal-title":"Bioinformatics"},{"key":"2023012512392433300_B47","first-page":"109","article-title":"Protein 8-class secondary structure prediction using Conditional Neural 
Fields","author":"Wang","year":"2010","journal-title":"IEEE"},{"key":"2023012512392433300_B48","doi-asserted-by":"crossref","first-page":"2138","DOI":"10.1093\/bioinformatics\/bth195","article-title":"The DISOPRED server for the prediction of protein disorder","volume":"20","author":"Ward","year":"2004","journal-title":"Bioinformatics"},{"key":"2023012512392433300_B49","doi-asserted-by":"crossref","first-page":"547","DOI":"10.1002\/prot.21945","article-title":"MUSTER: improving protein sequence profile\u2013profile alignments by using multiple sources of structure information","volume":"72","author":"Wu","year":"2008","journal-title":"Prot. Struct. Funct. Bioinformatics"},{"key":"2023012512392433300_B50","doi-asserted-by":"crossref","first-page":"157","DOI":"10.1109\/TCBB.2005.24","article-title":"Fold recognition by predicted alignment accuracy","volume":"2","author":"Xu","year":"2005","journal-title":"IEEE\/ACM Trans. Computat. Biol. Bioinformatics"},{"key":"2023012512392433300_B51","doi-asserted-by":"crossref","first-page":"95","DOI":"10.1142\/S0219720003000186","article-title":"RAPTOR: optimal protein threading by linear programming","volume":"1","author":"Xu","year":"2003","journal-title":"Int. J. Bioinform. Comput. Biol."},{"key":"2023012512392433300_B52","doi-asserted-by":"crossref","first-page":"702","DOI":"10.1002\/prot.20264","article-title":"Scoring function for automated assessment of protein structure template quality","volume":"57","author":"Zhang","year":"2004","journal-title":"Prot. Struct. Funct. 
Bioinformatics"},{"key":"2023012512392433300_B53","doi-asserted-by":"crossref","first-page":"2302","DOI":"10.1093\/nar\/gki524","article-title":"TM-align: a protein structure alignment algorithm based on the TM-score","volume":"33","author":"Zhang","year":"2005","journal-title":"Nucleic Acids Res."},{"key":"2023012512392433300_B54","doi-asserted-by":"crossref","first-page":"e2325","DOI":"10.1371\/journal.pone.0002325","article-title":"SP5: improving protein fold recognition by using torsion angle profiles and profile-based gap penalty model","volume":"3","author":"Zhang","year":"2008","journal-title":"PLoS One"},{"key":"2023012512392433300_B55","doi-asserted-by":"crossref","first-page":"i310","DOI":"10.1093\/bioinformatics\/btq193","article-title":"Fragment-free approach to protein folding using conditional neural fields","volume":"26","author":"Zhao","year":"2010","journal-title":"Bioinformatics"},{"key":"2023012512392433300_B56","doi-asserted-by":"crossref","first-page":"152","DOI":"10.1002\/prot.20732","article-title":"SPARKS 2 and SP3 servers in CASP6","volume":"61","author":"Zhou","year":"2005","journal-title":"Prot. Struct. Funct. 
Bioinformatics"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/28\/12\/i59\/48883601\/bioinformatics_28_12_i59.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/28\/12\/i59\/48883601\/bioinformatics_28_12_i59.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,1,25]],"date-time":"2023-01-25T16:42:08Z","timestamp":1674664928000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/28\/12\/i59\/268198"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2012,6,9]]},"references-count":56,"journal-issue":{"issue":"12","published-print":{"date-parts":[[2012,6,15]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/bts213","relation":{},"ISSN":["1367-4811","1367-4803"],"issn-type":[{"value":"1367-4811","type":"electronic"},{"value":"1367-4803","type":"print"}],"subject":[],"published-other":{"date-parts":[[2012,6,15]]},"published":{"date-parts":[[2012,6,9]]}}}