{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,2]],"date-time":"2026-04-02T16:25:36Z","timestamp":1775147136371,"version":"3.50.1"},"reference-count":44,"publisher":"Oxford University Press (OUP)","issue":"12","license":[{"start":{"date-parts":[[2016,10,28]],"date-time":"2016-10-28T00:00:00Z","timestamp":1477612800000},"content-version":"vor","delay-in-days":139,"URL":"http:\/\/creativecommons.org\/licenses\/by-nc\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2016,6,15]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>Motivation : Protein homology detection, a fundamental problem in computational biology, is an indispensable step toward predicting protein structures and understanding protein functions. Despite the advances in recent decades on sequence alignment, threading and alignment-free methods, protein homology detection remains a challenging open problem. Recently, network methods that try to find transitive paths in the protein structure space demonstrate the importance of incorporating network information of the structure space. Yet, current methods merge the sequence space and the structure space into a single space, and thus introduce inconsistency in combining different sources of information.<\/jats:p>\n               <jats:p>Method : We present a novel network-based protein homology detection method, CMsearch, based on cross-modal learning. Instead of exploring a single network built from the mixture of sequence and structure space information, CMsearch builds two separate networks to represent the sequence space and the structure space. It then learns sequence\u2013structure correlation by simultaneously taking sequence information, structure information, sequence space information and structure space information into consideration.<\/jats:p>\n               <jats:p>Results: We tested CMsearch on two challenging tasks, protein homology detection and protein structure prediction, by querying all 8332 PDB40 proteins. Our results demonstrate that CMsearch is insensitive to the similarity metrics used to define the sequence and the structure spaces. By using HMM\u2013HMM alignment as the sequence similarity metric, CMsearch clearly outperforms state-of-the-art homology detection methods and the CASP-winning template-based protein structure prediction methods.<\/jats:p>\n               <jats:p>Availability and implementation: Our program is freely available for download from http:\/\/sfb.kaust.edu.sa\/Pages\/Software.aspx .<\/jats:p>\n               <jats:p>Contact : xin.gao@kaust.edu.sa<\/jats:p>\n               <jats:p>Supplementary information : Supplementary data are available at Bioinformatics online.<\/jats:p>","DOI":"10.1093\/bioinformatics\/btw271","type":"journal-article","created":{"date-parts":[[2016,6,15]],"date-time":"2016-06-15T15:43:52Z","timestamp":1466005432000},"page":"i332-i340","source":"Crossref","is-referenced-by-count":79,"title":["CMsearch: simultaneous exploration of protein sequence space and structure space improves not only protein homology detection but also protein structure prediction"],"prefix":"10.1093","volume":"32","author":[{"given":"Xuefeng","family":"Cui","sequence":"first","affiliation":[{"name":"1 King Abdullah University of Science and Technology (KAUST), Computational Bioscience Research Center (CBRC), Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division, Thuwal 23955-6900, Saudi Arabia"}]},{"given":"Zhiwu","family":"Lu","sequence":"additional","affiliation":[{"name":"2 Beijing Key Laboratory of Big Data Management and Analysis Methods, School of Information, Renmin University of China, Beijing 100872, China"}]},{"given":"Sheng","family":"Wang","sequence":"additional","affiliation":[{"name":"3 Toyota Technological Institute at Chicago, 6045 Kenwood Avenue, Chicago, IL 60637, USA"},{"name":"4 Department of Human Genetics, University of Chicago, E. 58th St, Chicago, IL 60637, USA"}]},{"given":"Jim","family":"Jing-Yan Wang","sequence":"additional","affiliation":[{"name":"1 King Abdullah University of Science and Technology (KAUST), Computational Bioscience Research Center (CBRC), Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division, Thuwal 23955-6900, Saudi Arabia"}]},{"given":"Xin","family":"Gao","sequence":"additional","affiliation":[{"name":"1 King Abdullah University of Science and Technology (KAUST), Computational Bioscience Research Center (CBRC), Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division, Thuwal 23955-6900, Saudi Arabia"}]}],"member":"286","published-online":{"date-parts":[[2016,6,11]]},"reference":[{"key":"2023020112313752200_btw271-B1","doi-asserted-by":"crossref","first-page":"3389","DOI":"10.1093\/nar\/25.17.3389","article-title":"Gapped BLAST and PSI-BLAST: a new generation of protein database search programs","volume":"25","author":"Altschul","year":"1997","journal-title":"Nucleic Acids Res"},{"key":"2023020112313752200_btw271-B2","doi-asserted-by":"crossref","first-page":"195","DOI":"10.1093\/bioinformatics\/bti770","article-title":"The SWISS-MODEL workspace: a web-based environment for protein structure homology modelling","volume":"22","author":"Arnold","year":"2006","journal-title":"Bioinformatics"},{"key":"2023020112313752200_btw271-B3","doi-asserted-by":"crossref","first-page":"820","DOI":"10.1145\/361573.361582","article-title":"Solution of the matrix equation AX+ XB= C [F4]","volume":"15","author":"Bartels","year":"1972","journal-title":"Commun. ACM"},{"key":"2023020112313752200_btw271-B4","doi-asserted-by":"crossref","first-page":"i26","DOI":"10.1093\/bioinformatics\/btg1002","article-title":"Remote homology detection: a motif based approach","volume":"19 (Suppl 1)","author":"Ben-Hur","year":"2003","journal-title":"Bioinformatics"},{"key":"2023020112313752200_btw271-B5","doi-asserted-by":"crossref","first-page":"1456","DOI":"10.1093\/bioinformatics\/btl102","article-title":"A machine learning information retrieval approach to protein fold recognition","volume":"22","author":"Cheng","year":"2006","journal-title":"Bioinformatics"},{"key":"2023020112313752200_btw271-B6","doi-asserted-by":"crossref","first-page":"113","DOI":"10.1186\/1471-2105-8-113","article-title":"Improved residue contact prediction using support vector machines and a large feature set","volume":"8","author":"Cheng","year":"2007","journal-title":"BMC Bioinform"},{"key":"2023020112313752200_btw271-B7","author":"Cui","year":"2013"},{"key":"2023020112313752200_btw271-B8","author":"Cui","year":"2015"},{"key":"2023020112313752200_btw271-B9","doi-asserted-by":"crossref","first-page":"i133","DOI":"10.1093\/bioinformatics\/btv242","article-title":"Finding optimal interaction interface alignments between biological complexes","volume":"31","author":"Cui","year":"2015","journal-title":"Bioinformatics"},{"key":"2023020112313752200_btw271-B10","doi-asserted-by":"crossref","first-page":"1216","DOI":"10.1093\/bioinformatics\/bts110","article-title":"SMURFLite: combining simplified Markov random fields with simulated evolution improves remote homology detection for beta-structural proteins into the twilight zone","volume":"28","author":"Daniels","year":"2012","journal-title":"Bioinformatics"},{"key":"2023020112313752200_btw271-B11","author":"Davis","year":"2006"},{"key":"2023020112313752200_btw271-B12","first-page":"12.","article-title":"Random walks and electric networks","volume":"10","author":"Doyle","year":"1984","journal-title":"AMC"},{"key":"2023020112313752200_btw271-B13","first-page":"W29","article-title":"Comparative protein structure modeling using Modeller","volume":"39","author":"Eswar","year":"2006","journal-title":"Curr. Protoc. Bioinform"},{"key":"2023020112313752200_btw271-B14","first-page":"gkr367.","article-title":"HMMER web server: interactive sequence similarity searching","author":"Finn","year":"2011","journal-title":"Nucleic Acids Res"},{"key":"2023020112313752200_btw271-B15","doi-asserted-by":"crossref","first-page":"28","DOI":"10.1186\/1472-6807-9-28","article-title":"Improving consensus contact prediction via server correlation reduction","volume":"9","author":"Gao","year":"2009","journal-title":"BMC Struct. Biol"},{"key":"2023020112313752200_btw271-B16","doi-asserted-by":"crossref","first-page":"bat031","DOI":"10.1093\/database\/bat031","article-title":"The Protein Model Portal - a comprehensive resource for protein structure and model information","volume":"2013","author":"Haas","year":"2013","journal-title":"Database"},{"key":"2023020112313752200_btw271-B17","doi-asserted-by":"crossref","first-page":"128","DOI":"10.1002\/prot.22499","article-title":"Fast and accurate automatic structure prediction with HHpred","volume":"77","author":"Hildebrand","year":"2009","journal-title":"Proteins"},{"key":"2023020112313752200_btw271-B18","doi-asserted-by":"crossref","first-page":"17573.","DOI":"10.1038\/srep17573","article-title":"Improving protein fold recognition by deep learning networks","volume":"5","author":"Jo","year":"2015","journal-title":"Sci. Rep"},{"key":"2023020112313752200_btw271-B19","doi-asserted-by":"crossref","first-page":"195","DOI":"10.1006\/jmbi.1999.3091","article-title":"Protein secondary structure prediction based on position-specific scoring matrices","volume":"292","author":"Jones","year":"1999","journal-title":"J. Mol. Biol"},{"key":"2023020112313752200_btw271-B20","doi-asserted-by":"crossref","first-page":"1511","DOI":"10.1038\/nprot.2012.085","article-title":"Template-based protein structure modeling using the RaptorX web server","volume":"7","author":"K\u00e4llberg","year":"2012","journal-title":"Nat. Protoc"},{"key":"2023020112313752200_btw271-B21","doi-asserted-by":"crossref","first-page":"846","DOI":"10.1093\/bioinformatics\/14.10.846","article-title":"Hidden markov models for detecting remote protein homologies","volume":"14","author":"Karplus","year":"1998","journal-title":"Bioinformatics"},{"key":"2023020112313752200_btw271-B22","doi-asserted-by":"crossref","first-page":"379","DOI":"10.1016\/0022-2836(71)90324-X","article-title":"The interpretation of protein structures: estimation of static accessibility","volume":"55","author":"Lee","year":"1971","journal-title":"J. Mol. Biol"},{"key":"2023020112313752200_btw271-B23","first-page":"btv125.","article-title":"A new method to improve network topological similarity search: applied to fold recognition","author":"Lhota","year":"2015","journal-title":"Bioinformatics"},{"key":"2023020112313752200_btw271-B24","doi-asserted-by":"crossref","first-page":"472","DOI":"10.1093\/bioinformatics\/btt709","article-title":"Combining evolutionary information extracted from frequency profiles with sequence-based kernels for protein remote homology detection","volume":"30","author":"Liu","year":"2014","journal-title":"Bioinformatics"},{"key":"2023020112313752200_btw271-B25","doi-asserted-by":"crossref","first-page":"306","DOI":"10.1007\/s11263-012-0602-z","article-title":"Exhaustive and efficient constraint propagation: a graph-based learning approach and its applications","volume":"103","author":"Lu","year":"2013","journal-title":"Int. J. Comput. Vision"},{"key":"2023020112313752200_btw271-B26","doi-asserted-by":"crossref","first-page":"i59","DOI":"10.1093\/bioinformatics\/bts213","article-title":"A conditional neural fields model for protein threading","volume":"28","author":"Ma","year":"2012","journal-title":"Bioinformatics"},{"key":"2023020112313752200_btw271-B27","doi-asserted-by":"crossref","first-page":"e1003500","DOI":"10.1371\/journal.pcbi.1003500","article-title":"MRFalign: protein homology detection through alignment of Markov random fields","volume":"10","author":"Ma","year":"2014","journal-title":"PLoS Comput. Biol"},{"key":"2023020112313752200_btw271-B28","doi-asserted-by":"crossref","first-page":"1072","DOI":"10.1038\/nbt.2419","article-title":"Protein structure prediction from sequence variation","volume":"30","author":"Marks","year":"2012","journal-title":"Nat. Biotechnol"},{"key":"2023020112313752200_btw271-B29","doi-asserted-by":"crossref","first-page":"e1001047\u2013e1001047","DOI":"10.1371\/journal.pcbi.1001047","article-title":"Detecting remote evolutionary relationships among proteins by large-scale semantic embedding","volume":"7","author":"Melvin","year":"2011","journal-title":"PLoS Comput. Biol"},{"key":"2023020112313752200_btw271-B30","doi-asserted-by":"crossref","first-page":"i444","DOI":"10.1093\/bioinformatics\/bts398","article-title":"Protein domain recurrence and order can enhance prediction of protein functions","volume":"28","author":"Messih","year":"2012","journal-title":"Bioinformatics"},{"key":"2023020112313752200_btw271-B31","doi-asserted-by":"crossref","first-page":"11691","DOI":"10.1073\/pnas.1403395111","article-title":"Global view of the protein universe","volume":"111","author":"Nepomnyachiy","year":"2014","journal-title":"Proc. Natl. Acad. Sci. USA"},{"key":"2023020112313752200_btw271-B32","doi-asserted-by":"crossref","first-page":"1201","DOI":"10.1006\/jmbi.1998.2221","article-title":"Sequence comparisons using multiple sequences detect three times as many remote homologues as pairwise methods","volume":"284","author":"Park","year":"1998","journal-title":"J. Mol. Biol"},{"key":"2023020112313752200_btw271-B33","doi-asserted-by":"crossref","first-page":"197","DOI":"10.1002\/prot.21583","article-title":"The X-ray crystallographic structure and activity analysis of a Pseudomonas-specific subfamily of the HAD enzyme superfamily evidences a novel biochemical function","volume":"70","author":"Peisach","year":"2008","journal-title":"Proteins"},{"key":"2023020112313752200_btw271-B34","doi-asserted-by":"crossref","first-page":"610","DOI":"10.1111\/j.1365-2958.2007.05932.x","article-title":"A putative house-cleaning enzyme encoded within an integron array: 1.8 \n              \u00c5\n               crystal structure defines a new MazG subtype","volume":"66","author":"Robinson","year":"2007","journal-title":"Mol. Microbiol"},{"key":"2023020112313752200_btw271-B35","doi-asserted-by":"crossref","first-page":"725","DOI":"10.1038\/nprot.2010.5","article-title":"I-TASSER: a unified platform for automated protein structure and function prediction","volume":"5","author":"Roy","year":"2010","journal-title":"Nat. Protoc"},{"key":"2023020112313752200_btw271-B36","doi-asserted-by":"crossref","first-page":"951","DOI":"10.1093\/bioinformatics\/bti125","article-title":"Protein homology detection by HMM\u2013HMM comparison","volume":"21","author":"S\u00f6ding","year":"2005","journal-title":"Bioinformatics"},{"key":"2023020112313752200_btw271-B37","doi-asserted-by":"crossref","first-page":"1589","DOI":"10.1093\/bioinformatics\/btg224","article-title":"Pisces: a protein sequence culling server","volume":"19","author":"Wang","year":"2003","journal-title":"Bioinformatics"},{"key":"2023020112313752200_btw271-B38","doi-asserted-by":"crossref","first-page":"S2.","DOI":"10.1186\/1471-2105-13-S7-S2","article-title":"ProDis-ContSHC: learning protein dissimilarity measures and hierarchical context coherently for protein-protein comparison in protein database retrieval","volume":"13 (Suppl 7)","author":"Wang","year":"2012","journal-title":"BMC Bioinform"},{"key":"2023020112313752200_btw271-B39","doi-asserted-by":"crossref","first-page":"307.","DOI":"10.1186\/1471-2105-13-307","article-title":"Multiple graph regularized protein domain ranking","volume":"13","author":"Wang","year":"2012","journal-title":"BMC Bioinform"},{"key":"2023020112313752200_btw271-B40","doi-asserted-by":"crossref","first-page":"547","DOI":"10.1002\/prot.21945","article-title":"MUSTER: improving protein sequence profile\u2013profile alignments by using multiple sources of structure information","volume":"72","author":"Wu","year":"2008","journal-title":"Proteins"},{"key":"2023020112313752200_btw271-B41","doi-asserted-by":"crossref","first-page":"889","DOI":"10.1093\/bioinformatics\/btq066","article-title":"How significant is a protein structure similarity with TM-score=0.5?","volume":"26","author":"Xu","year":"2010","journal-title":"Bioinformatics"},{"key":"2023020112313752200_btw271-B42","doi-asserted-by":"crossref","first-page":"3370","DOI":"10.1093\/nar\/gkg571","article-title":"LGA: a method for finding 3D similarities in protein structures","volume":"31","author":"Zemla","year":"2003","journal-title":"Nucleic Acids Res"},{"key":"2023020112313752200_btw271-B43","doi-asserted-by":"crossref","first-page":"702","DOI":"10.1002\/prot.20264","article-title":"Scoring function for automated assessment of protein structure template quality","volume":"57","author":"Zhang","year":"2004","journal-title":"Proteins"},{"key":"2023020112313752200_btw271-B44","first-page":"321","article-title":"Learning with local and global consistency","author":"Zhou","year":"2004","journal-title":"Adv. Neural Inf. Process. Syst"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/32\/12\/i332\/49021415\/bioinformatics_32_12_i332.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/32\/12\/i332\/49021415\/bioinformatics_32_12_i332.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,2,1]],"date-time":"2023-02-01T22:38:04Z","timestamp":1675291084000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/32\/12\/i332\/2288851"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2016,6,11]]},"references-count":44,"journal-issue":{"issue":"12","published-print":{"date-parts":[[2016,6,15]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btw271","relation":{},"ISSN":["1367-4811","1367-4803"],"issn-type":[{"value":"1367-4811","type":"electronic"},{"value":"1367-4803","type":"print"}],"subject":[],"published-other":{"date-parts":[[2016,6,15]]},"published":{"date-parts":[[2016,6,11]]}}}