{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,17]],"date-time":"2026-03-17T17:32:05Z","timestamp":1773768725332,"version":"3.50.1"},"reference-count":28,"publisher":"Oxford University Press (OUP)","issue":"23","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2005,12,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>Motivation: Protein remote homology detection is a central problem in computational biology. Supervised learning algorithms based on support vector machines are currently one of the most effective methods for remote homology detection. The performance of these methods depends on how the protein sequences are modeled and on the method used to compute the kernel function between them.<\/jats:p>\n               <jats:p>Results: We introduce two classes of kernel functions that are constructed by combining sequence profiles with new and existing approaches for determining the similarity between pairs of protein sequences. These kernels are constructed directly from these explicit protein similarity measures and employ effective profile-to-profile scoring schemes for measuring the similarity between pairs of proteins. Experiments with remote homology detection and fold recognition problems show that these kernels are capable of producing results that are substantially better than those produced by all of the existing state-of-the-art SVM-based methods. 
In addition, the experiments show that these kernels, even when used in the absence of profiles, produce results that are better than those produced by existing non-profile-based schemes.<\/jats:p>\n               <jats:p>Availability: The programs for computing the various kernel functions are available on request from the authors.<\/jats:p>\n               <jats:p>Contact: \u00a0karypis@cs.umn.edu<\/jats:p>","DOI":"10.1093\/bioinformatics\/bti687","type":"journal-article","created":{"date-parts":[[2005,9,28]],"date-time":"2005-09-28T03:23:31Z","timestamp":1127877811000},"page":"4239-4247","source":"Crossref","is-referenced-by-count":123,"title":["Profile-based direct kernels for remote homology detection and fold recognition"],"prefix":"10.1093","volume":"21","author":[{"given":"Huzefa","family":"Rangwala","sequence":"first","affiliation":[]},{"given":"George","family":"Karypis","sequence":"additional","affiliation":[]}],"member":"286","published-online":{"date-parts":[[2005,9,27]]},"reference":[{"key":"2023061010032889700_b1","doi-asserted-by":"crossref","first-page":"3389","DOI":"10.1093\/nar\/25.17.3389","article-title":"Gapped blast and PSI-blast: a new generation of protein database search programs","volume":"25","author":"Altschul","year":"1997","journal-title":"Nucleic Acids Res."},{"key":"2023061010032889700_b2","doi-asserted-by":"crossref","first-page":"1053","DOI":"10.1073\/pnas.91.3.1059","article-title":"Hidden Markov models of biological primary sequence information","volume":"91","author":"Baldi","year":"1994","journal-title":"Proc. Natl Acad. Sci. 
USA"},{"key":"2023061010032889700_b3","doi-asserted-by":"crossref","first-page":"i26","DOI":"10.1093\/bioinformatics\/btg1002","article-title":"Remote homology detection: a motif based approach","volume":"19","author":"Ben-Hur","year":"2003","journal-title":"Bioinformatics"},{"key":"2023061010032889700_b4","doi-asserted-by":"crossref","first-page":"705","DOI":"10.1016\/0022-2836(82)90398-9","article-title":"An improved algorithm for matching biological sequences","volume":"162","author":"Gotoh","year":"1982","journal-title":"J. Mol. Biol."},{"key":"2023061010032889700_b5","doi-asserted-by":"crossref","first-page":"355","DOI":"10.1073\/pnas.84.13.4355","article-title":"Profile analysis: detection of distantly related proteins","volume":"84","author":"Gribskov","year":"1987","journal-title":"Proc. Natl Acad. Sci. USA"},{"key":"2023061010032889700_b6","doi-asserted-by":"crossref","first-page":"25","DOI":"10.1016\/S0097-8485(96)80004-0","article-title":"Use of receiver operating characteristic (ROC) analysis to evaluate sequence matching","volume":"20","author":"Gribskov","year":"1996","journal-title":"Comput. Chem."},{"key":"2023061010032889700_b7","doi-asserted-by":"crossref","DOI":"10.1017\/CBO9780511574931","volume-title":"Algorithms on Strings, Trees, and Sequences: Computer Science and Computational Biology","author":"Gusfield","year":"1997"},{"key":"2023061010032889700_b8","doi-asserted-by":"crossref","first-page":"272","DOI":"10.1093\/bioinformatics\/17.3.272","article-title":"Picasso: generating a covering set of protein family profiles","volume":"17","author":"Heger","year":"2001","journal-title":"Bioinformatics"},{"key":"2023061010032889700_b9","doi-asserted-by":"crossref","first-page":"10915","DOI":"10.1073\/pnas.89.22.10915","article-title":"Amino acid substitution matrices from protein blocks","volume":"89","author":"Henikoff","year":"1992","journal-title":"Proc. Natl Acad. Sci. 
USA"},{"key":"2023061010032889700_b10","doi-asserted-by":"crossref","first-page":"2294","DOI":"10.1093\/bioinformatics\/btg317","article-title":"Efficient remote homology detection using local structure","volume":"19","author":"Hou","year":"2003","journal-title":"Bioinformatics"},{"key":"2023061010032889700_b11","doi-asserted-by":"crossref","first-page":"518","DOI":"10.1002\/prot.20221","article-title":"Remote homolog detection using local sequence\u2013structure correlations","volume":"57","author":"Hou","year":"2004","journal-title":"Proteins"},{"key":"2023061010032889700_b12","doi-asserted-by":"crossref","first-page":"95","DOI":"10.1089\/10665270050081405","article-title":"A discriminative framework for detecting remote protein homologies","volume":"7","author":"Jaakkola","year":"2000","journal-title":"J. Comput. Biol."},{"key":"2023061010032889700_b13","article-title":"Making large-scale svm learning practical","volume-title":"Advances in Kernel Methods\u2014Support Vector Learning","author":"Joachims","year":"1999"},{"key":"2023061010032889700_b14","doi-asserted-by":"crossref","first-page":"846","DOI":"10.1093\/bioinformatics\/14.10.846","article-title":"Hidden Markov models for detecting remote protein homologies","volume":"14","author":"Karplus","year":"1998","journal-title":"Bioinformatics"},{"key":"2023061010032889700_b15","doi-asserted-by":"crossref","first-page":"1501","DOI":"10.1006\/jmbi.1994.1104","article-title":"Hidden Markov models in computational biology: applications to protein modeling","volume":"235","author":"Krogh","year":"1994","journal-title":"J. Mol. Biol."},{"key":"2023061010032889700_b16","doi-asserted-by":"crossref","first-page":"527","DOI":"10.1142\/S021972000500120X","article-title":"Profile-based string kernels for remote homology detection and motif extraction","volume":"3","author":"Kuang","year":"2005","journal-title":"J. Bioinform. Comput. 
Biol."},{"key":"2023061010032889700_b17","first-page":"564","article-title":"The spectrum kernel: a string kernel for SVM protein classification","author":"Leslie","year":"2002","journal-title":"Pac. Symp. Biocomput."},{"key":"2023061010032889700_b18","first-page":"467","article-title":"Mismatch string kernels for svm protein classification","volume":"20","author":"Leslie","year":"2003","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"2023061010032889700_b19","first-page":"225","article-title":"Combining pairwise sequence similarity and support vector machines for detecting remote protein evolutionary and structural relationships","author":"Liao","year":"2002"},{"key":"2023061010032889700_b20","doi-asserted-by":"crossref","first-page":"1071","DOI":"10.1110\/ps.03379804","article-title":"Alignment of protein sequences by their profiles","volume":"13","author":"Marti-Renom","year":"2004","journal-title":"Protein Sci."},{"key":"2023061010032889700_b21","doi-asserted-by":"crossref","first-page":"1531","DOI":"10.1093\/bioinformatics\/btg185","article-title":"Probabilistic scoring measures for profile\u2013profile comparison yield more accurate short seed alignments","volume":"19","author":"Mittelman","year":"2003","journal-title":"Bioinformatics"},{"key":"2023061010032889700_b22","doi-asserted-by":"crossref","first-page":"536","DOI":"10.1016\/S0022-2836(05)80134-2","article-title":"SCOP: a structural classification of proteins database for the investigation of sequences and structures","volume":"247","author":"Murzin","year":"1995","journal-title":"J. Mol. 
Biol."},{"key":"2023061010032889700_b23","volume-title":"Combinatorial Optimization: Algorithms and Complexity","author":"Papadimitriou","year":"1982"},{"key":"2023061010032889700_b24","doi-asserted-by":"crossref","first-page":"1682","DOI":"10.1093\/bioinformatics\/bth141","article-title":"Protein homology detection using string alignment kernels","volume":"20","author":"Saigo","year":"2004","journal-title":"Bioinformatics"},{"key":"2023061010032889700_b25","volume-title":"Learning with Kernels","author":"Scholkopf","year":"2002"},{"key":"2023061010032889700_b26","doi-asserted-by":"crossref","first-page":"195","DOI":"10.1016\/0022-2836(81)90087-5","article-title":"Identification of common molecular subsequences","volume":"147","author":"Smith","year":"1981","journal-title":"J. Mol. Biol."},{"key":"2023061010032889700_b27","volume-title":"Statistical Learning Theory","author":"Vapnik","year":"1998"},{"key":"2023061010032889700_b28","doi-asserted-by":"crossref","first-page":"1612","DOI":"10.1110\/ps.03601504","article-title":"Scoring profile-to-profile sequence alignments","volume":"13","author":"Wang","year":"2004","journal-title":"Protein 
Sci."}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/21\/23\/4239\/50567321\/bioinformatics_21_23_4239.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/21\/23\/4239\/50567321\/bioinformatics_21_23_4239.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,6,10]],"date-time":"2023-06-10T10:04:05Z","timestamp":1686391445000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/21\/23\/4239\/194750"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2005,9,27]]},"references-count":28,"journal-issue":{"issue":"23","published-print":{"date-parts":[[2005,12,1]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/bti687","relation":{},"ISSN":["1367-4803","1367-4811"],"issn-type":[{"value":"1367-4803","type":"print"},{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2005,12]]},"published":{"date-parts":[[2005,9,27]]}}}