{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2023,8,31]],"date-time":"2023-08-31T14:07:29Z","timestamp":1693490849670},"reference-count":38,"publisher":"Oxford University Press (OUP)","issue":"13","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2006,7,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>Motivation: A global view of the protein space is essential for functional and evolutionary analysis of proteins. In order to achieve this, a similarity network can be built using pairwise relationships among proteins. However, existing similarity networks employ a single similarity measure and therefore their utility depends highly on the quality of the selected measure. A more robust representation of the protein space can be realized if multiple sources of information are used.<\/jats:p>\n               <jats:p>Results: We propose a novel approach for analyzing multi-attribute similarity networks by combining random walks on graphs with Bayesian theory. A multi-attribute network is created by combining sequence and structure based similarity measures. For each attribute of the similarity network, one can compute a measure of affinity from a given protein to every other protein in the network using random walks. This process makes use of the implicit clustering information of the similarity network, and we show that it is superior to naive, local ranking methods. We then combine the computed affinities using a Bayesian framework. In particular, when we train a Bayesian model for automated classification of a novel protein, we achieve high classification accuracy and outperform single attribute networks. 
In addition, we demonstrate the effectiveness of our technique by comparison with a competing kernel-based information integration approach.<\/jats:p>\n               <jats:p>Availability: Source code is available upon request from the primary author.<\/jats:p>\n               <jats:p>Contact: orhan@cs.ucsb.edu<\/jats:p>\n               <jats:p>Supplementary Information: Supplementary data are available on Bioinformatics online.<\/jats:p>","DOI":"10.1093\/bioinformatics\/btl130","type":"journal-article","created":{"date-parts":[[2006,4,5]],"date-time":"2006-04-05T00:24:30Z","timestamp":1144196670000},"page":"1585-1592","source":"Crossref","is-referenced-by-count":15,"title":["Integrating multi-attribute similarity networks for robust representation of the protein space"],"prefix":"10.1093","volume":"22","author":[{"given":"Orhan","family":"\u00c7amo\u011flu","sequence":"first","affiliation":[{"name":"Department of Computer Science, University of California, Santa Barbara, CA 93106, USA"}]},{"given":"Tolga","family":"Can","sequence":"additional","affiliation":[{"name":"Department of Computer Engineering, Middle East Technical University, 06531, Ankara, Turkey"}]},{"given":"Ambuj K.","family":"Singh","sequence":"additional","affiliation":[{"name":"Department of Computer Science, University of California, Santa Barbara, CA 93106, USA"}]}],"member":"286","published-online":{"date-parts":[[2006,4,4]]},"reference":[{"key":"2023012408333731700_b1","doi-asserted-by":"crossref","first-page":"444","DOI":"10.1016\/S0968-0004(98)01298-5","article-title":"Iterated profile searches with PSI-BLAST\u2013a tool for discovery in protein databases","volume":"23","author":"Altschul","year":"1998","journal-title":"Trends Biochem. 
Sci."},{"key":"2023012408333731700_b2","doi-asserted-by":"crossref","first-page":"D138","DOI":"10.1093\/nar\/gkh121","article-title":"The Pfam protein families database","volume":"32","author":"Bateman","year":"2004","journal-title":"Nucleic Acids Res."},{"key":"2023012408333731700_b3","doi-asserted-by":"crossref","first-page":"235","DOI":"10.1093\/nar\/28.1.235","article-title":"The protein data bank","volume":"28","author":"Berman","year":"2000","journal-title":"Nucleic Acids Res."},{"key":"2023012408333731700_b4","doi-asserted-by":"crossref","DOI":"10.1002\/0471200581","volume-title":"Queueing Networks and Markov Chains: Modeling and Performance Evaluation with Computer Science Applications","author":"Bolch","year":"1998"},{"key":"2023012408333731700_b5","doi-asserted-by":"crossref","first-page":"D189","DOI":"10.1093\/nar\/gkh034","article-title":"The ASTRAL compendium in 2004","volume":"32","author":"Chandonia","year":"2004","journal-title":"Nucleic Acids Res."},{"key":"2023012408333731700_b6","volume-title":"LIBSVM: a library for support vector machines","author":"Chang","year":"2001"},{"key":"2023012408333731700_b7","doi-asserted-by":"crossref","first-page":"463","DOI":"10.1089\/1066527041410346","article-title":"An integrated probabilistic model for functional prediction of proteins","volume":"11","author":"Deng","year":"2004","journal-title":"J. Comput. Biol."},{"key":"2023012408333731700_b8","doi-asserted-by":"crossref","first-page":"755","DOI":"10.1093\/bioinformatics\/14.9.755","article-title":"Profile hidden markov models","volume":"14","author":"Eddy","year":"1998","journal-title":"Bioinformatics"},{"key":"2023012408333731700_b9","doi-asserted-by":"crossref","first-page":"1897","DOI":"10.1107\/S0907444902015160","article-title":"The SUPERFAMILY database in structural genomics","volume":"58","author":"Gough","year":"2002","journal-title":"Acta Crystallogr. D. Biol. 
Crystallogr."},{"key":"2023012408333731700_b10","article-title":"Convolution Kernels on Discrete Structures","volume-title":"Technical Report UCSC-CLR-99-10","author":"Haussler","year":"1999"},{"key":"2023012408333731700_b11","doi-asserted-by":"crossref","first-page":"123","DOI":"10.1006\/jmbi.1993.1489","article-title":"Protein structure comparison by alignment of distance matrices","volume":"233","author":"Holm","year":"1993","journal-title":"J. Mol. Biol."},{"key":"2023012408333731700_b12","doi-asserted-by":"crossref","first-page":"595","DOI":"10.1126\/science.273.5275.595","article-title":"Mapping the protein universe","volume":"273","author":"Holm","year":"1996","journal-title":"Science"},{"key":"2023012408333731700_b13","doi-asserted-by":"crossref","first-page":"2386","DOI":"10.1073\/pnas.2628030100","article-title":"A global representation of the protein fold space","volume":"100","author":"Hou","year":"2003","journal-title":"Proc. Natl Acad. Sci. USA"},{"key":"2023012408333731700_b14","doi-asserted-by":"crossref","DOI":"10.1007\/978-1-4757-3502-4","volume-title":"Bayesian networks and decision graphs","author":"Jensen","year":"2001"},{"key":"2023012408333731700_b15","doi-asserted-by":"crossref","first-page":"3711","DOI":"10.1093\/bioinformatics\/bti608","article-title":"Motif-based protein ranking by network propagation","volume":"21","author":"Kuang","year":"2005","journal-title":"Bioinformatics"},{"key":"2023012408333731700_b16","first-page":"300","article-title":"Kernel-based data fusion and its application to protein function prediction in yeast","author":"Lanckriet","year":"2004","journal-title":"Pac. Symp. Biocomput."},{"key":"2023012408333731700_b17","doi-asserted-by":"crossref","first-page":"857","DOI":"10.1089\/106652703322756113","article-title":"Combining pairwise sequence similarity and support vector machines for detecting remote protein evolutionary and structural relationships","volume":"10","author":"Liao","year":"2003","journal-title":"J. 
Comput. Biol."},{"key":"2023012408333731700_b18","doi-asserted-by":"crossref","first-page":"613","DOI":"10.1006\/jmbi.1999.3377","article-title":"Identification of related proteins on family, superfamily and fold level","volume":"295","author":"Lindahl","year":"2000","journal-title":"J. Mol. Biol."},{"key":"2023012408333731700_b19","doi-asserted-by":"crossref","first-page":"5","DOI":"10.1016\/S1367-5931(02)00003-0","article-title":"Domains, motifs and clusters in the protein universe","volume":"7","author":"Liu","year":"2003","journal-title":"Curr. Opin. Chem. Biol."},{"key":"2023012408333731700_b20","first-page":"563","article-title":"Text classification using string kernels","author":"Lodhi","year":"2000"},{"key":"2023012408333731700_b21","first-page":"353","article-title":"Random walks on graphs: a survey","volume-title":"Combinatorics, Paul Erdos is Eighty","author":"Lovasz","year":"1996"},{"key":"2023012408333731700_b22","doi-asserted-by":"crossref","first-page":"356","DOI":"10.1002\/prot.340230309","article-title":"Threading a database of protein cores","volume":"23","author":"Madej","year":"1995","journal-title":"Proteins"},{"key":"2023012408333731700_b23","volume-title":"Machine Learning","author":"Mitchell","year":"1997"},{"key":"2023012408333731700_b24","doi-asserted-by":"crossref","first-page":"536","DOI":"10.1016\/S0022-2836(05)80134-2","article-title":"SCOP: a structural classification of proteins database for the investigation of sequences and structures","volume":"247","author":"Murzin","year":"1995","journal-title":"J. Mol. 
Biol."},{"key":"2023012408333731700_b25","doi-asserted-by":"crossref","first-page":"1093","DOI":"10.1016\/S0969-2126(97)00260-8","article-title":"CATH\u2014a hierarchic classification of protein domain structures","volume":"5","author":"Orengo","year":"1997","journal-title":"Structure"},{"key":"2023012408333731700_b26","first-page":"146","article-title":"GCap: graph-based automatic image captioning","author":"Pan","year":"2004"},{"key":"2023012408333731700_b27","doi-asserted-by":"crossref","first-page":"289","DOI":"10.1093\/nar\/30.1.289","article-title":"SUPFAM\u2014a database of potential protein superfamily relationships derived by comparing sequence-based and structure-based families: implications for structural genomics and function annotation in genomes","volume":"30","author":"Pandit","year":"2002","journal-title":"Nucleic Acids Res."},{"key":"2023012408333731700_b28","first-page":"249","article-title":"Gene functional classification from heterogeneous data","author":"Pavlidis","year":"2001"},{"key":"2023012408333731700_b29","doi-asserted-by":"crossref","first-page":"899","DOI":"10.1093\/bioinformatics\/18.7.899","article-title":"Selecting targets for structural determination by navigating in a graph of protein families","volume":"18","author":"Portugaly","year":"2002","journal-title":"Bioinformatics"},{"key":"2023012408333731700_b30","doi-asserted-by":"crossref","first-page":"1682","DOI":"10.1093\/bioinformatics\/bth141","article-title":"Protein homology detection using string alignment kernels","volume":"20","author":"Saigo","year":"2004","journal-title":"Bioinformatics"},{"key":"2023012408333731700_b31","doi-asserted-by":"crossref","DOI":"10.7551\/mitpress\/4057.001.0001","volume-title":"Kernel methods in computational biology.","author":"Schoelkopf","year":"2004"},{"key":"2023012408333731700_b32","doi-asserted-by":"crossref","first-page":"739","DOI":"10.1093\/protein\/11.9.739","article-title":"Protein structure alignment by incremental combinatorial 
extension (CE) of the optimal path","volume":"11","author":"Shindyalov","year":"1998","journal-title":"Protein Eng."},{"key":"2023012408333731700_b33","doi-asserted-by":"crossref","first-page":"265","DOI":"10.1093\/bib\/3.3.265","article-title":"PROSITE: a documented database using patterns and profiles as motif descriptors","volume":"3","author":"Sigrist","year":"2002","journal-title":"Brief Bioinform."},{"key":"2023012408333731700_b34","first-page":"945","article-title":"Partially labeled classification with markov random walks","author":"Szummer","year":"2001"},{"key":"2023012408333731700_b35","doi-asserted-by":"crossref","first-page":"6559","DOI":"10.1073\/pnas.0308067101","article-title":"Protein ranking: from local to global structure in the protein similarity network","volume":"101","author":"Weston","year":"2004","journal-title":"Proc. Natl Acad. Sci. USA"},{"key":"2023012408333731700_b36","doi-asserted-by":"crossref","first-page":"i363","DOI":"10.1093\/bioinformatics\/bth910","article-title":"Protein network inference from multiple genomic data: a supervised approach","volume":"20","author":"Yamanishi","year":"2004","journal-title":"Bioinformatics"},{"key":"2023012408333731700_b37","first-page":"395","article-title":"Towards a complete map of the protein space based on a unified sequence and structure analysis of all known proteins","volume":"8","author":"Yona","year":"2000","journal-title":"Proc. Int. Conf. Intell. Syst. Mol. 
Biol."},{"key":"2023012408333731700_b38","doi-asserted-by":"crossref","first-page":"360","DOI":"10.1002\/(SICI)1097-0134(19991115)37:3<360::AID-PROT5>3.0.CO;2-Z","article-title":"ProtoMap: automatic classification of protein sequences, a hierarchy of protein families, and local maps of the protein space","volume":"37","author":"Yona","year":"1999","journal-title":"Proteins"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/22\/13\/1585\/48838918\/bioinformatics_22_13_1585.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/22\/13\/1585\/48838918\/bioinformatics_22_13_1585.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,1,24]],"date-time":"2023-01-24T08:52:15Z","timestamp":1674550335000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/22\/13\/1585\/193368"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2006,4,4]]},"references-count":38,"journal-issue":{"issue":"13","published-print":{"date-parts":[[2006,7,1]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btl130","relation":{},"ISSN":["1367-4811","1367-4803"],"issn-type":[{"value":"1367-4811","type":"electronic"},{"value":"1367-4803","type":"print"}],"subject":[],"published-other":{"date-parts":[[2006,7,1]]},"published":{"date-parts":[[2006,4,4]]}}}