{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,30]],"date-time":"2025-10-30T22:24:58Z","timestamp":1761863098578},"reference-count":44,"publisher":"Oxford University Press (OUP)","issue":"10","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2012,5,15]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>Motivation: Finding geometrically similar protein binding sites is crucial for understanding protein functions and can provide valuable information for protein\u2013protein docking and drug discovery. As the number of known protein\u2013protein interaction structures has dramatically increased, a high-throughput and accurate protein binding site comparison method is essential. Traditional alignment-based methods can provide accurate correspondence between the binding sites but are computationally expensive.<\/jats:p>\n               <jats:p>Results: In this article, we present a novel method for the comparisons of protein binding sites using a \u2018visual words\u2019 representation (PBSword). We first extract geometric features of binding site surfaces and build a vocabulary of visual words by clustering a large set of feature descriptors. We then describe a binding site surface with a high-dimensional vector that encodes the frequency of visual words, enhanced by the spatial relationships among them. Finally, we measure the similarity of binding sites by utilizing metric space operations, which provide speedy comparisons between protein binding sites. Our experimental results show that PBSword achieves a comparable classification accuracy to an alignment-based method and improves accuracy of a feature-based method by 36% on a non-redundant dataset. PBSword also exhibits a significant efficiency improvement over an alignment-based method.<\/jats:p>\n               <jats:p>Availability: PBSword is available at http:\/\/proteindbs.rnet.missouri.edu\/pbsword\/pbsword.html<\/jats:p>\n               <jats:p>Contact: \u00a0shyuc@missouri.edu<\/jats:p>\n               <jats:p>Supplementary information: \u00a0Supplementary data are available at Bioinformatics online.<\/jats:p>","DOI":"10.1093\/bioinformatics\/bts138","type":"journal-article","created":{"date-parts":[[2012,4,8]],"date-time":"2012-04-08T08:06:13Z","timestamp":1333872373000},"page":"1345-1352","source":"Crossref","is-referenced-by-count":10,"title":["Fast protein binding site comparisons using visual words representation"],"prefix":"10.1093","volume":"28","author":[{"given":"Bin","family":"Pang","sequence":"first","affiliation":[{"name":"1 Informatics Institute and 2Department of Computer Science, University of Missouri, Columbia, MO 65211, USA"}]},{"given":"Nan","family":"Zhao","sequence":"additional","affiliation":[{"name":"1 Informatics Institute and 2Department of Computer Science, University of Missouri, Columbia, MO 65211, USA"}]},{"given":"Dmitry","family":"Korkin","sequence":"additional","affiliation":[{"name":"1 Informatics Institute and 2Department of Computer Science, University of Missouri, Columbia, MO 65211, USA"},{"name":"1 Informatics Institute and 2Department of Computer Science, University of Missouri, Columbia, MO 65211, USA"}]},{"given":"Chi-Ren","family":"Shyu","sequence":"additional","affiliation":[{"name":"1 Informatics Institute and 2Department of Computer Science, University of Missouri, Columbia, MO 65211, USA"},{"name":"1 Informatics Institute and 2Department of Computer Science, University of Missouri, Columbia, MO 65211, USA"}]}],"member":"286","published-online":{"date-parts":[[2012,4,6]]},"reference":[{"key":"2023012512303390600_B1","doi-asserted-by":"crossref","first-page":"188","DOI":"10.1038\/nrm1859","article-title":"Structural systems biology: modelling protein interactions","volume":"7","author":"Aloy","year":"2006","journal-title":"Nat. Rev. Mol. Cell Biol."},{"key":"2023012512303390600_B2","doi-asserted-by":"crossref","first-page":"1059","DOI":"10.1007\/s00018-007-7451-x","article-title":"The interface of protein-protein complexes: analysis of contacts and prediction of interactions","volume":"65","author":"Bahadur","year":"2008","journal-title":"Cell. Mol. Life Sci."},{"key":"2023012512303390600_B3","doi-asserted-by":"crossref","first-page":"509","DOI":"10.1109\/34.993558","article-title":"Shape matching and object recognition using shape contexts","volume":"24","author":"Belongie","year":"2002","journal-title":"IEEE T Pattern Anal. Mach. Intell."},{"key":"2023012512303390600_B4","doi-asserted-by":"crossref","first-page":"365","DOI":"10.1016\/j.jmb.2006.07.028","article-title":"Insights into protein-protein interfaces using a Bayesian network prediction method","volume":"362","author":"Bradford","year":"2006","journal-title":"J. Mol. Biol."},{"key":"2023012512303390600_B5","doi-asserted-by":"crossref","first-page":"1145","DOI":"10.1016\/S0031-3203(96)00142-2","article-title":"The use of the area under the ROC curve in the evaluation of machine learning algorithms","volume":"30","author":"Bradley","year":"1997","journal-title":"Pattern Recogn."},{"key":"2023012512303390600_B6","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/1899404.1899405","article-title":"Shape google: geometric words and expressions for invariant shape retrieval","volume":"30","author":"Bronstein","year":"2011","journal-title":"ACM Trans. Graph."},{"key":"2023012512303390600_B7","doi-asserted-by":"crossref","first-page":"3481","DOI":"10.1073\/pnas.0914097107","article-title":"FragBag, an accurate representation of protein structure, retrieves structural neighbors from the entire PDB quickly and accurately","volume":"107","author":"Budowski-Tal","year":"2010","journal-title":"Proc. Natl Acad. Sci. USA"},{"key":"2023012512303390600_B8","doi-asserted-by":"crossref","first-page":"2863","DOI":"10.1021\/ci900317x","article-title":"Rapid comparison of protein binding site surfaces with property encoded shape distributions","volume":"49","author":"Das","year":"2009","journal-title":"J. Chem. Inf. Model"},{"key":"2023012512303390600_B9","doi-asserted-by":"crossref","first-page":"1901","DOI":"10.1093\/bioinformatics\/bti277","article-title":"PIBASE: a comprehensive database of structurally defined protein interfaces","volume":"21","author":"Davis","year":"2005","journal-title":"Bioinformatics"},{"key":"2023012512303390600_B10","doi-asserted-by":"crossref","first-page":"269","DOI":"10.1007\/BF01386390","article-title":"A note on two problems in connexion with graphs","volume":"1","author":"Dijkstra","year":"1959","journal-title":"Numer. Math."},{"key":"2023012512303390600_B11","doi-asserted-by":"crossref","first-page":"207","DOI":"10.1016\/S0079-6603(08)60870-3","article-title":"Evolution of Ca(2+)-dependent animal lectins","volume":"45","author":"Drickamer","year":"1993","journal-title":"Prog. Nucleic Acid Res. Mol. Biol."},{"key":"2023012512303390600_B12","doi-asserted-by":"crossref","first-page":"59","DOI":"10.1042\/bss0690059","article-title":"Genomic analysis of C-type lectins","author":"Drickamer","year":"2002","journal-title":"Biochem. Soc. Symp."},{"key":"2023012512303390600_B13","doi-asserted-by":"crossref","first-page":"410","DOI":"10.1093\/bioinformatics\/bti011","article-title":"iPfam: visualization of protein-protein interactions in PDB at domain and amino acid resolutions","volume":"21","author":"Finn","year":"2005","journal-title":"Bioinformatics"},{"key":"2023012512303390600_B14","doi-asserted-by":"crossref","first-page":"2259","DOI":"10.1093\/bioinformatics\/btq404","article-title":"iAlign: a method for the structural comparison of protein-protein interfaces","volume":"26","author":"Gao","year":"2010","journal-title":"Bioinformatics"},{"key":"2023012512303390600_B15","doi-asserted-by":"crossref","first-page":"610","DOI":"10.1109\/TSMC.1973.4309314","article-title":"Textural features for image classification","volume":"3","author":"Haralick","year":"1973","journal-title":"IEEE T Syst. Man Cybern."},{"key":"2023012512303390600_B16","doi-asserted-by":"crossref","first-page":"550","DOI":"10.1093\/bioinformatics\/bti782","article-title":"Equivalent binding sites reveal convergently evolved interaction motifs","volume":"22","author":"Henschel","year":"2006","journal-title":"Bioinformatics"},{"key":"2023012512303390600_B17","doi-asserted-by":"crossref","first-page":"D841","DOI":"10.1093\/nar\/gkr1088","article-title":"The IntAct molecular interaction database in 2012","volume":"40","author":"Kerrien","year":"2012","journal-title":"Nucleic Acids Res."},{"key":"2023012512303390600_B18","doi-asserted-by":"crossref","first-page":"341","DOI":"10.1016\/j.str.2007.01.007","article-title":"Similar binding sites and different partners: implications to shared proteins in cellular pathways","volume":"15","author":"Keskin","year":"2007","journal-title":"Structure"},{"key":"2023012512303390600_B19","doi-asserted-by":"crossref","first-page":"1043","DOI":"10.1110\/ps.03484604","article-title":"A new, structurally nonredundant, diverse data set of protein-protein interfaces and its implications","volume":"13","author":"Keskin","year":"2004","journal-title":"Protein Sci."},{"key":"2023012512303390600_B20","doi-asserted-by":"crossref","first-page":"e124","DOI":"10.1371\/journal.pcbi.0020124","article-title":"The many faces of protein-protein interactions: a compendium of interface geometry","volume":"2","author":"Kim","year":"2006","journal-title":"PLoS Comput. Biol."},{"key":"2023012512303390600_B21","doi-asserted-by":"crossref","first-page":"2350","DOI":"10.1110\/ps.051571905","article-title":"Localization of protein-binding sites within families of proteins","volume":"14","author":"Korkin","year":"2005","journal-title":"Protein Sci."},{"key":"2023012512303390600_B22","doi-asserted-by":"crossref","first-page":"D501","DOI":"10.1093\/nar\/gkr1128","article-title":"DOMMINO: a database of macromolecular interactions","volume":"40","author":"Kuang","year":"2012","journal-title":"Nucleic Acids Res."},{"key":"2023012512303390600_B23","doi-asserted-by":"crossref","first-page":"157","DOI":"10.1186\/1471-2105-10-157","article-title":"IDSS: deformation invariant signatures for molecular shape comparison","volume":"10","author":"Liu","year":"2009","journal-title":"BMC Bioinform."},{"key":"2023012512303390600_B24","doi-asserted-by":"crossref","first-page":"129","DOI":"10.1109\/TIT.1982.1056489","article-title":"Least squares quantization in PCM","volume":"28","author":"Lloyd","year":"1982","journal-title":"IEEE T Inform. Theory"},{"key":"2023012512303390600_B25","doi-asserted-by":"crossref","first-page":"1004","DOI":"10.1109\/TCBB.2010.21","article-title":"Image-based surface matching algorithm oriented to structural biology","volume":"8","author":"Merelli","year":"2011","journal-title":"IEEE\/ACM T Comput. Biol. Bioinform."},{"key":"2023012512303390600_B26","doi-asserted-by":"crossref","first-page":"6","DOI":"10.1002\/prot.20580","article-title":"Generation and analysis of a protein-protein interface data set with similar chemical and spatial patterns of interactions","volume":"61","author":"Mintz","year":"2005","journal-title":"Proteins"},{"key":"2023012512303390600_B27","doi-asserted-by":"crossref","first-page":"536","DOI":"10.1016\/S0022-2836(05)80134-2","article-title":"SCOP: a structural classification of proteins database for the investigation of sequences and structures","volume":"247","author":"Murzin","year":"1995","journal-title":"J. Mol. Biol."},{"key":"2023012512303390600_B28","doi-asserted-by":"crossref","first-page":"741","DOI":"10.1016\/S0022-2836(02)00649-6","article-title":"One fold with many functions: the evolutionary relationships between TIM barrel families based on their sequences, structures and functions","volume":"321","author":"Nagano","year":"2002","journal-title":"J. Mol. Biol."},{"key":"2023012512303390600_B29","doi-asserted-by":"crossref","first-page":"154","DOI":"10.1109\/SMA.2001.923386","article-title":"Matching 3D models with shape distributions","volume-title":"Proceedings of the International Conference on Shape Modeling & Applications.","author":"Osada","year":"2001"},{"key":"2023012512303390600_B30","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1002\/prot.22141","article-title":"Rapid comparison of properties on protein surface","volume":"73","author":"Sael","year":"2008","journal-title":"Proteins"},{"key":"2023012512303390600_B31","first-page":"79","article-title":"Structural descriptors of protein-protein binding sites","volume-title":"Proceedings of 6th Asia-Pacific Bioinformatics Conference.","author":"Sander","year":"2008"},{"key":"2023012512303390600_B32","doi-asserted-by":"crossref","first-page":"305","DOI":"10.1002\/(SICI)1097-0282(199603)38:3<305::AID-BIP4>3.0.CO;2-Y","article-title":"Reduced surface: an efficient way to compute molecular surfaces","volume":"38","author":"Sanner","year":"1996","journal-title":"Biopolymers"},{"key":"2023012512303390600_B33","doi-asserted-by":"crossref","first-page":"194","DOI":"10.1007\/978-3-540-30219-3_17","article-title":"Protein-protein interfaces: recognition of similar spatial and chemical organizations","volume-title":"Algorithms in Bioinformatics.","author":"Shulman-Peleg","year":"2004"},{"key":"2023012512303390600_B34","doi-asserted-by":"crossref","first-page":"2677","DOI":"10.1073\/pnas.0813249106","article-title":"Alignment-free genome comparison with feature frequency profiles (FFP) and optimal resolutions","volume":"106","author":"Sims","year":"2009","journal-title":"Proc. Natl Acad. Sci. USA"},{"key":"2023012512303390600_B35","doi-asserted-by":"crossref","first-page":"3139","DOI":"10.1093\/bioinformatics\/btm503","article-title":"Moment invariants as shape recognition technique for comparing protein binding sites","volume":"23","author":"Sommer","year":"2007","journal-title":"Bioinformatics"},{"key":"2023012512303390600_B36","doi-asserted-by":"crossref","first-page":"604","DOI":"10.1006\/jmbi.1996.0424","article-title":"A dataset of protein-protein interfaces generated with a sequence-order-independent comparison technique","volume":"260","author":"Tsai","year":"1996","journal-title":"J. Mol. Biol."},{"key":"2023012512303390600_B37","doi-asserted-by":"crossref","first-page":"785","DOI":"10.1016\/j.jmb.2008.04.071","article-title":"Architectures and functional coverage of protein-protein interfaces","volume":"381","author":"Tuncbag","year":"2008","journal-title":"J. Mol. Biol."},{"key":"2023012512303390600_B38","doi-asserted-by":"crossref","first-page":"D310","DOI":"10.1093\/nar\/gkj099","article-title":"SCOPPI: a structural classification of protein-protein interfaces","volume":"34","author":"Winter","year":"2006","journal-title":"Nucleic Acids Res."},{"key":"2023012512303390600_B39","doi-asserted-by":"crossref","first-page":"87","DOI":"10.1016\/j.compbiolchem.2003.10.003","article-title":"The iProClass integrated database for protein functional analysis","volume":"28","author":"Wu","year":"2004","journal-title":"Comput. Biol. Chem."},{"key":"2023012512303390600_B40","doi-asserted-by":"crossref","first-page":"889","DOI":"10.1093\/bioinformatics\/btq066","article-title":"How significant is a protein structure similarity with TM-score = 0.5?","volume":"26","author":"Xu","year":"2010","journal-title":"Bioinformatics"},{"key":"2023012512303390600_B41","doi-asserted-by":"crossref","first-page":"16622","DOI":"10.1073\/pnas.0906146106","article-title":"Fast screening of protein surfaces using geometric invariant fingerprints","volume":"106","author":"Yin","year":"2009","journal-title":"Proc. Natl Acad. Sci. USA"},{"key":"2023012512303390600_B42","doi-asserted-by":"crossref","first-page":"6179","DOI":"10.1111\/j.1742-4658.2005.05031.x","article-title":"The C-type lectin-like domain superfamily","volume":"272","author":"Zelensky","year":"2005","journal-title":"FEBS J."},{"key":"2023012512303390600_B43","doi-asserted-by":"crossref","first-page":"2302","DOI":"10.1093\/nar\/gki524","article-title":"TM-align: a protein structure alignment algorithm based on the TM-score","volume":"33","author":"Zhang","year":"2005","journal-title":"Nucleic Acids Res."},{"key":"2023012512303390600_B44","doi-asserted-by":"crossref","first-page":"e19554","DOI":"10.1371\/journal.pone.0019554","article-title":"Structural similarity and classification of protein interaction interfaces","volume":"6","author":"Zhao","year":"2011","journal-title":"PLoS One"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/28\/10\/1345\/48866569\/bioinformatics_28_10_1345.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/28\/10\/1345\/48866569\/bioinformatics_28_10_1345.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,1,25]],"date-time":"2023-01-25T15:56:58Z","timestamp":1674662218000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/28\/10\/1345\/211737"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2012,4,6]]},"references-count":44,"journal-issue":{"issue":"10","published-print":{"date-parts":[[2012,5,15]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/bts138","relation":{},"ISSN":["1367-4811","1367-4803"],"issn-type":[{"value":"1367-4811","type":"electronic"},{"value":"1367-4803","type":"print"}],"subject":[],"published-other":{"date-parts":[[2012,5,15]]},"published":{"date-parts":[[2012,4,6]]}}}