{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,5]],"date-time":"2026-01-05T07:31:49Z","timestamp":1767598309765},"reference-count":29,"publisher":"Oxford University Press (OUP)","issue":"7","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2014,4,1]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Motivation:\u2002One common task in structural biology is to assess the similarities and differences among protein structures. A variety of structure alignment algorithms and programs has been designed and implemented for this purpose. A major drawback with existing structure alignment programs is that they require a large amount of computational time, rendering them infeasible for pairwise alignments on large collections of structures. To overcome this drawback, a fragment alphabet learned from known structures has been introduced. The method, however, considers local similarity only, and therefore occasionally assigns high scores to structures that are similar only in local fragments.<\/jats:p><jats:p>Method:\u2002We propose a novel approach that eliminates false positives, through the comparison of both local and remote similarity, with little compromise in speed. Two kinds of contact libraries (ContactLib) are introduced to fingerprint protein structures effectively and efficiently. Each contact group of the contact library consists of one local or two remote fragments and is represented by a concise vector. These vectors are then indexed and used to calculate a new combined hit-rate score to identify similar protein structures effectively and efficiently.<\/jats:p><jats:p>Results:\u2002We tested our method on the high-quality protein structure subset of SCOP30 containing 3297 protein structures. For each protein structure of the subset, we retrieved its neighbor protein structures from the rest of the subset. 
The best area under the Receiver-Operating Characteristic curve, achieved by ContactLib, is as high as 0.960. This is a significant improvement compared with 0.747, the best result achieved by FragBag. We also demonstrated that incorporating remote contact information is critical to consistently retrieve accurate neighbor protein structures for all- query protein structures.<\/jats:p><jats:p>Availability and implementation:\u2002https:\/\/cs.uwaterloo.ca\/\u223cxfcui\/contactlib\/.<\/jats:p><jats:p>Contact:\u2002shuaicli@cityu.edu.hk or mli@uwaterloo.ca<\/jats:p>","DOI":"10.1093\/bioinformatics\/btt659","type":"journal-article","created":{"date-parts":[[2013,12,1]],"date-time":"2013-12-01T01:09:43Z","timestamp":1385860183000},"page":"949-955","source":"Crossref","is-referenced-by-count":9,"title":["Fingerprinting protein structures effectively and efficiently"],"prefix":"10.1093","volume":"30","author":[{"given":"Xuefeng","family":"Cui","sequence":"first","affiliation":[{"name":"1 David R. Cheriton School of Computer Science, University of Waterloo, Ontario, Canada and 2Department of Computer Science, City University of Hong Kong, Kowloon, Hong Kong"}]},{"given":"Shuai Cheng","family":"Li","sequence":"additional","affiliation":[{"name":"1 David R. Cheriton School of Computer Science, University of Waterloo, Ontario, Canada and 2Department of Computer Science, City University of Hong Kong, Kowloon, Hong Kong"}]},{"given":"Lin","family":"He","sequence":"additional","affiliation":[{"name":"1 David R. Cheriton School of Computer Science, University of Waterloo, Ontario, Canada and 2Department of Computer Science, City University of Hong Kong, Kowloon, Hong Kong"}]},{"given":"Ming","family":"Li","sequence":"additional","affiliation":[{"name":"1 David R.
Cheriton School of Computer Science, University of Waterloo, Ontario, Canada and 2Department of Computer Science, City University of Hong Kong, Kowloon, Hong Kong"}]}],"member":"286","published-online":{"date-parts":[[2013,11,29]]},"reference":[{"key":"2023012710452778700_btt659-B1","first-page":"1629","article-title":"Protein structure alignment using dynamic programing and iterative improvement","volume":"79","author":"Akutsu","year":"1996","journal-title":"IEICE Trans. Inf. Syst."},{"key":"2023012710452778700_btt659-B2","doi-asserted-by":"crossref","first-page":"732","DOI":"10.1016\/j.drudis.2007.07.014","article-title":"Rapid retrieval of protein structures from databases","volume":"12","author":"Aung","year":"2007","journal-title":"Drug Discov. Today"},{"key":"2023012710452778700_btt659-B3","doi-asserted-by":"crossref","first-page":"3481","DOI":"10.1073\/pnas.0914097107","article-title":"FragBag, an accurate representation of protein structure, retrieves structural neighbors from the entire PDB quickly and accurately","volume":"107","author":"Budowski-Tal","year":"2010","journal-title":"Proc. Natl Acad. Sci. USA"},{"key":"2023012710452778700_btt659-B4","doi-asserted-by":"crossref","first-page":"2001","DOI":"10.1110\/ps.03154503","article-title":"A graph-theory algorithm for rapid protein side-chain prediction","volume":"12","author":"Canutescu","year":"2003","journal-title":"Protein Sci."},{"key":"2023012710452778700_btt659-B5","doi-asserted-by":"crossref","first-page":"D189","DOI":"10.1093\/nar\/gkh034","article-title":"The astral compendium in 2004","volume":"32","author":"Chandonia","year":"2004","journal-title":"Nucleic Acids Res."},{"key":"2023012710452778700_btt659-B6","doi-asserted-by":"crossref","first-page":"3797","DOI":"10.1073\/pnas.0308656100","article-title":"Local feature frequency profile: a method to measure structural similarity in proteins","volume":"101","author":"Choi","year":"2004","journal-title":"Proc. Natl Acad. Sci. 
USA"},{"key":"2023012710452778700_btt659-B7","first-page":"5","article-title":"Protein structure idealization: how accurately is it possible to model protein structures with dihedral angles?","volume":"8","author":"Cui","year":"2013","journal-title":"Algorithms Mol. Biol."},{"key":"2023012710452778700_btt659-B8","doi-asserted-by":"crossref","first-page":"359","DOI":"10.1002\/pro.2225","article-title":"Toward a structural blast: using structural relationships to infer function","volume":"22","author":"Dey","year":"2013","journal-title":"Protein Sci."},{"key":"2023012710452778700_btt659-B9","doi-asserted-by":"crossref","first-page":"382","DOI":"10.1107\/97809553602060000695","article-title":"Chapter 18.3: Structure quality and target parameters","volume-title":"International Tables for Crystallography","author":"Engh","year":"2006"},{"key":"2023012710452778700_btt659-B10","doi-asserted-by":"crossref","first-page":"861","DOI":"10.1016\/j.patrec.2005.10.010","article-title":"An introduction to ROC analysis","volume":"27","author":"Fawcett","year":"2006","journal-title":"Pattern Recogn. Lett."},{"key":"2023012710452778700_btt659-B11","doi-asserted-by":"crossref","first-page":"447","DOI":"10.1007\/BF02192866","article-title":"Improved efficiency of protein structure calculations from NMR data using the program DIANA with redundant dihedral angle constraints","volume":"1","author":"G\u00fcntert","year":"1991","journal-title":"J. Biomol. NMR"},{"key":"2023012710452778700_btt659-B12","doi-asserted-by":"crossref","first-page":"123","DOI":"10.1006\/jmbi.1993.1489","article-title":"Protein structure comparison by alignment of distance matrices","volume":"233","author":"Holm","year":"1993","journal-title":"J. Mol.
Biol."},{"key":"2023012710452778700_btt659-B13","doi-asserted-by":"crossref","first-page":"12201","DOI":"10.1073\/pnas.0404383101","article-title":"Approximate protein structural alignment in polynomial time","volume":"101","author":"Kolodny","year":"2004","journal-title":"Proc. Natl Acad. Sci. USA"},{"key":"2023012710452778700_btt659-B14","doi-asserted-by":"crossref","first-page":"1173","DOI":"10.1016\/j.jmb.2004.12.032","article-title":"Comprehensive evaluation of protein structure alignment methods: scoring by geometric measures","volume":"346","author":"Kolodny","year":"2005","journal-title":"J. Mol. Biol."},{"key":"2023012710452778700_btt659-B15","doi-asserted-by":"crossref","first-page":"2256","DOI":"10.1107\/S0907444904026460","article-title":"Secondary-structure matching (SSM), a new tool for fast protein structure alignment in three dimensions","volume":"60","author":"Krissinel","year":"2004","journal-title":"Acta Crystallogr. D Biol. Crystallogr."},{"key":"2023012710452778700_btt659-B16","doi-asserted-by":"crossref","first-page":"1925","DOI":"10.1110\/ps.036442.108","article-title":"Fragment-HMM: a new approach to protein structure prediction","volume":"17","author":"Li","year":"2008","journal-title":"Protein Sci."},{"key":"2023012710452778700_btt659-B17","doi-asserted-by":"crossref","first-page":"724","DOI":"10.1093\/bib\/bbs052","article-title":"Assessing protein conformational sampling methods based on Bivariate lag-distributions of backbone angles","volume":"14","author":"Maadooliat","year":"2012","journal-title":"Brief. Bioinform."},{"key":"2023012710452778700_btt659-B18","doi-asserted-by":"crossref","first-page":"536","DOI":"10.1016\/S0022-2836(05)80134-2","article-title":"SCOP: a structural classification of proteins database for the investigation of sequences and structures","volume":"247","author":"Murzin","year":"1995","journal-title":"J. Mol. 
Biol."},{"key":"2023012710452778700_btt659-B19","doi-asserted-by":"crossref","first-page":"1093","DOI":"10.1016\/S0969-2126(97)00260-8","article-title":"CATH\u2013a hierarchic classification of protein domain structures","volume":"5","author":"Orengo","year":"1997","journal-title":"Structure"},{"key":"2023012710452778700_btt659-B20","doi-asserted-by":"crossref","first-page":"277","DOI":"10.1002\/prot.340190403","article-title":"Torsion angle dynamics: reduced variable conformational sampling enhances crystallographic structure refinement","volume":"19","author":"Rice","year":"1994","journal-title":"Proteins"},{"key":"2023012710452778700_btt659-B21","article-title":"The PyMOL Molecular Graphics System","author":"Schr\u00f6dinger","year":"2010"},{"key":"2023012710452778700_btt659-B22","doi-asserted-by":"crossref","first-page":"739","DOI":"10.1093\/protein\/11.9.739","article-title":"Protein structure alignment by incremental combinatorial extension (CE) of the optimal path","volume":"11","author":"Shindyalov","year":"1998","journal-title":"Protein Eng."},{"key":"2023012710452778700_btt659-B23","doi-asserted-by":"crossref","first-page":"209","DOI":"10.1006\/jmbi.1997.0959","article-title":"Assembly of protein tertiary structures from fragments with similar local sequences using simulated annealing and Bayesian scoring functions","volume":"268","author":"Simons","year":"1997","journal-title":"J. Mol. Biol."},{"key":"2023012710452778700_btt659-B24","doi-asserted-by":"crossref","first-page":"22059","DOI":"10.1016\/S0021-9258(18)45665-7","article-title":"Protein structure determination in solution by NMR spectroscopy","volume":"265","author":"W\u00fcthrich","year":"1990","journal-title":"J. Biol. 
Chem."},{"key":"2023012710452778700_btt659-B25","doi-asserted-by":"crossref","first-page":"2080","DOI":"10.1002\/prot.24100","article-title":"A new size-independent score for pairwise protein structure alignment and its application to structure classification and nucleic acid binding prediction","volume":"80","author":"Yang","year":"2012","journal-title":"Proteins"},{"key":"2023012710452778700_btt659-B26","doi-asserted-by":"crossref","first-page":"3370","DOI":"10.1093\/nar\/gkg571","article-title":"LGA: a method for finding 3D similarities in protein structures","volume":"31","author":"Zemla","year":"2003","journal-title":"Nucleic Acids Res."},{"key":"2023012710452778700_btt659-B27","doi-asserted-by":"crossref","first-page":"1137","DOI":"10.1002\/prot.22634","article-title":"Mufold: a new solution for protein 3D structure prediction","volume":"78","author":"Zhang","year":"2010","journal-title":"Proteins"},{"key":"2023012710452778700_btt659-B28","doi-asserted-by":"crossref","first-page":"2302","DOI":"10.1093\/nar\/gki524","article-title":"TM-align: a protein structure alignment algorithm based on the TM-score","volume":"33","author":"Zhang","year":"2005","journal-title":"Nucleic Acids Res."},{"key":"2023012710452778700_btt659-B29","doi-asserted-by":"crossref","first-page":"2714","DOI":"10.1110\/ps.0217002","article-title":"Distance-scaled, finite ideal-gas reference state improves structure-derived potentials of mean force for structure selection and stability prediction","volume":"11","author":"Zhou","year":"2002","journal-title":"Protein 
Sci."}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/30\/7\/949\/48920589\/bioinformatics_30_7_949.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/30\/7\/949\/48920589\/bioinformatics_30_7_949.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,5,20]],"date-time":"2024-05-20T19:11:46Z","timestamp":1716232306000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/30\/7\/949\/233056"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2013,11,29]]},"references-count":29,"journal-issue":{"issue":"7","published-print":{"date-parts":[[2014,4,1]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btt659","relation":{},"ISSN":["1367-4811","1367-4803"],"issn-type":[{"value":"1367-4811","type":"electronic"},{"value":"1367-4803","type":"print"}],"subject":[],"published-other":{"date-parts":[[2014,4,1]]},"published":{"date-parts":[[2013,11,29]]}}}