{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2023,8,18]],"date-time":"2023-08-18T22:22:02Z","timestamp":1692397322195},"reference-count":36,"publisher":"Oxford University Press (OUP)","issue":"19","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2009,10,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>Motivation: To investigate structure\u2013function relationships, life sciences researchers usually retrieve and classify proteins with similar substructures into the same fold. A manually constructed database, SCOP, is believed to be highly accurate; however, it is labor intensive. Another known method, DALI, is also precise but computationally expensive. We have developed an efficient algorithm, namely, index-based protein substructure alignment (IPSA), for protein-fold classification. IPSA constructs a two-layer indexing tree to quickly retrieve similar substructures in proteins and suggests possible folds by aligning these substructures.<\/jats:p>\n               <jats:p>Results: Compared with known algorithms, such as DALI, CE, MultiProt and MAMMOTH, on a sample dataset of non-redundant proteins from SCOP v1.73, IPSA exhibits an efficiency improvement of 53.10, 16.87, 3.60 and 1.64 times speedup, respectively. Evaluated on three different datasets of non-redundant proteins from SCOP, average accuracy of IPSA is approximately equal to DALI and better than CE, MAMMOTH, MultiProt and SSM. 
With reliable accuracy and efficiency, this work will benefit the study of high-throughput protein structure\u2013function relationships.<\/jats:p>\n               <jats:p>Availability: IPSA is publicly accessible at http:\/\/ProteinDBS.rnet.missouri.edu\/IPSA.php<\/jats:p>\n               <jats:p>Contact: \u00a0ShyuC@missouri.edu<\/jats:p>\n               <jats:p>Supplementary information: \u00a0Supplementary data are available at Bioinformatics online.<\/jats:p>","DOI":"10.1093\/bioinformatics\/btp474","type":"journal-article","created":{"date-parts":[[2009,8,11]],"date-time":"2009-08-11T01:06:19Z","timestamp":1249952779000},"page":"2559-2565","source":"Crossref","is-referenced-by-count":8,"title":["Efficient SCOP-fold classification and retrieval using index-based protein substructure alignments"],"prefix":"10.1093","volume":"25","author":[{"given":"Pin-Hao","family":"Chi","sequence":"first","affiliation":[{"name":"1 Medical and Biological Digital Library Research Lab, Informatics Institute and 2Department of Computer Science, University of Missouri, Columbia, MO 65211, USA"},{"name":"1 Medical and Biological Digital Library Research Lab, Informatics Institute and 2Department of Computer Science, University of Missouri, Columbia, MO 65211, USA"}]},{"given":"Bin","family":"Pang","sequence":"additional","affiliation":[{"name":"1 Medical and Biological Digital Library Research Lab, Informatics Institute and 2Department of Computer Science, University of Missouri, Columbia, MO 65211, USA"},{"name":"1 Medical and Biological Digital Library Research Lab, Informatics Institute and 2Department of Computer Science, University of Missouri, Columbia, MO 65211, USA"}]},{"given":"Dmitry","family":"Korkin","sequence":"additional","affiliation":[{"name":"1 Medical and Biological Digital Library Research Lab, Informatics Institute and 2Department of Computer Science, University of Missouri, Columbia, MO 65211, USA"},{"name":"1 Medical and Biological Digital Library Research 
Lab, Informatics Institute and 2Department of Computer Science, University of Missouri, Columbia, MO 65211, USA"}]},{"given":"Chi-Ren","family":"Shyu","sequence":"additional","affiliation":[{"name":"1 Medical and Biological Digital Library Research Lab, Informatics Institute and 2Department of Computer Science, University of Missouri, Columbia, MO 65211, USA"},{"name":"1 Medical and Biological Digital Library Research Lab, Informatics Institute and 2Department of Computer Science, University of Missouri, Columbia, MO 65211, USA"}]}],"member":"286","published-online":{"date-parts":[[2009,8,10]]},"reference":[{"key":"2023013112123403100_B1","doi-asserted-by":"crossref","first-page":"727","DOI":"10.1093\/protein\/9.9.727","article-title":"Sarfing the pdb","volume":"9","author":"Alexandrov","year":"1996","journal-title":"Protein Eng."},{"key":"2023013112123403100_B2","doi-asserted-by":"crossref","first-page":"1045","DOI":"10.1093\/bioinformatics\/bth036","article-title":"Rapid 3D protein structure database searching using information retrieval techniques","volume":"20","author":"Aung","year":"2004","journal-title":"Bioinformatics"},{"key":"2023013112123403100_B3","doi-asserted-by":"crossref","first-page":"235","DOI":"10.1093\/nar\/28.1.235","article-title":"The protein data bank","volume":"28","author":"Berman","year":"2000","journal-title":"Nucleic Acids Res."},{"key":"2023013112123403100_B4","doi-asserted-by":"crossref","first-page":"647","DOI":"10.1093\/protein\/8.7.647","article-title":"Optimal protein structure alignments by multiple linkage clustering: application to distantly related proteins","volume":"8","author":"Boutonnet","year":"1995","journal-title":"Protein Eng."},{"key":"2023013112123403100_B5","first-page":"224","article-title":"Automated protein classification using consensus decision","volume-title":"Proceedings of the Third International IEEE Computer Society Computational Systems Bioinformatics 
Conference","author":"Can","year":"2004"},{"key":"2023013112123403100_B6","volume-title":"Introduction to Protein Structures","author":"Carl","year":"1999","edition":"2"},{"key":"2023013112123403100_B7","doi-asserted-by":"crossref","first-page":"2860","DOI":"10.1093\/bioinformatics\/bth300","article-title":"TargetDB: a target registration database for structural genomics projects","volume":"20","author":"Chen","year":"2004","journal-title":"Bioinformatics"},{"key":"2023013112123403100_B8","doi-asserted-by":"crossref","first-page":"362","DOI":"10.1186\/1471-2105-7-362","article-title":"A fast SCOP fold classification system using content-based E-Predict algorithm","volume":"7","author":"Chi","year":"2006","journal-title":"BMC Bioinformatics"},{"key":"2023013112123403100_B9","first-page":"426","article-title":"M-tree: an efficient access method for similarity search in metric spaces","volume-title":"Proceedings of the International Conference on Very Large Databases","author":"Ciaccia","year":"1997"},{"key":"2023013112123403100_B10","doi-asserted-by":"crossref","first-page":"377","DOI":"10.1016\/S0959-440X(96)80058-3","article-title":"Surprising similarities in structure comparison","volume":"6","author":"Gibrat","year":"1996","journal-title":"Curr. Opin. Struct. 
Biol."},{"key":"2023013112123403100_B11","doi-asserted-by":"crossref","first-page":"1325","DOI":"10.1002\/pro.5560050711","article-title":"The structural alignment between two proteins: is there a unique answer?","volume":"5","author":"Godzik","year":"1996","journal-title":"Protein Sci."},{"key":"2023013112123403100_B12","doi-asserted-by":"crossref","first-page":"522","DOI":"10.1002\/pro.5560030317","article-title":"Enlarged representative set of protein structures","volume":"3","author":"Hobohm","year":"1994","journal-title":"Protein Sci."},{"key":"2023013112123403100_B13","doi-asserted-by":"crossref","first-page":"123","DOI":"10.1006\/jmbi.1993.1489","article-title":"Protein structure comparison by alignment of distance matrices","volume":"233","author":"Holm","year":"1993","journal-title":"J. Mol. Biol."},{"key":"2023013112123403100_B14","first-page":"3600","article-title":"The FSSP database of structurally aligned protein fold families","volume":"22","author":"Holm","year":"1994","journal-title":"Nucleic Acids Res."},{"key":"2023013112123403100_B15","first-page":"411","article-title":"Accurate classification of protein structural families using coherent subgraph analysis","volume-title":"Proceedings of the Pacific Symposium on Biocomputing","author":"Huan","year":"2004"},{"key":"2023013112123403100_B16","doi-asserted-by":"crossref","first-page":"535","DOI":"10.1093\/protein\/13.8.535","article-title":"Protein structure alignment using environmental profiles","volume":"13","author":"Jung","year":"2000","journal-title":"Protein Eng."},{"key":"2023013112123403100_B17","doi-asserted-by":"crossref","first-page":"922","DOI":"10.1107\/S0567739476001873","article-title":"A solution for the best rotation to relate two sets of vectors","volume":"32A","author":"Kabsch","year":"1976","journal-title":"Acta Crystallographica Section A"},{"key":"2023013112123403100_B18","first-page":"2256","article-title":"Secondary-structure matching (SSM), a new tool for fast protein 
structure alignment in three dimensions","volume":"D60","author":"Krissinel","year":"2004","journal-title":"Acta Cryst."},{"key":"2023013112123403100_B19","doi-asserted-by":"crossref","first-page":"745","DOI":"10.1093\/protein\/13.11.745","article-title":"ProSup: a refined tool for protein structure alignment","volume":"13","author":"Lackner","year":"2000","journal-title":"Protein Eng."},{"key":"2023013112123403100_B20","doi-asserted-by":"crossref","first-page":"235","DOI":"10.1002\/prot.1034","article-title":"Automated multiple structure alignment and detection of a common substructure motif","volume":"43","author":"Leibowitz","year":"2001","journal-title":"Proteins"},{"key":"2023013112123403100_B21","doi-asserted-by":"crossref","first-page":"536","DOI":"10.1016\/S0022-2836(05)80134-2","article-title":"Scop: a structural classification of proteins database for the investigation of sequences and structures","volume":"247","author":"Murzin","year":"1995","journal-title":"J. Mol. Biol."},{"key":"2023013112123403100_B22","doi-asserted-by":"crossref","first-page":"260","DOI":"10.1002\/prot.10553","article-title":"Evaluation of protein fold comparison servers","volume":"54","author":"Novotny","year":"2004","journal-title":"Proteins"},{"key":"2023013112123403100_B23","doi-asserted-by":"crossref","first-page":"2606","DOI":"10.1110\/ps.0215902","article-title":"MAMMOTH (Matching molecular models obtained from theory): an automated method for model comparison","volume":"11","author":"Ortiz","year":"2002","journal-title":"Protein Sci."},{"key":"2023013112123403100_B24","doi-asserted-by":"crossref","first-page":"452","DOI":"10.1093\/nar\/gkg062","article-title":"The CATH database: an extended protein family resource for structural and functional genomics","volume":"31","author":"Pearl","year":"2003","journal-title":"Nucleic Acids Res."},{"key":"2023013112123403100_B25","doi-asserted-by":"crossref","first-page":"119","DOI":"10.1073\/pnas.2636460100","article-title":"Automatic 
classification of protein structure by using Gauss integrals","volume":"100","author":"Rogen","year":"2003","journal-title":"Proc. Natl Acad. Sci. USA"},{"key":"2023013112123403100_B26","doi-asserted-by":"crossref","first-page":"143","DOI":"10.1002\/prot.10628","article-title":"A method for simultaneous alignment of multiple protein structures","volume":"56","author":"Shatsky","year":"2004","journal-title":"Proteins"},{"key":"2023013112123403100_B27","doi-asserted-by":"crossref","first-page":"739","DOI":"10.1093\/protein\/11.9.739","article-title":"Protein structure alignment by incremental combinatorial extension (CE) of the optimal path","volume":"11","author":"Shindyalov","year":"1998","journal-title":"Protein Eng."},{"key":"2023013112123403100_B28","doi-asserted-by":"crossref","first-page":"572","DOI":"10.1093\/nar\/gkh436","article-title":"ProteinDBS\u2014a content-based retrieval system for protein structure databases","volume":"32","author":"Shyu","year":"2004","journal-title":"Nucleic Acids Res."},{"key":"2023013112123403100_B29","first-page":"284","article-title":"Hierarchical protein structure superposition using both secondary structure and atomic representations","volume-title":"Proceedings of 5th International Conference on Intelligent Systems for Molecular Biology (ISMB'97)","author":"Singh","year":"1997"},{"key":"2023013112123403100_B30","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1016\/0022-2836(89)90084-3","article-title":"Protein structure alignment","volume":"208","author":"Taylor","year":"1989","journal-title":"J. Mol. 
Biol."},{"key":"2023013112123403100_B31","volume-title":"Information Retrieval","author":"van Rijsbergen","year":"1979","edition":"2"},{"key":"2023013112123403100_B32","doi-asserted-by":"crossref","first-page":"53","DOI":"10.1186\/1471-2105-7-53","article-title":"PDB-UF: database of predicted enzymatic functions for unannotated protein structures from structural genomics","volume":"7","author":"von Grotthuss","year":"2006","journal-title":"BMC Bioinformatics"},{"key":"2023013112123403100_B33","doi-asserted-by":"crossref","first-page":"3646","DOI":"10.1093\/nar\/gkl395","article-title":"Protein structure database search and evolutionary classification","volume":"34","author":"Yang","year":"2006","journal-title":"Nucleic Acids Res."},{"key":"2023013112123403100_B34","doi-asserted-by":"crossref","first-page":"317","DOI":"10.1002\/(SICI)1097-0134(19990215)34:3<317::AID-PROT5>3.0.CO;2-7","article-title":"A rapid method for exploring the protein structure universe","volume":"34","author":"Young","year":"1999","journal-title":"Proteins"},{"key":"2023013112123403100_B35","doi-asserted-by":"crossref","first-page":"189","DOI":"10.1073\/pnas.95.26.15189","article-title":"Structure-based assignment of the biochemical function of a hypothetical protein: a test case of structural genomics","volume":"95","author":"Zarembinski","year":"1998","journal-title":"Proc. Natl Acad. Sci. 
USA"},{"key":"2023013112123403100_B36","doi-asserted-by":"crossref","first-page":"40","DOI":"10.1186\/1471-2105-7-40","article-title":"Protein structure similarity from principle component correlation analysis","volume":"7","author":"Zhou","year":"2006","journal-title":"BMC Bioinformatics"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/25\/19\/2559\/48994895\/bioinformatics_25_19_2559.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/25\/19\/2559\/48994895\/bioinformatics_25_19_2559.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,1,31]],"date-time":"2023-01-31T21:38:14Z","timestamp":1675201094000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/25\/19\/2559\/181913"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2009,8,10]]},"references-count":36,"journal-issue":{"issue":"19","published-print":{"date-parts":[[2009,10,1]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btp474","relation":{},"ISSN":["1367-4811","1367-4803"],"issn-type":[{"value":"1367-4811","type":"electronic"},{"value":"1367-4803","type":"print"}],"subject":[],"published-other":{"date-parts":[[2009,10,1]]},"published":{"date-parts":[[2009,8,10]]}}}