{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,21]],"date-time":"2026-01-21T04:39:32Z","timestamp":1768970372396,"version":"3.49.0"},"reference-count":32,"publisher":"Oxford University Press (OUP)","issue":"5","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2009,3,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>Motivation: An enabling resource for drug discovery and protein function prediction is a large, accurate and actively maintained collection of protein\/small-molecule complex structures. Models of binding are typically constructed from these structural libraries by generalizing the observed interaction patterns. Consequently, the quality of the model is dependent on the quality of the structural library. An ideal library should be non-biased and comprehensive, contain high-resolution structures and be actively maintained.<\/jats:p>\n               <jats:p>Results: We present a new protein\/small-molecule database (the PSMDB) that offers a non-redundant set of holo PDB complexes. The database was designed to allow frequent updates through a fully automated process without manual annotation or filtering. Our method of database construction addresses redundancy at both the protein and the small-molecule level. By efficiently handling structures with covalently bound ligands, we allow our database to include a larger number of structures than previous methods. Multiple versions of the database are available at our web site, including structures of split complexes\u2014the proteins without their binding ligands and the non-covalently bound ligands within their native coordinate frame.<\/jats:p>\n               <jats:p>Availability: \u00a0http:\/\/compbio.cs.toronto.edu\/psmdb<\/jats:p>\n               <jats:p>Contact: \u00a0izharw@cs.toronto.edu; lilien@cs.toronto.edu<\/jats:p>","DOI":"10.1093\/bioinformatics\/btp035","type":"journal-article","created":{"date-parts":[[2009,1,20]],"date-time":"2009-01-20T01:44:58Z","timestamp":1232415898000},"page":"615-620","source":"Crossref","is-referenced-by-count":48,"title":["The protein\u2013small-molecule database, a non-redundant structural resource for the analysis of protein-ligand binding"],"prefix":"10.1093","volume":"25","author":[{"given":"Izhar","family":"Wallach","sequence":"first","affiliation":[{"name":"1 Department of Computer Science and 2Banting and Best Department of Medical Research, University of Toronto, Toronto, Ontario, Canada"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Ryan","family":"Lilien","sequence":"additional","affiliation":[{"name":"1 Department of Computer Science and 2Banting and Best Department of Medical Research, University of Toronto, Toronto, Ontario, Canada"},{"name":"1 Department of Computer Science and 2Banting and Best Department of Medical Research, University of Toronto, Toronto, Ontario, Canada"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"286","published-online":{"date-parts":[[2009,1,19]]},"reference":[{"key":"2023013110110590300_B1","doi-asserted-by":"crossref","first-page":"3389","DOI":"10.1093\/nar\/25.17.3389","article-title":"Gapped BLAST and PSI-BLAST: a new generation of protein database search programs","volume":"25","author":"Altschul","year":"1997","journal-title":"Nucleic Acids Res."},{"key":"2023013110110590300_B2","volume-title":"Approximating Maximum Stable Set and Minimum Graph Coloring Problems with the Positive Semidefinite Relaxation.","author":"Benson","year":"2000"},{"key":"2023013110110590300_B3","doi-asserted-by":"crossref","first-page":"235","DOI":"10.1093\/nar\/28.1.235","article-title":"The Protein Data Bank","volume":"28","author":"Berman","year":"2000","journal-title":"Nucleic Acids Res."},{"issue":"Suppl. 1","key":"2023013110110590300_B4","doi-asserted-by":"crossref","first-page":"D522","DOI":"10.1093\/nar\/gkj039","article-title":"AffinDB: a freely accessible database of affinities for protein-ligand complexes from the PDB","volume":"34","author":"Block","year":"2006","journal-title":"Nucleic Acids Res."},{"key":"2023013110110590300_B5","doi-asserted-by":"crossref","first-page":"137","DOI":"10.1007\/s10107-002-0356-4","article-title":"Maximum stable set formulations and heuristics based on continuous optimization","volume":"94","author":"Burer","year":"2002","journal-title":"Mathematical Program."},{"key":"2023013110110590300_B6","doi-asserted-by":"crossref","first-page":"2832","DOI":"10.1039\/b801115j","article-title":"Covalent radii revisited","volume":"21","author":"Cordero","year":"2008","journal-title":"Dalton Trans."},{"key":"2023013110110590300_B7","doi-asserted-by":"crossref","first-page":"D231","DOI":"10.1093\/nar\/gkj062","article-title":"SitesBase: a database for structure-based protein-ligand binding site comparisons","volume":"34","author":"Gold","year":"2006","journal-title":"Nucleic Acids Res."},{"key":"2023013110110590300_B8","doi-asserted-by":"crossref","first-page":"991","DOI":"10.1021\/ci050400b","article-title":"The blue obelisk-interoperability in chemical informatics","volume":"46","author":"Guha","year":"2006","journal-title":"J. Chem. Inf. Model."},{"key":"2023013110110590300_B9","doi-asserted-by":"crossref","first-page":"409","DOI":"10.1002\/pro.5560010313","article-title":"Selection of representative protein data sets","volume":"1","author":"Hobohm","year":"1992","journal-title":"Protein Sci."},{"key":"2023013110110590300_B10","doi-asserted-by":"crossref","first-page":"333","DOI":"10.1002\/prot.20512","article-title":"Binding MOAD (Mother Of All Databases)","volume":"60","author":"Hu","year":"2005","journal-title":"Proteins"},{"key":"2023013110110590300_B11","article-title":"Daylight Theory Manual-Daylight 4.71","author":"James","year":"2000","journal-title":"Daylight Chemical Information Systems"},{"key":"2023013110110590300_B12","doi-asserted-by":"crossref","DOI":"10.1002\/3527609164","volume-title":"Pharmacophores and Pharmacophore Searches.","author":"Langer","year":"2006","edition":"1"},{"key":"2023013110110590300_B13","doi-asserted-by":"crossref","first-page":"266","DOI":"10.1093\/nar\/gki001","article-title":"PDBsum more: new summaries and analyses of the known 3D structures of proteins and nucleic acids","volume":"33","author":"Laskowski","year":"2005","journal-title":"Nucleic Acids Res."},{"key":"2023013110110590300_B14","doi-asserted-by":"crossref","first-page":"3183","DOI":"10.1073\/pnas.0611678104","article-title":"Growth of novel protein structural data","volume":"104","author":"Levitt","year":"2007","journal-title":"Proc. Natl Acad. Sci. USA"},{"key":"2023013110110590300_B15","doi-asserted-by":"crossref","first-page":"198","DOI":"10.1093\/nar\/gkl999","article-title":"BindingDB: a web-accessible database of experimentally determined protein-ligand binding affinities","volume":"35","author":"Liu","year":"2007","journal-title":"Nucleic Acids Res."},{"key":"2023013110110590300_B16","doi-asserted-by":"crossref","first-page":"165","DOI":"10.1002\/prot.21651","article-title":"Assessment of predictions submitted for the CASP7 function prediction category","volume":"69","author":"L\u00f3pez","year":"2007","journal-title":"Proteins Struct. Funct. Bioinform."},{"key":"2023013110110590300_B17","doi-asserted-by":"crossref","first-page":"1219","DOI":"10.1021\/jm960352+","article-title":"Selecting optimally diverse compounds from structure databases: a validation study of two-dimensional and three-dimensional molecular descriptors","volume":"40","author":"Matter","year":"1997","journal-title":"J. Med. Chem."},{"key":"2023013110110590300_B18","doi-asserted-by":"crossref","first-page":"2374","DOI":"10.1021\/ci700244t","article-title":"A pharmacophore map of small molecule protein kinase inhibitors","volume":"47","author":"McGregor","year":"2007","journal-title":"J. Chem. Inf. Model."},{"key":"2023013110110590300_B19","doi-asserted-by":"crossref","first-page":"1165","DOI":"10.1002\/(SICI)1096-987X(199908)20:11<1165::AID-JCC7>3.0.CO;2-A","article-title":"BLEEP - potential of mean force describing protein-ligand interactions: I. Generating potential","volume":"20","author":"Mitchell","year":"1999","journal-title":"J. Comput. Chem."},{"key":"2023013110110590300_B20","doi-asserted-by":"crossref","first-page":"2347","DOI":"10.1093\/bioinformatics\/bti337","article-title":"Real spherical harmonic expansion coefficients as 3D shape descriptors for protein binding pocket and ligand comparisons","volume":"21","author":"Morris","year":"2005","journal-title":"Bioinformatics"},{"key":"2023013110110590300_B21","doi-asserted-by":"crossref","first-page":"1449","DOI":"10.1093\/bioinformatics\/btl115","article-title":"GBPM: GRID-based pharmacophore model: concept and application studies to protein-protein recognition","volume":"22","author":"Ortuso","year":"2006","journal-title":"Bioinformatics"},{"key":"2023013110110590300_B22","doi-asserted-by":"crossref","first-page":"1856","DOI":"10.1093\/bioinformatics\/btg243","article-title":"Protein Ligand Database (PLD): additional understanding of the nature and specificity of protein-ligand complexes","volume":"19","author":"Puvanendrampillai","year":"2003","journal-title":"Bioinformatics"},{"key":"2023013110110590300_B23","doi-asserted-by":"crossref","first-page":"85","DOI":"10.1093\/protein\/12.2.85","article-title":"Twilight zone of protein sequence alignments","volume":"12","author":"Rost","year":"1999","journal-title":"Protein Eng."},{"key":"2023013110110590300_B24","doi-asserted-by":"crossref","first-page":"200","DOI":"10.1093\/bioinformatics\/18.1.200","article-title":"LigBase: a database of families of aligned ligand binding sites in known protein sequences and structures","volume":"18","author":"Stuart","year":"2002","journal-title":"Bioinformatics"},{"key":"2023013110110590300_B25","article-title":"An elementary mathematical theory of classification and prediction","volume-title":"IBM Internal Report.","author":"Tanimoto","year":"1958"},{"key":"2023013110110590300_B26","doi-asserted-by":"crossref","first-page":"247","DOI":"10.1111\/j.1574-6968.1999.tb13575.x","article-title":"Blast 2 sequences, a new tool for comparing protein and nucleotide sequences","volume":"174","author":"Tatusova","year":"1999","journal-title":"FEMS Microbiol. Lett."},{"key":"2023013110110590300_B27","doi-asserted-by":"crossref","first-page":"1816","DOI":"10.1021\/ci049920h","article-title":"Comparison of 2D similarity and 3D superposition. Application to searching a conformational drug database","volume":"44","author":"Thimm","year":"2004","journal-title":"J. Chem. Inf. Comput. Sci."},{"key":"2023013110110590300_B28","first-page":"2579","article-title":"Visualizing data using t-SNE","volume":"9","author":"van der Maaten","year":"2008","journal-title":"J. Mach. Learn. Res."},{"key":"2023013110110590300_B29","doi-asserted-by":"crossref","first-page":"1589","DOI":"10.1093\/bioinformatics\/btg224","article-title":"PISCES: a protein sequence culling server","volume":"19","author":"Wang","year":"2003","journal-title":"Bioinformatics"},{"key":"2023013110110590300_B30","doi-asserted-by":"crossref","first-page":"4111","DOI":"10.1021\/jm048957q","article-title":"The PDBbind Database: methodologies and Updates","volume":"48","author":"Wang","year":"2005","journal-title":"J. Med.Chem."},{"key":"2023013110110590300_B31","doi-asserted-by":"crossref","first-page":"32","DOI":"10.1016\/j.cbpa.2008.01.045","article-title":"Structural genomics and drug discovery: all in the family","volume":"12","author":"Weigelt","year":"2008","journal-title":"Curr. Opin. Chem. Biol."},{"key":"2023013110110590300_B32","doi-asserted-by":"crossref","first-page":"79","DOI":"10.1093\/jb\/mvh009","article-title":"Het-PDB Navi.: a database for protein-small molecule interactions","volume":"135","author":"Yamaguchi","year":"2004","journal-title":"J. Biochem."}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/25\/5\/615\/48983595\/bioinformatics_25_5_615.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/25\/5\/615\/48983595\/bioinformatics_25_5_615.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,1,31]],"date-time":"2023-01-31T19:47:29Z","timestamp":1675194449000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/25\/5\/615\/183421"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2009,1,19]]},"references-count":32,"journal-issue":{"issue":"5","published-print":{"date-parts":[[2009,3,1]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btp035","relation":{},"ISSN":["1367-4811","1367-4803"],"issn-type":[{"value":"1367-4811","type":"electronic"},{"value":"1367-4803","type":"print"}],"subject":[],"published-other":{"date-parts":[[2009,3,1]]},"published":{"date-parts":[[2009,1,19]]}}}