{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,20]],"date-time":"2026-02-20T16:51:27Z","timestamp":1771606287951,"version":"3.50.1"},"reference-count":39,"publisher":"Oxford University Press (OUP)","issue":"15","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2010,8,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>Motivation: The limited availability of protein structures often restricts the functional annotation of proteins and the identification of their protein\u2013protein interaction sites. Computational methods to identify interaction sites from protein sequences alone are, therefore, required for unraveling the functions of many proteins. This article describes a new method (PSIVER) to predict interaction sites, i.e. residues binding to other proteins, in protein sequences. Only sequence features (position-specific scoring matrix and predicted accessibility) are used for training a Na\u00efve Bayes classifier (NBC), and conditional probabilities of each sequence feature are estimated using a kernel density estimation method (KDE).<\/jats:p>\n               <jats:p>Results: The leave-one out cross-validation of PSIVER achieved a Matthews correlation coefficient (MCC) of 0.151, an F-measure of 35.3%, a precision of 30.6% and a recall of 41.6% on a non-redundant set of 186 protein sequences extracted from 105 heterodimers in the Protein Data Bank (consisting of 36 219 residues, of which 15.2% were known interface residues). Even though the dataset used for training was highly imbalanced, a randomization test demonstrated that the proposed method managed to avoid overfitting. 
PSIVER was also tested on 72 sequences not used in training (consisting of 18 140 residues, of which 10.6% were known interface residues), and achieved an MCC of 0.135, an F-measure of 31.5%, a precision of 25.0% and a recall of 46.5%, outperforming other publicly available servers tested on the same dataset. PSIVER enables experimental biologists to identify potential interface residues in unknown proteins from sequence information alone, and to mutate those residues selectively in order to unravel protein functions.<\/jats:p>\n               <jats:p>Availability: Freely available on the web at http:\/\/tardis.nibio.go.jp\/PSIVER\/<\/jats:p>\n               <jats:p>Contact: \u00a0yoichi@nibio.go.jp; kenji@nibio.go.jp<\/jats:p>\n               <jats:p>Supplementary information: \u00a0Supplementary data are available at Bioinformatics online.<\/jats:p>","DOI":"10.1093\/bioinformatics\/btq302","type":"journal-article","created":{"date-parts":[[2010,6,8]],"date-time":"2010-06-08T01:20:28Z","timestamp":1275960028000},"page":"1841-1848","source":"Crossref","is-referenced-by-count":276,"title":["Applying the Na\u00efve Bayes classifier with kernel density estimation to the prediction of protein\u2013protein interaction sites"],"prefix":"10.1093","volume":"26","author":[{"given":"Yoichi","family":"Murakami","sequence":"first","affiliation":[{"name":"National Institute of Biomedical Innovation, Osaka, Japan"}]},{"given":"Kenji","family":"Mizuguchi","sequence":"additional","affiliation":[{"name":"National Institute of Biomedical Innovation, Osaka, Japan"}]}],"member":"286","published-online":{"date-parts":[[2010,6,6]]},"reference":[{"key":"2023012507594959300_B1","doi-asserted-by":"crossref","first-page":"467","DOI":"10.1002\/prot.20441","article-title":"Combining prediction of secondary structure and solvent accessibility in 
proteins","volume":"59","author":"Adamczak","year":"2005","journal-title":"Proteins"},{"key":"2023012507594959300_B2","doi-asserted-by":"crossref","first-page":"3389","DOI":"10.1093\/nar\/25.17.3389","article-title":"Gapped BLAST and PSI-BLAST: a new generation of protein database search programs","volume":"25","author":"Altschul","year":"1997","journal-title":"Nucleic Acids Res."},{"key":"2023012507594959300_B3","doi-asserted-by":"crossref","first-page":"412","DOI":"10.1093\/bioinformatics\/16.5.412","article-title":"Assessing the accuracy of prediction algorithms for classification: an overview","volume":"16","author":"Baldi","year":"2000","journal-title":"Bioinformatics"},{"key":"2023012507594959300_B4","doi-asserted-by":"crossref","first-page":"235","DOI":"10.1093\/nar\/28.1.235","article-title":"The Protein Data Bank","volume":"28","author":"Berman","year":"2000","journal-title":"Nucleic Acids Res."},{"key":"2023012507594959300_B5","doi-asserted-by":"crossref","first-page":"1335","DOI":"10.1093\/bioinformatics\/btl079","article-title":"Predicting protein interaction sites: binding hot-spots in protein-protein and protein-ligand interfaces","volume":"22","author":"Burgoyne","year":"2006","journal-title":"Bioinformatics"},{"key":"2023012507594959300_B6","doi-asserted-by":"crossref","first-page":"585","DOI":"10.1093\/bioinformatics\/btp039","article-title":"Sequence-based prediction of protein interaction sites with an integrative method","volume":"25","author":"Chen","year":"2009","journal-title":"Bioinformatics"},{"key":"2023012507594959300_B7","doi-asserted-by":"crossref","first-page":"313","DOI":"10.1093\/nar\/26.1.313","article-title":"The HSSP database of protein structure-sequence alignments and family profiles","volume":"26","author":"Dodge","year":"1998","journal-title":"Nucleic Acids Res."},{"key":"2023012507594959300_B8","doi-asserted-by":"crossref","first-page":"1356","DOI":"10.1046\/j.1432-1033.2002.02767.x","article-title":"Prediction of 
protein\u2013protein interaction sites in heterocomplexes with neural networks","volume":"269","author":"Fariselli","year":"2002","journal-title":"Eur. J. Biochem."},{"key":"2023012507594959300_B9","doi-asserted-by":"crossref","first-page":"233","DOI":"10.1093\/bib\/bbp021","article-title":"Progress and challenges in predicting protein-protein interaction sites","volume":"10","author":"Ezkurdia","year":"2009","journal-title":"Brief. Bioinform."},{"key":"2023012507594959300_B10","doi-asserted-by":"crossref","first-page":"134","DOI":"10.1002\/prot.20285","article-title":"Optimal docking area: a new method for predicting protein-protein interaction sites","volume":"58","author":"Fernandez-Recio","year":"2005","journal-title":"Proteins"},{"key":"2023012507594959300_B11","doi-asserted-by":"crossref","first-page":"296","DOI":"10.1197\/jamia.M1733","article-title":"Agreement, the f-measure, and reliability in information retrieval","volume":"12","author":"Hripcsak","year":"2005","journal-title":"J. Am. Med. Inform. Assoc."},{"key":"2023012507594959300_B12","volume-title":"\u2018NACCESS\u2019, Computer Program.","author":"Hubbard","year":"1993"},{"key":"2023012507594959300_B13","doi-asserted-by":"crossref","first-page":"705","DOI":"10.1002\/prot.22106","article-title":"Protein-protein docking benchmark version 3.0","volume":"73","author":"Hwang","year":"2008","journal-title":"Proteins"},{"key":"2023012507594959300_B14","doi-asserted-by":"crossref","first-page":"133","DOI":"10.1006\/jmbi.1997.1233","article-title":"Prediction of protein-protein interaction sites using patch analysis","volume":"272","author":"Jones","year":"1997","journal-title":"J. Mol. Biol."},{"key":"2023012507594959300_B15","doi-asserted-by":"crossref","first-page":"121","DOI":"10.1006\/jmbi.1997.1234","article-title":"Analysis of protein-protein interaction sites using surface patches","volume":"272","author":"Jones","year":"1997","journal-title":"J. Mol. 
Biol."},{"key":"2023012507594959300_B16","doi-asserted-by":"crossref","first-page":"379","DOI":"10.1016\/0022-2836(71)90324-X","article-title":"The interpretation of protein structures: estimation of static accessibility","volume":"55","author":"Lee","year":"1971","journal-title":"J. Mol. Biol."},{"key":"2023012507594959300_B17","doi-asserted-by":"crossref","first-page":"442","DOI":"10.1016\/0005-2795(75)90109-9","article-title":"Comparison of the predicted and observed secondary structure of T4 phage lysozyme","volume":"405","author":"Matthews","year":"1975","journal-title":"Biochim. Biophys. Acta"},{"key":"2023012507594959300_B18","doi-asserted-by":"crossref","first-page":"214","DOI":"10.1002\/prot.20560","article-title":"Protein-Protein Docking Benchmark 2.0: an update","volume":"60","author":"Mintseris","year":"2005","journal-title":"Proteins"},{"key":"2023012507594959300_B19","volume-title":"Machine Learning.","author":"Mitchell","year":"1997"},{"key":"2023012507594959300_B20","doi-asserted-by":"crossref","first-page":"536","DOI":"10.1016\/S0022-2836(05)80134-2","article-title":"SCOP: a structural classification of proteins database for the investigation of sequences and structures","volume":"247","author":"Murzin","year":"1995","journal-title":"J. Mol. Biol."},{"key":"2023012507594959300_B21","doi-asserted-by":"crossref","first-page":"181","DOI":"10.1016\/j.jmb.2004.02.040","article-title":"ProMate: a structure based prediction program to identify the location of protein-protein binding sites","volume":"338","author":"Neuvirth","year":"2004","journal-title":"J. Mol. Biol."},{"key":"2023012507594959300_B22","doi-asserted-by":"crossref","first-page":"991","DOI":"10.1016\/S0022-2836(02)01281-0","article-title":"Structural characterisation and functional significance of transient protein-protein interactions","volume":"325","author":"Nooren","year":"2003","journal-title":"J. Mol. 
Biol."},{"key":"2023012507594959300_B23","doi-asserted-by":"crossref","first-page":"236","DOI":"10.1016\/S0014-5793(03)00456-3","article-title":"Predicted protein-protein interaction sites from local sequence information","volume":"544","author":"Ofran","year":"2003","journal-title":"FEBS Lett."},{"key":"2023012507594959300_B24","doi-asserted-by":"crossref","first-page":"e13","DOI":"10.1093\/bioinformatics\/btl303","article-title":"ISIS: interaction sites identified from sequence","volume":"23","author":"Ofran","year":"2007","journal-title":"Bioinformatics"},{"key":"2023012507594959300_B25","doi-asserted-by":"crossref","first-page":"1065","DOI":"10.1214\/aoms\/1177704472","article-title":"On estimation of a probability density function and mode","volume":"33","author":"Parzen","year":"1962","journal-title":"Ann. Math. Stat."},{"key":"2023012507594959300_B26","doi-asserted-by":"crossref","first-page":"630","DOI":"10.1002\/prot.21248","article-title":"Prediction-based fingerprints of protein-protein interactions","volume":"66","author":"Porollo","year":"2007","journal-title":"Proteins"},{"key":"2023012507594959300_B27","doi-asserted-by":"crossref","first-page":"490","DOI":"10.1002\/prot.20865","article-title":"Evaluation of different biological data and computational classification methods for use in protein interaction prediction","volume":"63","author":"Qi","year":"2006","journal-title":"Proteins"},{"key":"2023012507594959300_B28","doi-asserted-by":"crossref","first-page":"2496","DOI":"10.1093\/bioinformatics\/bti340","article-title":"An evolution based classifier for prediction of protein interfaces without using protein structures","volume":"21","author":"Res","year":"2005","journal-title":"Bioinformatics"},{"key":"2023012507594959300_B29","doi-asserted-by":"crossref","first-page":"666","DOI":"10.1038\/nchembio.119","article-title":"Targeting and tinkering with interaction networks","volume":"4","author":"Russell","year":"2008","journal-title":"Nat. Chem. 
Biol."},{"key":"2023012507594959300_B30","doi-asserted-by":"crossref","first-page":"317","DOI":"10.1023\/A:1009752403260","article-title":"On comparing classifiers: pitfalls to avoid and a recommended approach","volume":"1","author":"Salzberg","year":"1997","journal-title":"Data Mining and Knowledge Discovery"},{"key":"2023012507594959300_B31","doi-asserted-by":"crossref","first-page":"e1000278","DOI":"10.1371\/journal.pcbi.1000278","article-title":"Prediction of protein-protein interaction sites in sequences and 3D structures by random forests","volume":"5","author":"Sikic","year":"2009","journal-title":"PLoS Comput. Biol."},{"key":"2023012507594959300_B32","doi-asserted-by":"crossref","first-page":"W578","DOI":"10.1093\/nar\/gkm294","article-title":"RNABindR: a server for analyzing and predicting RNA-binding sites in proteins","volume":"35","author":"Terribilini","year":"2007","journal-title":"Nucleic Acids Res."},{"key":"2023012507594959300_B33","doi-asserted-by":"crossref","first-page":"2964","DOI":"10.1093\/bioinformatics\/bth340","article-title":"Transmembrane proteins in the Protein Data Bank: identification and classification","volume":"20","author":"Tusnady","year":"2004","journal-title":"Bioinformatics"},{"key":"2023012507594959300_B34","doi-asserted-by":"crossref","first-page":"355","DOI":"10.1089\/cmb.2005.12.355","article-title":"Linear regression models for solvent accessibility prediction in proteins","volume":"12","author":"Wagner","year":"2005","journal-title":"J. Comput. Biol."},{"key":"2023012507594959300_B35","doi-asserted-by":"crossref","first-page":"380","DOI":"10.1016\/j.febslet.2005.11.081","article-title":"Predicting protein interaction sites from residue spatial sequence profile and evolution rate","volume":"580","author":"Wang","year":"2006","journal-title":"FEBS Lett."},{"issue":"Suppl. 
1","key":"2023012507594959300_B36","doi-asserted-by":"crossref","first-page":"i371","DOI":"10.1093\/bioinformatics\/bth920","article-title":"A two-stage classifier for identification of protein-protein interface residues","volume":"20","author":"Yan","year":"2004","journal-title":"Bioinformatics"},{"key":"2023012507594959300_B37","doi-asserted-by":"crossref","first-page":"262","DOI":"10.1186\/1471-2105-7-262","article-title":"Predicting DNA-binding sites of proteins from amino acid sequence","volume":"7","author":"Yan","year":"2006","journal-title":"BMC Bioinformatics"},{"key":"2023012507594959300_B38","doi-asserted-by":"crossref","first-page":"2203","DOI":"10.1093\/bioinformatics\/btm323","article-title":"Interaction-site prediction for protein complexes: a critical assessment","volume":"23","author":"Zhou","year":"2007","journal-title":"Bioinformatics"},{"key":"2023012507594959300_B39","doi-asserted-by":"crossref","first-page":"336","DOI":"10.1002\/prot.1099","article-title":"Prediction of protein interaction sites from sequence profile and residue neighbor 
list","volume":"44","author":"Zhou","year":"2001","journal-title":"Proteins"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/26\/15\/1841\/48854376\/bioinformatics_26_15_1841.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/26\/15\/1841\/48854376\/bioinformatics_26_15_1841.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,1,25]],"date-time":"2023-01-25T08:00:07Z","timestamp":1674633607000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/26\/15\/1841\/189350"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2010,6,6]]},"references-count":39,"journal-issue":{"issue":"15","published-print":{"date-parts":[[2010,8,1]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btq302","relation":{},"ISSN":["1367-4811","1367-4803"],"issn-type":[{"value":"1367-4811","type":"electronic"},{"value":"1367-4803","type":"print"}],"subject":[],"published-other":{"date-parts":[[2010,8,1]]},"published":{"date-parts":[[2010,6,6]]}}}