{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,7]],"date-time":"2026-04-07T09:40:55Z","timestamp":1775554855931,"version":"3.50.1"},"reference-count":49,"publisher":"Oxford University Press (OUP)","issue":"2","license":[{"start":{"date-parts":[[2017,9,18]],"date-time":"2017-09-18T00:00:00Z","timestamp":1505692800000},"content-version":"vor","delay-in-days":0,"URL":"http:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/100008568","name":"CASE","doi-asserted-by":"publisher","award":["BB\/J013110\/1"],"award-info":[{"award-number":["BB\/J013110\/1"]}],"id":[{"id":"10.13039\/100008568","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2018,1,15]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:sec><jats:title>Motivation<\/jats:title><jats:p>Protein\u2013protein interactions are vital for protein function with the average protein having between three and ten interacting partners. Knowledge of precise protein\u2013protein interfaces comes from crystal structures deposited in the Protein Data Bank (PDB), but only 50% of structures in the PDB are complexes. There is therefore a need to predict protein\u2013protein interfaces in silico and various methods for this purpose. Here we explore the use of a predictor based on structural features and which exploits random forest machine learning, comparing its performance with a number of popular established methods.<\/jats:p><\/jats:sec><jats:sec><jats:title>Results<\/jats:title><jats:p>On an independent test set of obligate and transient complexes, our IntPred predictor performs well (MCC\u2009=\u20090.370, ACC\u2009=\u20090.811, SPEC\u2009=\u20090.916, SENS\u2009=\u20090.411) and compares favourably with other methods. Overall, IntPred ranks second of six methods tested with SPPIDER having slightly better overall performance (MCC\u2009=\u20090.410, ACC\u2009=\u20090.759, SPEC\u2009=\u20090.783, SENS\u2009=\u20090.676), but considerably worse specificity than IntPred. As with SPPIDER, using an independent test set of obligate complexes enhanced performance (MCC\u2009=\u20090.381) while performance is somewhat reduced on a dataset of transient complexes (MCC\u2009=\u20090.303). The trade-off between sensitivity and specificity compared with SPPIDER suggests that the choice of the appropriate tool is application-dependent.<\/jats:p><\/jats:sec><jats:sec><jats:title>Availability and implementation<\/jats:title><jats:p>IntPred is implemented in Perl and may be downloaded for local use or run via a web server at www.bioinf.org.uk\/intpred\/.<\/jats:p><\/jats:sec><jats:sec><jats:title>Supplementary information<\/jats:title><jats:p>Supplementary data are available at Bioinformatics online.<\/jats:p><\/jats:sec>","DOI":"10.1093\/bioinformatics\/btx585","type":"journal-article","created":{"date-parts":[[2017,9,15]],"date-time":"2017-09-15T19:13:59Z","timestamp":1505502839000},"page":"223-229","source":"Crossref","is-referenced-by-count":92,"title":["IntPred: a structure-based predictor of protein\u2013protein interaction sites"],"prefix":"10.1093","volume":"34","author":[{"given":"Thomas C","family":"Northey","sequence":"first","affiliation":[{"name":"Institute of Structural and Molecular Biology, Division of Biosciences, University College London, London, UK"}]},{"given":"Anja","family":"Bare\u0161i\u0107","sequence":"additional","affiliation":[{"name":"Institute of Structural and Molecular Biology, Division of Biosciences, University College London, London, UK"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-2835-2572","authenticated-orcid":false,"given":"Andrew C R","family":"Martin","sequence":"additional","affiliation":[{"name":"Institute of Structural and Molecular Biology, Division of Biosciences, University College London, London, UK"}]}],"member":"286","published-online":{"date-parts":[[2017,9,18]]},"reference":[{"key":"2023012712224251800_btx585-B1","doi-asserted-by":"crossref","first-page":"S4","DOI":"10.1186\/1471-2164-14-S3-S4","article-title":"The SAAP pipeline and database: tools to analyze the impact and predict the pathogenicity of mutations","volume":"14","author":"Al-Numair","year":"2013","journal-title":"BMC Genomics"},{"key":"2023012712224251800_btx585-B2","doi-asserted-by":"crossref","first-page":"2947","DOI":"10.1093\/bioinformatics\/btw362","article-title":"The structural effects of mutations can aid in differential phenotype prediction of beta-myosin heavy chain (Myosin-7) missense variants","volume":"32","author":"Al-Numair","year":"2016","journal-title":"Bioinformatics"},{"key":"2023012712224251800_btx585-B3","doi-asserted-by":"crossref","first-page":"403","DOI":"10.1016\/S0022-2836(05)80360-2","article-title":"Basic local alignment search tool","volume":"215","author":"Altschul","year":"1990","journal-title":"J. Mol. Biol"},{"key":"2023012712224251800_btx585-B4","doi-asserted-by":"crossref","first-page":"97","DOI":"10.1016\/0079-6107(84)90007-5","article-title":"Hydrogen bonding in globular proteins","volume":"44","author":"Baker","year":"1984","journal-title":"Prog. Biophys. Mol. Biol"},{"key":"2023012712224251800_btx585-B5","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1006\/jmbi.1998.1843","article-title":"Anatomy of hot spots in protein interfaces","volume":"280","author":"Bogan","year":"1998","journal-title":"J. Mol. Biol"},{"key":"2023012712224251800_btx585-B6","doi-asserted-by":"crossref","first-page":"353","DOI":"10.1002\/prot.20433","article-title":"Statistical analysis and prediction of protein\u2013protein interfaces","volume":"60","author":"Bordner","year":"2005","journal-title":"Proteins"},{"key":"2023012712224251800_btx585-B7","doi-asserted-by":"crossref","first-page":"292","DOI":"10.1016\/j.sbi.2004.05.003","article-title":"Protein interaction networks from yeast to human","volume":"14","author":"Bork","year":"2004","journal-title":"Curr. Opin. Struct. Biol"},{"key":"2023012712224251800_btx585-B8","doi-asserted-by":"crossref","first-page":"1487","DOI":"10.1093\/bioinformatics\/bti242","article-title":"Improved prediction of protein\u2013protein binding sites using a support vector machines approach","volume":"21","author":"Bradford","year":"2005","journal-title":"Bioinformatics"},{"key":"2023012712224251800_btx585-B9","doi-asserted-by":"crossref","first-page":"5","DOI":"10.1023\/A:1010933404324","article-title":"Random forests","volume":"45","author":"Breiman","year":"2001","journal-title":"Mach. Learn"},{"key":"2023012712224251800_btx585-B10","doi-asserted-by":"crossref","first-page":"21","DOI":"10.1002\/prot.20514","article-title":"Prediction of interface residues in protein\u2013protein complexes by a consensus neural network method: test against NMR data","volume":"61","author":"Chen","year":"2005","journal-title":"Proteins"},{"key":"2023012712224251800_btx585-B11","doi-asserted-by":"crossref","first-page":"630","DOI":"10.1002\/prot.20741","article-title":"Exploiting sequence and structure homologs to identify protein\u2013protein binding sites","volume":"62","author":"Chung","year":"2005","journal-title":"Proteins"},{"key":"2023012712224251800_btx585-B12","doi-asserted-by":"crossref","first-page":"3460","DOI":"10.1093\/bioinformatics\/btv398","article-title":"Functional classification of CATH superfamilies: a domain-based approach for protein function annotation","volume":"31","author":"Das","year":"2015","journal-title":"Bioinformatics"},{"key":"2023012712224251800_btx585-B13","doi-asserted-by":"crossref","first-page":"394","DOI":"10.2174\/138920308785132712","article-title":"How proteins get in touch: interface prediction in the study of biomolecular complexes","volume":"9","author":"de Vries","year":"2008","journal-title":"Curr. Protein Pept. Sci"},{"key":"2023012712224251800_btx585-B14","doi-asserted-by":"crossref","first-page":"1792","DOI":"10.1093\/nar\/gkh340","article-title":"MUSCLE: multiple sequence alignment with high accuracy and high throughput","volume":"32","author":"Edgar","year":"2004","journal-title":"Nucleic Acids Res"},{"key":"2023012712224251800_btx585-B15","doi-asserted-by":"crossref","first-page":"117","DOI":"10.1093\/bib\/bbv027","article-title":"Progress and challenges in predicting protein interfaces","volume":"17","author":"Esmaielbeiki","year":"2016","journal-title":"Brief. Bioinf"},{"key":"2023012712224251800_btx585-B16","doi-asserted-by":"crossref","first-page":"1356","DOI":"10.1046\/j.1432-1033.2002.02767.x","article-title":"Prediction of protein\u2013protein interaction sites in heterocomplexes with neural networks","volume":"269","author":"Fariselli","year":"2002","journal-title":"Eur. J. Biochem"},{"key":"2023012712224251800_btx585-B17","doi-asserted-by":"crossref","first-page":"215","DOI":"10.1098\/rsif.2006.0115","article-title":"Targeting protein\u2013protein interactions by rational design: mimicry of protein surfaces","volume":"3","author":"Fletcher","year":"2006","journal-title":"J. R. Soc. Interface"},{"key":"2023012712224251800_btx585-B18","doi-asserted-by":"crossref","first-page":"605","DOI":"10.1093\/bioinformatics\/btl683","article-title":"Comparison of human protein\u2013protein interaction maps","volume":"23","author":"Futschik","year":"2007","journal-title":"Bioinformatics"},{"key":"2023012712224251800_btx585-B19","doi-asserted-by":"crossref","first-page":"49","DOI":"10.1186\/1471-2156-11-49","article-title":"An application of Random Forests to a genome-wide association dataset: methodological considerations and new findings","volume":"11","author":"Goldstein","year":"2010","journal-title":"BMC Genet"},{"key":"2023012712224251800_btx585-B20","doi-asserted-by":"crossref","first-page":"10","DOI":"10.1145\/1656274.1656278","article-title":"The weka data mining software: An update","volume":"11","author":"Hall","year":"2009","journal-title":"SIGKDD Explor. Newsl"},{"key":"2023012712224251800_btx585-B21","doi-asserted-by":"crossref","DOI":"10.1007\/978-0-387-84858-7","volume-title":"The Elements of Statistical Learning","author":"Hastie","year":"2009","edition":"2nd edn."},{"key":"2023012712224251800_btx585-B22","doi-asserted-by":"crossref","first-page":"119","DOI":"10.1093\/protein\/2.2.119","article-title":"Model building of disulfide bonds in proteins with known three-dimensional structure","volume":"2","author":"Hazes","year":"1988","journal-title":"Protein Eng"},{"key":"2023012712224251800_btx585-B23","doi-asserted-by":"crossref","first-page":"121","DOI":"10.1006\/jmbi.1997.1234","article-title":"Analysis of protein\u2013protein interaction sites using surface patches","volume":"272","author":"Jones","year":"1997","journal-title":"J. Mol. Biol"},{"key":"2023012712224251800_btx585-B24","doi-asserted-by":"crossref","first-page":"2577","DOI":"10.1002\/bip.360221211","article-title":"Dictionary of protein secondary structure: Pattern recognition of hydrogen-bonded and geometrical features","volume":"22","author":"Kabsch","year":"1983","journal-title":"Biopolymers"},{"key":"2023012712224251800_btx585-B25","doi-asserted-by":"crossref","first-page":"1225","DOI":"10.1021\/cr040409x","article-title":"Principles of protein\u2013protein interactions: what are the preferred ways for proteins to interact?","volume":"108","author":"Keskin","year":"2008","journal-title":"Chem. Rev"},{"key":"2023012712224251800_btx585-B26","doi-asserted-by":"crossref","first-page":"165","DOI":"10.1093\/protein\/gzh020","article-title":"Prediction of protein\u2013protein interaction sites using support vector machines","volume":"17","author":"Koike","year":"2004","journal-title":"Protein Eng. Des. Sel"},{"key":"2023012712224251800_btx585-B27","doi-asserted-by":"crossref","first-page":"774","DOI":"10.1016\/j.jmb.2007.05.022","article-title":"Inference of macromolecular assemblies from crystalline state","volume":"372","author":"Krissinel","year":"2007","journal-title":"J. Mol. Biol"},{"key":"2023012712224251800_btx585-B28","doi-asserted-by":"crossref","first-page":"400","DOI":"10.1002\/prot.21233","article-title":"PIER: protein interface recognition for structural proteomics","volume":"67","author":"Kufareva","year":"2007","journal-title":"Proteins Struct. Funct. Bioinf"},{"key":"2023012712224251800_btx585-B29","doi-asserted-by":"crossref","first-page":"105","DOI":"10.1016\/0022-2836(82)90515-0","article-title":"A simple method for displaying the hydropathic character of a protein","volume":"157","author":"Kyte","year":"1982","journal-title":"J. Mol. Biol"},{"key":"2023012712224251800_btx585-B30","doi-asserted-by":"crossref","first-page":"3698","DOI":"10.1093\/nar\/gkl454","article-title":"Protein binding site prediction using an empirical scoring function","volume":"34","author":"Liang","year":"2006","journal-title":"Nucleic Acids Res"},{"key":"2023012712224251800_btx585-B31","doi-asserted-by":"crossref","first-page":"2177","DOI":"10.1006\/jmbi.1998.2439","article-title":"The atomic structure of protein\u2013protein recognition sites","volume":"285","author":"Lo Conte","year":"1999","journal-title":"J. Mol. Biol"},{"key":"2023012712224251800_btx585-B32","doi-asserted-by":"crossref","first-page":"4297","DOI":"10.1093\/bioinformatics\/bti694","article-title":"Mapping PDB chains to UniProtKB entries","volume":"21","author":"Martin","year":"2005","journal-title":"Bioinformatics"},{"key":"2023012712224251800_btx585-B33","doi-asserted-by":"crossref","first-page":"418","DOI":"10.1186\/1471-2105-9-418","article-title":"Automatically extracting functionally equivalent proteins from SwissProt","volume":"9","author":"McMillan","year":"2008","journal-title":"BMC Bioinformatics"},{"key":"2023012712224251800_btx585-B34","doi-asserted-by":"crossref","first-page":"e1000350","DOI":"10.1371\/journal.pcbi.1000350","article-title":"Information flow analysis of interactome networks","volume":"5","author":"Missiuro","year":"2009","journal-title":"PLoS Comput. Biol"},{"key":"2023012712224251800_btx585-B35","doi-asserted-by":"crossref","first-page":"181","DOI":"10.1016\/j.jmb.2004.02.040","article-title":"ProMate: a structure based prediction program to identify the location of protein\u2013protein binding sites","volume":"338","author":"Neuvirth","year":"2004","journal-title":"J. Mol. Biol"},{"key":"2023012712224251800_btx585-B36","doi-asserted-by":"crossref","first-page":"236","DOI":"10.1016\/S0014-5793(03)00456-3","article-title":"Predicted protein\u2013protein interaction sites from local sequence information","volume":"544","author":"Ofran","year":"2003","journal-title":"FEBS Lett"},{"key":"2023012712224251800_btx585-B37","doi-asserted-by":"crossref","first-page":"863","DOI":"10.1016\/j.jmb.2007.03.036","article-title":"HotPatch: a statistical approach to finding biologically relevant features on protein surfaces","volume":"369","author":"Pettit","year":"2007","journal-title":"J. Mol. Biol"},{"key":"2023012712224251800_btx585-B38","doi-asserted-by":"crossref","first-page":"630","DOI":"10.1002\/prot.21248","article-title":"Prediction-based fingerprints of protein\u2013protein interactions","volume":"66","author":"Porollo","year":"2006","journal-title":"Proteins Struct. Funct. Bioinf"},{"key":"2023012712224251800_btx585-B39","doi-asserted-by":"crossref","first-page":"4017","DOI":"10.1093\/bioinformatics\/btv482","article-title":"BiopLib and BiopTools \u2013 a C programming library and toolset for manipulating protein structure","volume":"31","author":"Porter","year":"2015","journal-title":"Bioinformatics"},{"key":"2023012712224251800_btx585-B40","doi-asserted-by":"crossref","first-page":"3386","DOI":"10.1093\/bioinformatics\/btm434","article-title":"meta-PPISP: a meta web server for protein\u2013protein interaction site prediction","volume":"23","author":"Qin","year":"2007","journal-title":"Bioinformatics"},{"key":"2023012712224251800_btx585-B41","doi-asserted-by":"crossref","first-page":"547","DOI":"10.1038\/modpathol.3800322","article-title":"Tumor classification by tissue microarray profiling: random forest clustering applied to renal cell carcinoma","volume":"18","author":"Shi","year":"2005","journal-title":"Mod. Pathol"},{"key":"2023012712224251800_btx585-B42","doi-asserted-by":"crossref","first-page":"1947","DOI":"10.1021\/ci034160g","article-title":"Random forest: a classification and regression tool for compound classification and QSAR modeling","volume":"43","author":"Svetnik","year":"2003","journal-title":"J. Chem. Inf. Comput. Sci"},{"key":"2023012712224251800_btx585-B43","doi-asserted-by":"crossref","first-page":"108","DOI":"10.1002\/1097-0134(20010101)42:1<108::AID-PROT110>3.0.CO;2-O","article-title":"Protein\u2013protein interfaces: analysis of amino acid conservation in homodimers","volume":"42","author":"Valdar","year":"2001","journal-title":"Proteins"},{"key":"2023012712224251800_btx585-B44","doi-asserted-by":"crossref","first-page":"380","DOI":"10.1016\/j.febslet.2005.11.081","article-title":"Predicting protein interaction sites from residue spatial sequence profile and evolution rate","volume":"580","author":"Wang","year":"2006","journal-title":"FEBS Lett"},{"key":"2023012712224251800_btx585-B45","doi-asserted-by":"crossref","first-page":"1589","DOI":"10.1093\/bioinformatics\/btg224","article-title":"PISCES: a protein sequence culling server","volume":"19","author":"Wang","year":"2003","journal-title":"Bioinformatics"},{"key":"2023012712224251800_btx585-B46","volume-title":"Data Mining: Practical Machine Learning Tools and Techniques","author":"Witten","year":"2011","edition":"3rd edn."},{"key":"2023012712224251800_btx585-B47","doi-asserted-by":"crossref","first-page":"2203","DOI":"10.1093\/bioinformatics\/btm323","article-title":"Interaction-site prediction for protein complexes: a critical assessment","volume":"23","author":"Zhou","year":"2007","journal-title":"Bioinformatics"},{"key":"2023012712224251800_btx585-B48","doi-asserted-by":"crossref","first-page":"336","DOI":"10.1002\/prot.1099","article-title":"Prediction of protein interaction sites from sequence profile and residue neighbor list","volume":"44","author":"Zhou","year":"2001","journal-title":"Proteins"},{"key":"2023012712224251800_btx585-B49","doi-asserted-by":"crossref","first-page":"27","DOI":"10.1186\/1471-2105-7-27","article-title":"NOXclass: prediction of protein\u2013protein interaction types","volume":"7","author":"Zhu","year":"2006","journal-title":"BMC Bioinformatics"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/34\/2\/223\/48912920\/bioinformatics_34_2_223.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/34\/2\/223\/48912920\/bioinformatics_34_2_223.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,8,25]],"date-time":"2023-08-25T23:14:41Z","timestamp":1693005281000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/34\/2\/223\/4160676"}},"subtitle":[],"editor":[{"given":"Alfonso","family":"Valencia","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2017,9,18]]},"references-count":49,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2018,1,15]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btx585","relation":{},"ISSN":["1367-4803","1367-4811"],"issn-type":[{"value":"1367-4803","type":"print"},{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2018,1,15]]},"published":{"date-parts":[[2017,9,18]]}}}