{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,11,4]],"date-time":"2025-11-04T22:59:30Z","timestamp":1762297170492},"reference-count":38,"publisher":"Oxford University Press (OUP)","issue":"5","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2007,3,1]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Motivation: We are motivated by the fast-growing number of protein structures in the Protein Data Bank with necessary information for prediction of protein\u2013protein interaction sites to develop methods for identification of residues participating in protein\u2013protein interactions. We would like to compare conditional random fields (CRFs)-based method with conventional classification-based methods that omit the relation between two labels of neighboring residues to show the advantages of CRFs-based method in predicting protein\u2013protein interaction sites.<\/jats:p><jats:p>Results: The prediction of protein\u2013protein interaction sites is solved as a sequential labeling problem by applying CRFs with features including protein sequence profile and residue accessible surface area. The CRFs-based method can achieve a comparable performance with state-of-the-art methods, when 1276 nonredundant hetero-complex protein chains are used as training and test set. 
Experimental result shows that CRFs-based method is a powerful and robust protein\u2013protein interaction site prediction method and can be used to guide biologists to make specific experiments on proteins.<\/jats:p><jats:p>Availability: \u00a0http:\/\/www.insun.hit.edu.cn\/~mhli\/site_CRFs\/index.html<\/jats:p><jats:p>Contact: \u00a0mhli@insun.hit.edu.cn<\/jats:p><jats:p>Supplementary information: Supplementary data are available at Bioinformatics online.<\/jats:p>","DOI":"10.1093\/bioinformatics\/btl660","type":"journal-article","created":{"date-parts":[[2007,1,19]],"date-time":"2007-01-19T01:13:12Z","timestamp":1169169192000},"page":"597-604","source":"Crossref","is-referenced-by-count":67,"title":["Protein\u2013protein interaction site prediction based on conditional random fields"],"prefix":"10.1093","volume":"23","author":[{"given":"Ming-Hui","family":"Li","sequence":"first","affiliation":[{"name":"Bioinformatics Research Group, ITNLP Lab, Department of Computer Science and Technology, Harbin Institute of Technology, Harbin, China"}]},{"given":"Lei","family":"Lin","sequence":"additional","affiliation":[{"name":"Bioinformatics Research Group, ITNLP Lab, Department of Computer Science and Technology, Harbin Institute of Technology, Harbin, China"}]},{"given":"Xiao-Long","family":"Wang","sequence":"additional","affiliation":[{"name":"Bioinformatics Research Group, ITNLP Lab, Department of Computer Science and Technology, Harbin Institute of Technology, Harbin, China"}]},{"given":"Tao","family":"Liu","sequence":"additional","affiliation":[{"name":"Bioinformatics Research Group, ITNLP Lab, Department of Computer Science and Technology, Harbin Institute of Technology, Harbin, China"}]}],"member":"286","published-online":{"date-parts":[[2007,1,18]]},"reference":[{"key":"2023041109375461500_","doi-asserted-by":"crossref","first-page":"3389","DOI":"10.1093\/nar\/25.17.3389","article-title":"Gapped BLAST and PSI-BLAST: a new generation of protein database search 
programs","volume":"25","author":"Altschul","year":"1997","journal-title":"Nucleic Acids Res."},{"key":"2023041109375461500_","doi-asserted-by":"crossref","first-page":"235","DOI":"10.1093\/nar\/28.1.235","article-title":"The Protein Data Bank","volume":"28","author":"Berman","year":"2000","journal-title":"Nucleic Acids Res."},{"key":"2023041109375461500_","doi-asserted-by":"crossref","first-page":"1487","DOI":"10.1093\/bioinformatics\/bti242","article-title":"Improved prediction of protein\u2013protein binding sites using a support vector machines approach","volume":"21","author":"Bradford","year":"2005","journal-title":"Bioinformatics"},{"key":"2023041109375461500_","doi-asserted-by":"crossref","first-page":"340","DOI":"10.1038\/35030019","article-title":"Functional insights from the structure of the 30S ribosomal subunit and its interactions with antibiotics","volume":"407","author":"Carter","year":"2000","journal-title":"Nature"},{"key":"2023041109375461500_","unstructured":"Chang C-C, Lin C-J. LIBSVM: a library for support vector machines. 2001. Software available at http:\/\/www.csie.ntu.edu.tw\/~cjlin\/libsvm"},{"key":"2023041109375461500_","doi-asserted-by":"crossref","first-page":"21","DOI":"10.1002\/prot.20514","article-title":"Prediction of interface residues in protein\u2013protein complexes by a consensus neural network method: test against NMR data","volume":"61","author":"Chen","year":"2005","journal-title":"Proteins Struct. Funct. Bioinfo."},{"key":"2023041109375461500_","article-title":"MUC-7 Named Entity Task Definition","volume-title":"Proc. of the Seventh Message Understanding Conference","author":"Chinchor","year":"1998"},{"key":"2023041109375461500_","doi-asserted-by":"crossref","first-page":"630","DOI":"10.1002\/prot.20741","article-title":"Exploiting sequence and structure homologs to identify protein\u2013protein binding sites","volume":"63","author":"Chung","year":"2006","journal-title":"Proteins Struct. Funct. 
Bioinfo."},{"key":"2023041109375461500_","doi-asserted-by":"crossref","first-page":"1731","DOI":"10.1021\/ja026939x","article-title":"HADDOCK: a protein\u2013protein docking approach based on biochemical or biophysical information","volume":"125","author":"Dominguez","year":"2003","journal-title":"J. Am. Chem. Soc."},{"key":"2023041109375461500_","doi-asserted-by":"crossref","first-page":"1356","DOI":"10.1046\/j.1432-1033.2002.02767.x","article-title":"Prediction of protein-protein interaction sites in heterocomplexes with neural networks","volume":"269","author":"Fariselli","year":"2002","journal-title":"Eur. J. Biochem."},{"key":"2023041109375461500_","first-page":"584","article-title":"Information extraction with HMM structures learned by stochastic optimization","author":"Freitag","year":"2000"},{"key":"2023041109375461500_","doi-asserted-by":"crossref","first-page":"610","DOI":"10.1002\/prot.20305","article-title":"The ConSurf-HSSP database: the mapping of evolutionary conservation among homologs onto PDB structures","volume":"58","author":"Glaser","year":"2005","journal-title":"Proteins Struct. Funct. Bioinfo."},{"key":"2023041109375461500_","doi-asserted-by":"crossref","first-page":"951","DOI":"10.1016\/j.molcel.2004.08.030","article-title":"Structure and stability of cohesin's Smc1-kleisin interaction","volume":"15","author":"Haering","year":"2004","journal-title":"Mol. 
Cell"},{"key":"2023041109375461500_","doi-asserted-by":"crossref","first-page":"235","DOI":"10.1002\/bip.360221211","article-title":"Dictionary of protein secondary structure: pattern recognition of hydrogen bonded and geometrical features","volume":"22","author":"Kabsch","year":"1983","journal-title":"Biopolymers"},{"key":"2023041109375461500_","doi-asserted-by":"crossref","first-page":"553","DOI":"10.1093\/protein\/gzg072","article-title":"Protein secondary structure prediction based on an improved support vector machines approach","volume":"16","author":"Kim","year":"2003","journal-title":"Protein Eng. Des. Sel."},{"key":"2023041109375461500_","doi-asserted-by":"crossref","first-page":"165","DOI":"10.1093\/protein\/gzh020","article-title":"Prediction of protein\u2013protein interaction sites using support vector machines","volume":"17","author":"Koike","year":"2004","journal-title":"Protein Eng. Des. and Sel."},{"key":"2023041109375461500_","first-page":"282","article-title":"Conditional random fields: probabilistic models for segmenting and labeling sequence data","author":"Lafferty","year":"2001"},{"key":"2023041109375461500_","doi-asserted-by":"crossref","first-page":"i197","DOI":"10.1093\/bioinformatics\/btg1026","article-title":"Predicting protein function from protein\/protein interaction data: a probabilistic approach","volume":"19","author":"Letovsky","year":"2003","journal-title":"Bioinformatics"},{"key":"2023041109375461500_","doi-asserted-by":"crossref","first-page":"3698","DOI":"10.1093\/nar\/gkl454","article-title":"Protein binding site prediction using an empirical scoring function","volume":"34","author":"Liang","year":"2006","journal-title":"Nucleic Acids Res."},{"key":"2023041109375461500_","doi-asserted-by":"crossref","first-page":"536","DOI":"10.1016\/S0076-6879(02)44739-8","article-title":"Evolutionary traces of functional surfaces along G protein signaling pathway","volume":"344","author":"Lichtarge","year":"2002","journal-title":"Methods 
Enzymol."},{"key":"2023041109375461500_","doi-asserted-by":"crossref","first-page":"3099","DOI":"10.1093\/bioinformatics\/bth370","article-title":"Comparison of probabilistic combination methods for protein secondary structure prediction","volume":"20","author":"Liu","year":"2004","journal-title":"Bioinformatics"},{"key":"2023041109375461500_","first-page":"408","article-title":"Segmentation conditional random fields (SCRFs): a new approach for protein fold recognition","author":"Liu","year":"2005"},{"key":"2023041109375461500_","first-page":"591","article-title":"Maximum entropy Markov models for information extraction and segmentation","author":"McCallum","year":"2000"},{"key":"2023041109375461500_","doi-asserted-by":"crossref","first-page":"W20","DOI":"10.1093\/nar\/gkh435","article-title":"BLAST: at the core of a powerful and diverse set of sequence analysis tools","volume":"32","author":"McGinnis","year":"2004","journal-title":"Nucleic Acids Res."},{"key":"2023041109375461500_","doi-asserted-by":"crossref","first-page":"i302","DOI":"10.1093\/bioinformatics\/bti1054","article-title":"Whole-proteome prediction of protein function via graph-theoretic analysis of interaction maps","volume":"21","author":"Nabieva","year":"2005","journal-title":"Bioinformatics"},{"key":"2023041109375461500_","doi-asserted-by":"crossref","first-page":"236","DOI":"10.1016\/S0014-5793(03)00456-3","article-title":"Predicted protein\u2013protein interaction sites from local sequence information","volume":"544","author":"Ofran","year":"2003","journal-title":"FEBS Letters"},{"key":"2023041109375461500_","unstructured":"Phan X-H, Nguyen L-M. FlexCRFs: flexible conditional random field toolkit. 2005. http:\/\/www.jaist.ac.jp\/~hieuxuan\/flexcrfs\/flexcrfs.html"},{"key":"2023041109375461500_","doi-asserted-by":"crossref","first-page":"257","DOI":"10.1109\/5.18626","article-title":"A tutorial on hidden Markov models and selected applications in speech 
recognition","volume":"77","author":"Rabiner","year":"1989","journal-title":"Proc. of the IEEE"},{"key":"2023041109375461500_","article-title":"A maximum entropy model for part-of-speech tagging","author":"Ratnaparkhi","year":"1996"},{"key":"2023041109375461500_","doi-asserted-by":"crossref","first-page":"2496","DOI":"10.1093\/bioinformatics\/bti340","article-title":"An evolution based classifier for prediction of protein interfaces without using protein structures","volume":"21","author":"Res","year":"2005","journal-title":"Bioinformatics"},{"key":"2023041109375461500_","doi-asserted-by":"crossref","first-page":"187","DOI":"10.1006\/csla.1996.0011","article-title":"A maximum entropy approach to adaptive statistical language modeling","volume":"10","author":"Rosenfeld","year":"1996","journal-title":"Computer, Speech and Language"},{"key":"2023041109375461500_","doi-asserted-by":"crossref","first-page":"216","DOI":"10.1002\/prot.340200303","article-title":"Conservation and prediction of solvent accessibility in protein families","volume":"20","author":"Rost","year":"1994","journal-title":"Proteins Struct. Funct. 
Gen."},{"key":"2023041109375461500_","doi-asserted-by":"crossref","first-page":"234","DOI":"10.1038\/84974","article-title":"Prediction and confirmation of a site critical for effector regulation of RGS domain activity","volume":"8","author":"Sowa","year":"2001","journal-title":"Nat. Struct. Biol."},{"key":"2023041109375461500_","article-title":"An introduction to conditional random fields for relational learning","volume-title":"Introduction to Statistical Relational Learning","author":"Sutton","year":"2006"},{"key":"2023041109375461500_","doi-asserted-by":"crossref","first-page":"687","DOI":"10.1016\/S0969-2126(02)00759-1","article-title":"Structures of two streptococcal superantigens bound to TCR beta chains reveal diversity in the architecture of T cell signaling complexes","volume":"10","author":"Sundberg","year":"2002","journal-title":"Structure"},{"key":"2023041109375461500_","doi-asserted-by":"crossref","first-page":"380","DOI":"10.1016\/j.febslet.2005.11.081","article-title":"Predicting protein interaction sites from residue spatial sequence profile and evolution rate","volume":"580","author":"Wang","year":"2006","journal-title":"FEBS Letters"},{"key":"2023041109375461500_","doi-asserted-by":"crossref","first-page":"539","DOI":"10.2174\/0929867043455800","article-title":"Improving the understanding of human genetic diseases through predictions of protein structures and protein\u2013protein interaction sites","volume":"11","author":"Zhou","year":"2004","journal-title":"Curr. Med. Chem."},{"key":"2023041109375461500_","doi-asserted-by":"crossref","first-page":"336","DOI":"10.1002\/prot.1099","article-title":"Prediction of protein interaction sites from sequence profile and residue neighbor list","volume":"44","author":"Zhou","year":"2001","journal-title":"Proteins Struct. Funct. 
and Gen."}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/23\/5\/597\/49830116\/bioinformatics_23_5_597.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/23\/5\/597\/49830116\/bioinformatics_23_5_597.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,2,10]],"date-time":"2024-02-10T09:04:02Z","timestamp":1707555842000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/23\/5\/597\/238481"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2007,1,18]]},"references-count":38,"journal-issue":{"issue":"5","published-print":{"date-parts":[[2007,3,1]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btl660","relation":{},"ISSN":["1367-4811","1367-4803"],"issn-type":[{"value":"1367-4811","type":"electronic"},{"value":"1367-4803","type":"print"}],"subject":[],"published-other":{"date-parts":[[2007,3]]},"published":{"date-parts":[[2007,1,18]]}}}