{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,10]],"date-time":"2026-03-10T15:31:39Z","timestamp":1773156699879,"version":"3.50.1"},"reference-count":54,"publisher":"Oxford University Press (OUP)","issue":"1","license":[{"start":{"date-parts":[[2016,10,1]],"date-time":"2016-10-01T00:00:00Z","timestamp":1475280000000},"content-version":"vor","delay-in-days":2536,"URL":"http:\/\/creativecommons.org\/licenses\/by-nc\/2.0\/uk\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2010,1,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>Motivation: Identification and characterization of protein\u2013protein interactions (PPIs) is one of the key aims in biological research. While previous research in text mining has made substantial progress in automatic PPI detection from literature, the need to improve the precision and recall of the process remains. More accurate PPI detection will also improve the ability to extract experimental data related to PPIs and provide multiple evidence for each interaction.<\/jats:p>\n               <jats:p>Results: We developed an interaction detection method and explored the usefulness of various features in automatically identifying PPIs in text. The results show that our approach outperforms other systems using the AImed dataset. In the tests where our system achieves better precision with reduced recall, we discuss possible approaches for improvement. In addition to test datasets, we evaluated the performance on interactions from five human-curated databases\u2014BIND, DIP, HPRD, IntAct and MINT\u2014where our system consistently identified evidence for \u223c60% of interactions when both proteins appear in at least one sentence in the PubMed abstract. 
We then applied the system to extract articles from PubMed to annotate known, high-throughput and interologous interactions in I2D.<\/jats:p>\n               <jats:p>Availability: The data and software are available at: http:\/\/www.cs.utoronto.ca\/\u223cjuris\/data\/BI09\/.<\/jats:p>\n               <jats:p>Contact: \u00a0yniu@uhnres.utoronto.ca; juris@ai.utoronto.ca<\/jats:p>\n               <jats:p>Supplementary information: \u00a0Supplementary data are available at Bioinformatics online.<\/jats:p>","DOI":"10.1093\/bioinformatics\/btp602","type":"journal-article","created":{"date-parts":[[2009,10,23]],"date-time":"2009-10-23T01:53:19Z","timestamp":1256262799000},"page":"111-119","source":"Crossref","is-referenced-by-count":61,"title":["Evaluation of linguistic features useful in extraction of interactions from PubMed; Application to annotating known, high-throughput and predicted interactions in I2D"],"prefix":"10.1093","volume":"26","author":[{"given":"Yun","family":"Niu","sequence":"first","affiliation":[{"name":"1 Ontario Cancer Institute, UHN, 101 College Street, Toronto, Ontario M5G 1L7, 2 Nanjing University of Aeronautics and Astronautics, Nanjing, China, 3 Department of Computer Science and 4 Department of Medical Biophysics, University of Toronto, Toronto, Canada"},{"name":"1 Ontario Cancer Institute, UHN, 101 College Street, Toronto, Ontario M5G 1L7, 2 Nanjing University of Aeronautics and Astronautics, Nanjing, China, 3 Department of Computer Science and 4 Department of Medical Biophysics, University of Toronto, Toronto, Canada"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"David","family":"Otasek","sequence":"additional","affiliation":[{"name":"1 Ontario Cancer Institute, UHN, 101 College Street, Toronto, Ontario M5G 1L7, 2 Nanjing University of Aeronautics and Astronautics, Nanjing, China, 3 Department of Computer Science and 4 Department of Medical Biophysics, University of Toronto, Toronto, 
Canada"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Igor","family":"Jurisica","sequence":"additional","affiliation":[{"name":"1 Ontario Cancer Institute, UHN, 101 College Street, Toronto, Ontario M5G 1L7, 2 Nanjing University of Aeronautics and Astronautics, Nanjing, China, 3 Department of Computer Science and 4 Department of Medical Biophysics, University of Toronto, Toronto, Canada"},{"name":"1 Ontario Cancer Institute, UHN, 101 College Street, Toronto, Ontario M5G 1L7, 2 Nanjing University of Aeronautics and Astronautics, Nanjing, China, 3 Department of Computer Science and 4 Department of Medical Biophysics, University of Toronto, Toronto, Canada"},{"name":"1 Ontario Cancer Institute, UHN, 101 College Street, Toronto, Ontario M5G 1L7, 2 Nanjing University of Aeronautics and Astronautics, Nanjing, China, 3 Department of Computer Science and 4 Department of Medical Biophysics, University of Toronto, Toronto, Canada"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"286","published-online":{"date-parts":[[2009,10,22]]},"reference":[{"key":"2023012507532499800_B1","doi-asserted-by":"crossref","first-page":"242","DOI":"10.1093\/nar\/29.1.242","article-title":"BIND \u2013 the biomolecular interaction network database","volume":"29","author":"Bader","year":"2001","journal-title":"Nucleic Acids Res."},{"key":"2023012507532499800_B2","doi-asserted-by":"crossref","first-page":"1621","DOI":"10.1126\/science.1105776","article-title":"High-throughput mapping of a dynamic signaling network in mammalian cells","volume":"307","author":"Barrios-Rodiles","year":"2005","journal-title":"Science"},{"key":"2023012507532499800_B3","author":"BioCreAtIve","year":"2004","journal-title":"Critical assessment for information extraction in biology."},{"key":"2023012507532499800_B4","author":"BioCreAtIvE","year":"2006","journal-title":"Critical assessment for information extraction in 
biology."},{"key":"2023012507532499800_B5","doi-asserted-by":"crossref","first-page":"R95","DOI":"10.1186\/gb-2007-8-5-r95","article-title":"Unequal evolutionary conservation of human protein interactions in interologous networks","volume":"8","author":"Brown","year":"2007","journal-title":"Genome Biol."},{"key":"2023012507532499800_B6","doi-asserted-by":"crossref","first-page":"2076","DOI":"10.1093\/bioinformatics\/bti273","article-title":"Online Predicted Human Interaction Database OPHID","volume":"21","author":"Brown","year":"2005","journal-title":"Bioinformatics"},{"key":"2023012507532499800_B7","doi-asserted-by":"crossref","DOI":"10.1093\/bioinformatics\/btp595","article-title":"NAViGaTOR: Network analysis, visualization & graphing Toronto","author":"Brown","year":"2009","journal-title":"Bioinformatics"},{"key":"2023012507532499800_B8","doi-asserted-by":"crossref","first-page":"139","DOI":"10.1016\/j.artmed.2004.07.016","article-title":"Comparative experiments on learning information extractors for proteins and their interactions","volume":"33","author":"Bunescu","year":"2005","journal-title":"Artif. Intell. 
Med."},{"key":"2023012507532499800_B9","first-page":"171","article-title":"Subsequence kernels for relation extraction","volume-title":"Proceedings of the 19th Annual Conference on Neural Information Processing Systems","author":"Bunescu","year":"2005"},{"key":"2023012507532499800_B10","first-page":"100","article-title":"Unsupervised models for named entity classification","volume-title":"Proceedings of the Joint SIGDAT Conference on Empirical Methods in Natural Language Processing and Very Large Corpora","author":"Collins","year":"1999"},{"key":"2023012507532499800_B11","doi-asserted-by":"crossref","first-page":"11","DOI":"10.1186\/1471-2105-4-11","article-title":"PreBIND and Textomy - mining the biomedical literature for protein-protein interactions using a support vector machine","volume":"4","author":"Donaldson","year":"2003","journal-title":"BMC Bioinformatics"},{"key":"2023012507532499800_B12","first-page":"287","article-title":"Extracting interacting protein pairs and evidence sentences by using dependency parsing and machine learning techniques","volume-title":"Proceedings of the 2nd BioCreAtivE Challenge Evaluation Workshop","author":"Erkan","year":"2006"},{"issue":"Suppl. 
1","key":"2023012507532499800_B13","doi-asserted-by":"crossref","first-page":"s15","DOI":"10.1186\/1471-2105-6-S1-S15","article-title":"A simple approach for protein name identification: prospects and limits","volume":"6","author":"Fundel","year":"2005","journal-title":"BMC Bioinformatics"},{"key":"2023012507532499800_B14","doi-asserted-by":"crossref","first-page":"365","DOI":"10.1093\/bioinformatics\/btl616","article-title":"RelEx \u2013 relation extraction using dependency parse trees","volume":"23","author":"Fundel","year":"2007","journal-title":"Bioinformatics"},{"key":"2023012507532499800_B15","doi-asserted-by":"crossref","first-page":"141","DOI":"10.1038\/415141a","article-title":"Functional organization of the yeast proteome by systematic analysis of protein complexes","volume":"415","author":"Gavin","year":"2002","journal-title":"Nature"},{"key":"2023012507532499800_B16","doi-asserted-by":"crossref","first-page":"1727","DOI":"10.1126\/science.1090289","article-title":"A protein interaction map of Drosophila melanogaster","volume":"302","author":"Giot","year":"2003","journal-title":"Science"},{"key":"2023012507532499800_B17","first-page":"145","article-title":"The extraction of enriched protein-protein interactions from biomedical text","author":"Haddow","year":"2007","journal-title":"Proceedings of the BioNLP Workshop at ACL"},{"issue":"Suppl. 
1","key":"2023012507532499800_B18","doi-asserted-by":"crossref","first-page":"s9","DOI":"10.1186\/1471-2105-6-S1-S9","article-title":"Systematic feature evaluation for gene name recognition","volume":"6","author":"Hakenberg","year":"2005","journal-title":"BMC Bioinformatics"},{"key":"2023012507532499800_B19","doi-asserted-by":"crossref","first-page":"3294","DOI":"10.1093\/bioinformatics\/bti493","article-title":"Discovering patterns to extract protein-protein interactions from the literature: Part II","volume":"21","author":"Hao","year":"2005","journal-title":"Bioinformatics"},{"key":"2023012507532499800_B20","doi-asserted-by":"crossref","first-page":"180","DOI":"10.1038\/415180a","article-title":"Systematic identification of protein complexes in Saccharomyces cerevisiae by mass spectrometry","volume":"415","author":"Ho","year":"2002","journal-title":"Nature"},{"key":"2023012507532499800_B21","doi-asserted-by":"crossref","first-page":"664","DOI":"10.1038\/ng0704-664","article-title":"A gene network for navigating the literature","volume":"36","author":"Hoffmann","year":"2004","journal-title":"Nat. Genet."},{"key":"2023012507532499800_B22","first-page":"237","article-title":"Mining physical protein-protein interactions by exploiting abundant features","volume-title":"Proceedings of the 2nd BioCreAtivE Challenge Evaluation Workshop","author":"Huang","year":"2007"},{"key":"2023012507532499800_B23","doi-asserted-by":"crossref","first-page":"7092","DOI":"10.1128\/MCB.25.16.7092-7106.2005","article-title":"WW domains provide a platform for the assembly of multi-protein networks","volume":"25","author":"Ingham","year":"2005","journal-title":"Mol. 
Cell Biol."},{"key":"2023012507532499800_B24","doi-asserted-by":"crossref","first-page":"1143","DOI":"10.1073\/pnas.97.3.1143","article-title":"Toward a protein-protein interaction map of the budding yeast: a comprehensive system to examine two-hybrid interactions in all possible combinations between the yeast proteins","volume":"97","author":"Ito","year":"2000","journal-title":"Proc. Natl Acad. Sci. USA"},{"key":"2023012507532499800_B25","doi-asserted-by":"crossref","first-page":"e220","DOI":"10.1093\/bioinformatics\/btl203","article-title":"Finding the evidence for protein-protein interactions from PubMed abstracts","volume":"22","author":"Jang","year":"2006","journal-title":"Bioinformatics"},{"key":"2023012507532499800_B26","author":"Joachims","year":"2002","journal-title":"SVMlightSupport Vector Machine."},{"key":"2023012507532499800_B27","doi-asserted-by":"crossref","first-page":"168","DOI":"10.1038\/nature04177","article-title":"A quantitative protein interaction network for the ErbB receptors using protein microarrays","volume":"439","author":"Jones","year":"2006","journal-title":"Nature"},{"key":"2023012507532499800_B28","doi-asserted-by":"crossref","first-page":"d561","DOI":"10.1093\/nar\/gkl958","article-title":"IntAct \u2013 open source resource for molecular interaction data","volume":"35","author":"Kerrien","year":"2007","journal-title":"Nucleic Acids Res."},{"key":"2023012507532499800_B29","first-page":"41","article-title":"Assessment of the second BioCreative PPI task: automatic extraction of protein-protein interactions","volume-title":"Proceedings of the 2nd BioCreative Challenge Evaluation Workshop","author":"Krallinger","year":"2007"},{"issue":"Suppl. 
2","key":"2023012507532499800_B30","doi-asserted-by":"crossref","first-page":"S4","DOI":"10.1186\/gb-2008-9-s2-s4","article-title":"Overview of the protein-protein interaction annotation extraction task of BioCreative II","volume":"9","author":"Krallinger","year":"2008","journal-title":"Genome Biol."},{"issue":"Suppl. 2","key":"2023012507532499800_B31","doi-asserted-by":"crossref","first-page":"S6","DOI":"10.1186\/gb-2008-9-s2-s6","article-title":"Introducing meta-services for biomedical information extraction","volume":"9","author":"Leitner","year":"2008","journal-title":"Genome Biol."},{"key":"2023012507532499800_B32","doi-asserted-by":"crossref","first-page":"540","DOI":"10.1126\/science.1091403","article-title":"A map of the interactome network of the metazoan C. elegans","volume":"303","author":"Li","year":"2004","journal-title":"Science"},{"key":"2023012507532499800_B33","doi-asserted-by":"crossref","first-page":"482","DOI":"10.3115\/991886.991970","article-title":"Principar \u2013 an efficient, broad-coverage, principle-based parser","volume-title":"Proceedings of the 15th International Conference on Computational Linguistics","author":"Lin","year":"1994"},{"key":"2023012507532499800_B34","author":"LLL","year":"2005","journal-title":"Proceedings of the 4th Learning Language in Logic Workshop."},{"key":"2023012507532499800_B35","doi-asserted-by":"crossref","first-page":"31","DOI":"10.1093\/nar\/30.1.31","article-title":"MIPS: a database for genomes and protein sequences","volume":"30","author":"Mewes","year":"2002","journal-title":"Nucleic Acids Res."},{"key":"2023012507532499800_B36","doi-asserted-by":"crossref","first-page":"399","DOI":"10.1038\/nature750","article-title":"Comparative assessment of large-scale data sets of protein-protein 
interactions","volume":"417","author":"Mering","year":"2002","journal-title":"Nature"},{"key":"2023012507532499800_B37","doi-asserted-by":"crossref","first-page":"2464","DOI":"10.1093\/ietisy\/e89-d.8.2464","article-title":"Extracting protein-protein interaction information from biomedical text with SVM","volume":"E89-D","author":"Mitsumori","year":"2006","journal-title":"IEICE Trans. Inf. Syst."},{"key":"2023012507532499800_B38","first-page":"120","article-title":"Extracting protein-protein interactions using simple contextual features","volume-title":"Proceedings of the BioNLP Workshop at HLT\/NAACL","author":"Nielsen","year":"2006"},{"key":"2023012507532499800_B39","doi-asserted-by":"crossref","first-page":"352","DOI":"10.1007\/978-3-540-69858-6_42","article-title":"Detecting protein-protein interaction sentences using a mixture model","volume":"5039","author":"Niu","year":"2008","journal-title":"Proceedings of NLDB08, Lecture Notes in Computer Science"},{"key":"2023012507532499800_B40","article-title":"Confirming protein-protein interactions by text mining","volume-title":"Proceedings of SIAM Conference on Text Mining","author":"Otasek","year":"2006"},{"key":"2023012507532499800_B41","doi-asserted-by":"crossref","first-page":"2363","DOI":"10.1101\/gr.1680803","article-title":"Development of human protein reference database as an initial platform for approaching systems biology in humans","volume":"13","author":"Peri","year":"2003","journal-title":"Genome Res."},{"key":"2023012507532499800_B42","first-page":"195","article-title":"Optimizing syntax patterns for discovering protein-protein interactions","volume-title":"Proceedings of the ACM Symposium on Applied Computing","author":"Plake","year":"2005"},{"key":"2023012507532499800_B43","doi-asserted-by":"crossref","first-page":"e144","DOI":"10.1093\/nar\/gkn735","article-title":"Optimization of experimental design parameters for high-throughput chromatin immunoprecipitation 
studies","volume":"36","author":"Ponzielli","year":"2008","journal-title":"Nucleic Acids Res."},{"key":"2023012507532499800_B44","first-page":"46","article-title":"Using biomedical literature mining to consolidate the set of known human protein-protein interactions","volume-title":"ACL-ISMB Workshop on Linking Biological Literature, Ontologies and Databases: Mining Biological Semantics","author":"Ramani","year":"2005"},{"key":"2023012507532499800_B45","first-page":"409","article-title":"Investigating a generic paraphrase-based approach for relation extraction","volume-title":"Proceedings of the 11th Conference of the European Chapter of the Association for Computational Linguistics","author":"Romano","year":"2006"},{"key":"2023012507532499800_B46","doi-asserted-by":"crossref","first-page":"1173","DOI":"10.1038\/nature04209","article-title":"Towards a proteome-scale map of the human protein-protein interaction network","volume":"437","author":"Rual","year":"2005","journal-title":"Nature"},{"key":"2023012507532499800_B47","doi-asserted-by":"crossref","first-page":"957","DOI":"10.1016\/j.cell.2005.08.029","article-title":"A human protein-protein interaction network: A resource for annotating the proteome","volume":"122","author":"Stelzl","year":"2005","journal-title":"Cell"},{"key":"2023012507532499800_B48","doi-asserted-by":"crossref","first-page":"2046","DOI":"10.1093\/bioinformatics\/btg279","article-title":"Extraction of protein interaction information from unstructured text using a context-free grammar","volume":"19","author":"Temkin","year":"2003","journal-title":"Bioinformatics"},{"key":"2023012507532499800_B49","doi-asserted-by":"crossref","first-page":"461","DOI":"10.1016\/j.jbi.2004.08.003","article-title":"Improving the performance of dictionary-based approaches in protein name recognition","volume":"37","author":"Tsuruoka","year":"2004","journal-title":"J. Biomed. 
Inform."},{"key":"2023012507532499800_B50","doi-asserted-by":"crossref","first-page":"289","DOI":"10.1093\/nar\/28.1.289","article-title":"DIP: the database of interacting proteins","volume":"28","author":"Xenarios","year":"2000","journal-title":"Nucleic Acids Res."},{"key":"2023012507532499800_B51","first-page":"60","article-title":"Biomedical information extraction with predicate-argument structure patterns","volume-title":"Proceedings of the 1st International Symposium on Semantic Mining in Biomedicine","author":"Yakushiji","year":"2005"},{"key":"2023012507532499800_B52","doi-asserted-by":"crossref","first-page":"135","DOI":"10.1016\/S0014-5793(01)03293-8","article-title":"MINT: A Molecular INTeraction database","volume":"513","author":"Zanzoni","year":"2002","journal-title":"FEBS Lett."},{"key":"2023012507532499800_B53","doi-asserted-by":"crossref","first-page":"393","DOI":"10.1016\/j.jbi.2007.11.008","article-title":"Extracting interactions between proteins from the literature","volume":"41","author":"Zhou","year":"2008","journal-title":"J. Biomed. 
Inform."},{"key":"2023012507532499800_B54","first-page":"427","article-title":"Exploring various knowledge in relation extraction","volume-title":"Proceedings of the 43rd Annual Meeting of ACL","author":"Zhou","year":"2005"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/26\/1\/111\/48852286\/bioinformatics_26_1_111.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/26\/1\/111\/48852286\/bioinformatics_26_1_111.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,1,25]],"date-time":"2023-01-25T07:53:58Z","timestamp":1674633238000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/26\/1\/111\/181356"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2009,10,22]]},"references-count":54,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2010,1,1]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btp602","relation":{},"ISSN":["1367-4811","1367-4803"],"issn-type":[{"value":"1367-4811","type":"electronic"},{"value":"1367-4803","type":"print"}],"subject":[],"published-other":{"date-parts":[[2010,1,1]]},"published":{"date-parts":[[2009,10,22]]}}}