{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,11,6]],"date-time":"2025-11-06T05:54:06Z","timestamp":1762408446662},"reference-count":30,"publisher":"Oxford University Press (OUP)","issue":"17","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2006,9,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>Motivation: Recently, several information extraction systems have been developed to retrieve relevant information out of biomedical text. However, these methods represent individual efforts. In this paper, we show that by combining different algorithms and their outcome, the results improve significantly. For this reason, CONAN has been created, a system which combines different programs and their outcome. Its methods include tagging of gene\/protein names, finding interaction and mutation data, tagging of biological concepts and linking to MeSH and Gene Ontology terms.<\/jats:p>\n               <jats:p>Results: In this paper, we will present data that show that combining different text-mining algorithms significantly improves the results. Not only is CONAN a full-scale approach that will ultimately cover all of PubMed\/MEDLINE, we also show that this universality has no effect on quality: our system performs as well as or better than existing systems.<\/jats:p>\n               <jats:p>Availability: The LDD corpus presented is available by request to the author. The system will be available shortly. 
For information and updates on CONAN please visit<\/jats:p>\n               <jats:p>Contact: rainer@cs.uu.nl<\/jats:p>\n               <jats:p>Supplementary information: Supplementary data are available at Bioinformatics online.<\/jats:p>","DOI":"10.1093\/bioinformatics\/btl281","type":"journal-article","created":{"date-parts":[[2006,6,10]],"date-time":"2006-06-10T00:49:11Z","timestamp":1149900551000},"page":"2151-2157","source":"Crossref","is-referenced-by-count":16,"title":["Combination of text-mining algorithms increases the performance"],"prefix":"10.1093","volume":"22","author":[{"given":"Rainer","family":"Malik","sequence":"first","affiliation":[{"name":"Universiteit Utrecht, Department of Information and Computing Sciences, Padualaan 14, 3584CH Utrecht, The Netherlands"}]},{"given":"Lude","family":"Franke","sequence":"additional","affiliation":[{"name":"Complex Genetics Section, Department of Medical Genetics, UMC Utrecht, The Netherlands"}]},{"given":"Arno","family":"Siebes","sequence":"additional","affiliation":[{"name":"Universiteit Utrecht, Department of Information and Computing Sciences, Padualaan 14, 3584CH Utrecht, The Netherlands"}]}],"member":"286","published-online":{"date-parts":[[2006,6,9]]},"reference":[{"key":"2023012409141810700_b1","doi-asserted-by":"crossref","first-page":"403","DOI":"10.1016\/S0022-2836(05)80360-2","article-title":"Basic local alignment search tool","volume":"215","author":"Altschul","year":"1990","journal-title":"J. Mol. 
Biol."},{"key":"2023012409141810700_b2","doi-asserted-by":"crossref","first-page":"D115","DOI":"10.1093\/nar\/gkh131","article-title":"UniProt: the Universal Protein knowledgebase","volume":"32","author":"Apweiler","year":"2004","journal-title":"Nucleic Acids Res."},{"key":"2023012409141810700_b3","doi-asserted-by":"crossref","first-page":"25","DOI":"10.1038\/75556","article-title":"Gene ontology: tool for the unification of biology. The Gene Ontology Consortium","volume":"25","author":"Ashburner","year":"2000","journal-title":"Nat. Genet."},{"key":"2023012409141810700_b4","doi-asserted-by":"crossref","first-page":"248","DOI":"10.1093\/nar\/gkg056","article-title":"BIND: the biomolecular interaction network database","volume":"31","author":"Bader","year":"2003","journal-title":"Nucleic Acids Res."},{"key":"2023012409141810700_b5","doi-asserted-by":"crossref","first-page":"925","DOI":"10.1101\/gr.1860604","article-title":"An Overview of Ensembl","volume":"14","author":"Birney","year":"2004","journal-title":"Genome Res."},{"key":"2023012409141810700_b6","doi-asserted-by":"crossref","first-page":"267","DOI":"10.1093\/nar\/gkh061","article-title":"The Unified Medical Language System (UMLS): integrating biomedical terminology","volume":"32","author":"Bodenreider","year":"2004","journal-title":"Nucleic Acids Res."},{"key":"2023012409141810700_b7","doi-asserted-by":"crossref","first-page":"365","DOI":"10.1093\/nar\/gkg095","article-title":"The Swiss-Prot protein knowledgebase and its supplement TrEMBL in 2003","volume":"31","author":"Boeckmann","year":"2003","journal-title":"Nucleic Acids Res."},{"key":"2023012409141810700_b8","doi-asserted-by":"crossref","first-page":"262","DOI":"10.1093\/nar\/gkh021","article-title":"The Gene Ontology Annotation (GOA) Database: sharing knowledge in Uniprot with Gene Ontology","volume":"32","author":"Camon","year":"2004","journal-title":"Nucleic Acids 
Res."},{"key":"2023012409141810700_b9","doi-asserted-by":"crossref","first-page":"216","DOI":"10.1093\/bioinformatics\/btg393","article-title":"GAPSCORE: finding gene and protein names one word at a time","volume":"20","author":"Chang","year":"2004","journal-title":"Bioinformatics"},{"key":"2023012409141810700_b10","doi-asserted-by":"crossref","first-page":"11","DOI":"10.1186\/1471-2105-4-11","article-title":"PreBIND and Textomy\u2013mining the biomedical literature for protein\u2013protein interactions using a support vector machine","volume":"4","author":"Donaldson","year":"2003","journal-title":"BMC Bioinformatics"},{"key":"2023012409141810700_b11","doi-asserted-by":"crossref","DOI":"10.1016\/S1386-5056(02)00052-7","article-title":"Protein names and how to find them","volume":"67","author":"Franzen","year":"2002","journal-title":"Int. J. Med. Inf."},{"key":"2023012409141810700_b12","article-title":"Lll'05 challenge: genic interaction extraction with alignments and finite state automata","author":"Hakenberg","year":"2005"},{"key":"2023012409141810700_b13","doi-asserted-by":"crossref","first-page":"664","DOI":"10.1038\/ng0704-664","article-title":"A gene network for navigating the literature","volume":"36","author":"Hoffmann","year":"2004","journal-title":"Nat. 
Genet."},{"key":"2023012409141810700_b14","doi-asserted-by":"crossref","first-page":"ii252","DOI":"10.1093\/bioinformatics\/bti1142","article-title":"Implementing the iHOP concept for navigation of biomedical literature","volume":"21","author":"Hoffmann","year":"2005","journal-title":"Bioinformatics"},{"key":"2023012409141810700_b15","doi-asserted-by":"crossref","first-page":"557","DOI":"10.1093\/bioinformatics\/btg449","article-title":"Automated extraction of mutation data from the literature: application of MuteXt to G protein-coupled receptors and nuclear hormone receptors","volume":"20","author":"Horn","year":"2004","journal-title":"Bioinformatics"},{"issue":"5-6","key":"2023012409141810700_b16","doi-asserted-by":"crossref","first-page":"409","DOI":"10.1016\/j.compbiolchem.2004.09.010","article-title":"iProLINK: an integrated protein resource for literature mining","volume":"28","author":"Hu","year":"2004","journal-title":"Comput. Biol. Chem."},{"key":"2023012409141810700_b17","article-title":"Learning biological interactions from medline abstracts","author":"Katrenko","year":"2005"},{"key":"2023012409141810700_b18","doi-asserted-by":"crossref","first-page":"224","DOI":"10.1186\/gb-2005-6-7-224","article-title":"Text-mining and information-retrieval services for molecular biology","volume":"6","author":"Krallinger","year":"2005","journal-title":"Genome Biol."},{"key":"2023012409141810700_b19","doi-asserted-by":"crossref","DOI":"10.1016\/S0378-1119(00)00431-5","article-title":"Using BLAST for identifying gene and protein names in journal articles","volume":"259","author":"Krauthammer","year":"2000","journal-title":"Gene"},{"key":"2023012409141810700_b20","doi-asserted-by":"crossref","first-page":"125","DOI":"10.1093\/bioinformatics\/16.2.125","article-title":"SAWTED: structure assignment with text description\u2013enhanced detection of remote homologues with automated Swiss-Prot annotation 
comparisons","volume":"16","author":"MacCallum","year":"2000","journal-title":"Bioinformatics"},{"key":"2023012409141810700_b21","first-page":"248","article-title":"Conan: An integrative system for biomedical literature mining","volume-title":"LNAI 3808, EPIA05","author":"Malik","year":"2005"},{"key":"2023012409141810700_b22","doi-asserted-by":"crossref","DOI":"10.1093\/nar\/gkh427","article-title":"NLProt: extracting protein names and sequences from papers","volume":"32","author":"Mika","year":"2004","journal-title":"Nucleic Acids Res."},{"key":"2023012409141810700_b23","doi-asserted-by":"crossref","first-page":"I241","DOI":"10.1093\/bioinformatics\/bth904","article-title":"Protein names precisely peeled off free text","volume":"20","author":"Mika","year":"2004","journal-title":"Bioinformatics"},{"key":"2023012409141810700_b24","doi-asserted-by":"crossref","first-page":"e309","DOI":"10.1371\/journal.pbio.0020309","article-title":"Textpresso: an ontology-based information retrieval and extraction system for biological literature","volume":"2","author":"Muller","year":"2004","journal-title":"PLoS Biol."},{"key":"2023012409141810700_b25","article-title":"Learning language in logic\u2014genic interaction extraction challenge","author":"Nedellec","year":"2005"},{"key":"2023012409141810700_b26","doi-asserted-by":"crossref","first-page":"e65","DOI":"10.1371\/journal.pbio.0030065","article-title":"Facts from text\u2014is text mining ready to deliver?","volume":"3","author":"Rebholz-Schuhmann","year":"2005","journal-title":"PLoS Biol."},{"key":"2023012409141810700_b27","article-title":"The boosting approach to machine learning: an overview","author":"Schapire","year":"2002"},{"key":"2023012409141810700_b28","doi-asserted-by":"crossref","first-page":"153","DOI":"10.1023\/A:1007649029923","article-title":"BoosTexter: a boosting-based system for text categorization","volume":"39","author":"Schapire","year":"2000","journal-title":"Mach. 
Learn."},{"key":"2023012409141810700_b29","doi-asserted-by":"crossref","first-page":"1124","DOI":"10.1093\/bioinformatics\/18.8.1124","article-title":"Tagging gene and protein names in biomedical text","volume":"18","author":"Tanabe","year":"2002","journal-title":"Bioinformatics"},{"key":"2023012409141810700_b30","doi-asserted-by":"crossref","first-page":"303","DOI":"10.1093\/nar\/30.1.303","article-title":"DIP, the Database of Interacting Proteins: a research tool for studying cellular networks of protein interactions","volume":"30","author":"Xenarios","year":"2002","journal-title":"Nucleic Acids Res."}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/22\/17\/2151\/48840622\/bioinformatics_22_17_2151.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/22\/17\/2151\/48840622\/bioinformatics_22_17_2151.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,1,24]],"date-time":"2023-01-24T09:53:57Z","timestamp":1674554037000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/22\/17\/2151\/273326"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2006,6,9]]},"references-count":30,"journal-issue":{"issue":"17","published-print":{"date-parts":[[2006,9,1]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btl281","relation":{},"ISSN":["1367-4811","1367-4803"],"issn-type":[{"value":"1367-4811","type":"electronic"},{"value":"1367-4803","type":"print"}],"subject":[],"published-other":{"date-parts":[[2006,9,1]]},"published":{"date-parts":[[2006,6,9]]}}}