{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,29]],"date-time":"2026-03-29T12:06:35Z","timestamp":1774785995398,"version":"3.50.1"},"reference-count":49,"publisher":"Springer Science and Business Media LLC","issue":"1","content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["BMC Bioinformatics"],"published-print":{"date-parts":[[2008,12]]},"abstract":"<jats:title>Abstract<\/jats:title>\n          <jats:sec>\n            <jats:title>Background<\/jats:title>\n            <jats:p>The increasing amount of published literature in biomedicine represents an immense source of knowledge, which can only efficiently be accessed by a new generation of automated information extraction tools. Named entity recognition of well-defined objects, such as genes or proteins, has achieved a sufficient level of maturity such that it can form the basis for the next step: the extraction of relations that exist between the recognized entities. Whereas most early work focused on the mere detection of relations, the classification of the type of relation is also of great importance and this is the focus of this work. In this paper we describe an approach that extracts both the existence of a relation and its type. Our work is based on Conditional Random Fields, which have been applied with much success to the task of named entity recognition.<\/jats:p>\n          <\/jats:sec>\n          <jats:sec>\n            <jats:title>Results<\/jats:title>\n            <jats:p>We benchmark our approach on two different tasks. The first task is the identification of semantic relations between diseases and treatments. The available data set consists of manually annotated PubMed abstracts. The second task is the identification of relations between genes and diseases from a set of concise phrases, so-called GeneRIF (Gene Reference Into Function) phrases. In our experimental setting, we do not assume that the entities are given, as is often the case in previous relation extraction work. Rather the extraction of the entities is solved as a subproblem. Compared with other state-of-the-art approaches, we achieve very competitive results on both data sets. To demonstrate the scalability of our solution, we apply our approach to the complete human GeneRIF database. The resulting gene-disease network contains 34758 semantic associations between 4939 genes and 1745 diseases. The gene-disease network is publicly available as a machine-readable RDF graph.<\/jats:p>\n          <\/jats:sec>\n          <jats:sec>\n            <jats:title>Conclusion<\/jats:title>\n            <jats:p>We extend the framework of Conditional Random Fields towards the annotation of semantic relations from text and apply it to the biomedical domain. Our approach is based on a rich set of textual features and achieves a performance that is competitive to leading approaches. The model is quite general and can be extended to handle arbitrary biological entities and relation types. The resulting gene-disease network shows that the GeneRIF database provides a rich knowledge source for text mining. Current work is focused on improving the accuracy of detection of entities as well as entity boundaries, which will also greatly improve the relation extraction performance.<\/jats:p>\n          <\/jats:sec>","DOI":"10.1186\/1471-2105-9-207","type":"journal-article","created":{"date-parts":[[2008,4,24]],"date-time":"2008-04-24T06:13:32Z","timestamp":1209017612000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":150,"title":["Extraction of semantic biomedical relations from text using conditional random fields"],"prefix":"10.1186","volume":"9","author":[{"given":"Markus","family":"Bundschus","sequence":"first","affiliation":[]},{"given":"Mathaeus","family":"Dejori","sequence":"additional","affiliation":[]},{"given":"Martin","family":"Stetter","sequence":"additional","affiliation":[]},{"given":"Volker","family":"Tresp","sequence":"additional","affiliation":[]},{"given":"Hans-Peter","family":"Kriegel","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2008,4,23]]},"reference":[{"key":"2192_CR1","doi-asserted-by":"crossref","unstructured":"Feldman R, Regev Y, Hurvitz E, Finkelstein-Landau M: Mining the biomedical literature using semantic analysis and natural language processing techniques. Drug Discovery Today: BIOSILICO 2003., 1(2):","DOI":"10.1016\/S1478-5382(03)02330-8"},{"key":"2192_CR2","unstructured":"BioCreAtIvE II \u2013 Protein-Protein Interaction Task[http:\/\/biocreative.sourceforge.net\/biocreative_2_ppi.html]"},{"key":"2192_CR3","unstructured":"TREC Genomics Track[http:\/\/ir.ohsu.edu\/genomics\/]"},{"key":"2192_CR4","first-page":"554","volume-title":"AMIA Annu Symp Proc, National Library of Medicine, National Institutes of Health, Department of Health and Human Services, Bethesda, Maryland 20894, USA","author":"TC Rindflesch","year":"2003","unstructured":"Rindflesch TC, Libbus B, Hristovski D, Aronson AR, Kilicoglu H: Semantic relations asserting the etiology of genetic diseases. AMIA Annu Symp Proc, National Library of Medicine, National Institutes of Health, Department of Health and Human Services, Bethesda, Maryland 20894, USA 2003, 554\u2013558."},{"key":"2192_CR5","doi-asserted-by":"crossref","unstructured":"Chun HW, Tsuruoka Y, Kim JD, Shiba R, Nagata N, Hishiki T, Tsujii J: Automatic recognition of topic-classified relations between prostate cancer and genes using MEDLINE abstracts. BMC Bioinformatics 2006., 7(Suppl 3):","DOI":"10.1186\/1471-2105-7-S3-S4"},{"key":"2192_CR6","volume-title":"Proceedings of the Annual Meeting of Association of Computational Linguistics (ACL '04)","author":"B Rosario","year":"2004","unstructured":"Rosario B, Hearst M: Classifying Semantic Relations in Bioscience Texts. Proceedings of the Annual Meeting of Association of Computational Linguistics (ACL '04) 2004."},{"key":"2192_CR7","doi-asserted-by":"publisher","first-page":"748","DOI":"10.3115\/1220575.1220669","volume-title":"HLT '05: Proceedings of the conference on Human Language Technology and Empirical Methods in Natural Language Processing","author":"C Sutton","year":"2005","unstructured":"Sutton C, McCallum A: Composition of conditional random fields for transfer learning. In HLT '05: Proceedings of the conference on Human Language Technology and Empirical Methods in Natural Language Processing. Morristown, NJ, USA: Association for Computational Linguistics; 2005:748\u2013754."},{"key":"2192_CR8","volume-title":"Human Language Technology Conference\/North American chapter of the Association for Computational Linguistics Annual Meeting","author":"A Culotta","year":"2006","unstructured":"Culotta A, McCallum A, Betz J: Integrating Probabilistic Extraction Models and Data Mining to Discover Relations and Patterns in Text. Human Language Technology Conference\/North American chapter of the Association for Computational Linguistics Annual Meeting 2006."},{"key":"2192_CR9","first-page":"460","volume-title":"AMIA Annu Symp Proc, University of Missouri \u2013 Columbia, USA","author":"JA Mitchell","year":"2003","unstructured":"Mitchell JA, Aronson AR, Mork JG, Folk LC, Humphrey SM, Ward JM: Gene indexing: characterization and analysis of NLM's GeneRIFs. AMIA Annu Symp Proc, University of Missouri \u2013 Columbia, USA 2003, 460\u2013464."},{"key":"2192_CR10","volume-title":"Scientific American","author":"T Berners-Lee","year":"2001","unstructured":"Berners-Lee T, Hendler J, Lassila O: The Semantic Web. Scientific American 2001."},{"issue":"21","key":"2192_CR11","doi-asserted-by":"publisher","first-page":"8685","DOI":"10.1073\/pnas.0701361104","volume":"104","author":"KI Goh","year":"2007","unstructured":"Goh KI, Cusick ME, Valle D, Childs B, Vidal M, Barabasi AL: The human disease network. PNAS 2007, 104(21):8685\u20138690.","journal-title":"PNAS"},{"issue":"2","key":"2192_CR12","doi-asserted-by":"publisher","first-page":"155","DOI":"10.1093\/bioinformatics\/17.2.155","volume":"17","author":"T Ono","year":"2001","unstructured":"Ono T, Hishigaki H, Tanigami A, Takagi T: Automated extraction of information on protein-protein interactions from the biological literature. Bioinformatics 2001, 17(2):155\u2013161.","journal-title":"Bioinformatics"},{"key":"2192_CR13","doi-asserted-by":"crossref","unstructured":"Ramani AK, Bunescu RC, Mooney RJ, Marcotte EM: Consolidating the set of known human protein-protein interactions in preparation for large-scale mapping of the human interactome. Genome Biol 2005., 6(5):","DOI":"10.3115\/1641484.1641491"},{"key":"2192_CR14","volume-title":"Proceedings of the 19th Conference on Neural Information Processing Systems","author":"RC Bunescu","year":"2005","unstructured":"Bunescu RC, Mooney RJ: Subsequence Kernels for Relation Extraction. Proceedings of the 19th Conference on Neural Information Processing Systems 2005."},{"key":"2192_CR15","first-page":"60","volume-title":"Proc Int Conf Intell Syst Mol Biol","author":"C Blaschke","year":"1999","unstructured":"Blaschke C, Andrade MA, Ouzounis C, Valencia A: Automatic extraction of biological information from scientific text: protein-protein interactions. Proc Int Conf Intell Syst Mol Biol 1999, 60\u201367."},{"key":"2192_CR16","doi-asserted-by":"publisher","first-page":"43","DOI":"10.1016\/j.jbi.2003.10.001","volume":"37","author":"A Rzhetsky","year":"2004","unstructured":"Rzhetsky A, Iossifov I, Koike T, Krauthammer M, Kra P, Morris M, Yu H, Dubou\u00e9 PA, Weng W, Wilbur WJ, Hatzivassiloglou V, Friedman C: GeneWays: a system for extracting, analyzing, visualizing, and integrating molecular pathway data. J Biomed Inform 2004, 37: 43\u201353.","journal-title":"J Biomed Inform"},{"issue":"2","key":"2192_CR17","doi-asserted-by":"publisher","first-page":"139","DOI":"10.1016\/j.artmed.2004.07.016","volume":"33","author":"R Bunescu","year":"2005","unstructured":"Bunescu R, Ge R, Kate RJ, Marcotte EM, Mooney RJ, Ramani AK, Wong YW: Comparative experiments on learning information extractors for proteins and their interactions. Artificial Intelligence in Medicine 2005, 33(2):139\u2013155.","journal-title":"Artificial Intelligence in Medicine"},{"key":"2192_CR18","volume-title":"Human Language Technology Conference on Empirical Methods in Natural Language Processing","author":"B Rosario","year":"2005","unstructured":"Rosario B, Hearst A: Multi-way Relation Classification: Application to Protein-Protein Interaction. Human Language Technology Conference on Empirical Methods in Natural Language Processing 2005."},{"key":"2192_CR19","volume-title":"Proceedings of the Seventh International Conference on Intelligent Systems for Molecular Biology","author":"M Craven","year":"1999","unstructured":"Craven M, Kumlien J: Constructing Biological Knowledge Bases by Extracting Information from Text Sources. Proceedings of the Seventh International Conference on Intelligent Systems for Molecular Biology 1999."},{"key":"2192_CR20","first-page":"517","volume-title":"Proceedings of Pacific Symposium on Biocomputing","author":"TC Rindflesch","year":"2000","unstructured":"Rindflesch TC, Tanabe L, Weinstein JN, Hunter L: EDGAR: Extraction of Drugs, Genes And Relations from the Biomedical Literature. Proceedings of Pacific Symposium on Biocomputing 2000, 517\u2013528."},{"key":"2192_CR21","first-page":"4","volume-title":"Pac Symp Biocomput","author":"HW Chun","year":"2006","unstructured":"Chun HW, Tsuruoka Y, Kim JD, Shiba R, Nagata N, Hishiki T, Tsujii J: Extraction of gene-disease relations from Medline using domain dictionaries and machine learning. Pac Symp Biocomput 2006, 4\u201315."},{"issue":"3","key":"2192_CR22","doi-asserted-by":"crossref","first-page":"321","DOI":"10.1055\/s-0038-1634069","volume":"45","author":"YT Yen","year":"2006","unstructured":"Yen YT, Chen B, Chiu HW, Lee YC, Li YC, Hsu CY: Developing an NLP and IR-based Algorithm for Analyzing Gene-disease Relationships. Methods Inf Med 2006, 45(3):321\u2013329.","journal-title":"Methods Inf Med"},{"issue":"8","key":"2192_CR23","doi-asserted-by":"publisher","first-page":"1124","DOI":"10.1093\/bioinformatics\/18.8.1124","volume":"18","author":"L Tanabe","year":"2002","unstructured":"Tanabe L, Wilbur WJ: Tagging gene and protein names in biomedical text. Bioinformatics 2002, 18(8):1124\u20131132.","journal-title":"Bioinformatics"},{"key":"2192_CR24","first-page":"17","volume-title":"Proc AMIA Symp","author":"AR Aronson","year":"2001","unstructured":"Aronson AR: Effective mapping of biomedical text to the UMLS Metathesaurus: the MetaMap program. Proc AMIA Symp 2001, 17\u201321."},{"key":"2192_CR25","doi-asserted-by":"crossref","unstructured":"Bodenreider O: The Unified Medical Language System (UMLS): integrating biomedical terminology. Nucleic Acids Res 2004., (32 Database):","DOI":"10.1093\/nar\/gkh061"},{"key":"2192_CR26","doi-asserted-by":"publisher","first-page":"291+","DOI":"10.1186\/1471-2105-7-291","volume":"7","author":"M Masseroli","year":"2006","unstructured":"Masseroli M, Kilicoglu H, Lang FM, Rindflesch TC: Argument-predicate distance as a filter for enhancing precision in extracting predications on the genetic etiology of disease. BMC Bioinformatics 2006, 7: 291+.","journal-title":"BMC Bioinformatics"},{"key":"2192_CR27","unstructured":"BioText: Data Collections[http:\/\/biotext.berkeley.edu\/data.html]"},{"key":"2192_CR28","doi-asserted-by":"crossref","unstructured":"Maglott D, Ostell J, Pruitt KD, Tatusova T: Entrez Gene: gene-centered information at NCBI. Nucleic Acids Res 2005., (33 Database):","DOI":"10.1093\/nar\/gki031"},{"key":"2192_CR29","unstructured":"MUC Scoring Software User's Manual[http:\/\/www.itl.nist.gov\/iaui\/894.02\/related_projects\/muc\/muc_sw\/muc_sw_manual.html]"},{"key":"2192_CR30","unstructured":"BioNLP\/NLPBA 2004 Shared Task Report[http:\/\/www-tsujii.is.s.u-tokyo.ac.jp\/GENIA\/ERtask\/report.html]"},{"key":"2192_CR31","volume-title":"BMC Bioinformatics","author":"THH Tsai","year":"2006","unstructured":"Tsai THH, Wu SHH, Chou WCC, Lin YCC, He D, Hsiang J, Sung TYY, Hsu WLL: Various criteria in the evaluation of biomedical named entity recognition. BMC Bioinformatics 2006., 7:"},{"key":"2192_CR32","volume-title":"Combining SVMs with various feature selection strategies, Springer","author":"YW Chen","year":"2006","unstructured":"Chen YW, Lin CJ: Combining SVMs with various feature selection strategies, Springer. 2006."},{"key":"2192_CR33","volume-title":"LIBSVM: a library for support vector machines","author":"CC Chang","year":"2001","unstructured":"Chang CC, Lin CJ: LIBSVM: a library for support vector machines. 2001."},{"issue":"2","key":"2192_CR34","doi-asserted-by":"publisher","first-page":"101","DOI":"10.1038\/nrg1272","volume":"5","author":"AL Barab\u00e1si","year":"2004","unstructured":"Barab\u00e1si AL, Oltvai ZN: Network biology: understanding the cell's functional organization. Nat Rev Genet 2004, 5(2):101\u2013113.","journal-title":"Nat Rev Genet"},{"key":"2192_CR35","doi-asserted-by":"publisher","first-page":"15","DOI":"10.1002\/bult.105","volume":"25","author":"E Miller","year":"1998","unstructured":"Miller E: An Introduction to the Resource Description Framework. Bulletin of the American Society for Information Science and Technology 1998, 25: 15\u201319.","journal-title":"Bulletin of the American Society for Information Science and Technology"},{"key":"2192_CR36","volume-title":"16th International World Wide Web Conference","author":"F Belleau","year":"2007","unstructured":"Belleau F, Nolin MA, Tourigny N, Rigault P, Morissette J: Bio2RDF: Towards a Mashup to Build Bioinformatics Knowledge System. 16th International World Wide Web Conference 2007."},{"issue":"Suppl 1","key":"2192_CR37","doi-asserted-by":"publisher","first-page":"S6","DOI":"10.1186\/1471-2105-6-S1-S6","volume":"6","author":"R McDonald","year":"2005","unstructured":"McDonald R, Pereira F: Identifying gene and protein mentions in text using conditional random fields. BMC Bioinformatics 2005, 6(Suppl 1):S6.","journal-title":"BMC Bioinformatics"},{"issue":"3","key":"2192_CR38","doi-asserted-by":"publisher","first-page":"89","DOI":"10.1067\/mcp.2001.113989","volume":"69","author":"AJ Atkinson","year":"2001","unstructured":"Atkinson AJ, Colburn WA, Degruttola VG, Demets DL, Downing GJ, Hoth DF, Oates JA, Peck CC, Schooley RT, Spilker BA, Woodcock J, Zeger SL: Biomarkers and surrogate endpoints: Preferred definitions and conceptual framework*. Clin Pharmacol Ther 2001, 69(3):89\u201395.","journal-title":"Clin Pharmacol Ther"},{"issue":"8","key":"2192_CR39","doi-asserted-by":"publisher","first-page":"656","DOI":"10.1093\/bioinformatics\/14.8.656","volume":"14","author":"M Rebhan","year":"1998","unstructured":"Rebhan M, Chalifa-Caspi V, Prilusky J, Lancet D: GeneCards: a novel functional genomics compendium with automated data mining and query reformulation support. Bioinformatics (Oxford, England) 1998, 14(8):656\u2013664.","journal-title":"Bioinformatics (Oxford, England)"},{"key":"2192_CR40","unstructured":"GeneRIF Statistics[http:\/\/www.ncbi.nlm.nih.gov\/projects\/GeneRIF\/stats\/]"},{"key":"2192_CR41","volume-title":"BMC Bioinformatics","author":"R Rubinstein","year":"2005","unstructured":"Rubinstein R, Simon I: MILANO \u2013 custom annotation of microarray results using automatic literature searches. BMC Bioinformatics 2005., 6:"},{"key":"2192_CR42","first-page":"269","volume-title":"Pac Symp Biocomput","author":"Z Lu","year":"2007","unstructured":"Lu Z, Cohen KB, Hunter L: GeneRIF quality assurance as summary revision. Pac Symp Biocomput 2007, 269\u2013280."},{"key":"2192_CR43","doi-asserted-by":"publisher","first-page":"15","DOI":"10.1007\/3-540-70659-3_2","volume-title":"Proceedings of the Joint IAPR International Workshop on Structural, Syntactic, and Statistical Pattern Recognition","author":"TG Dietterich","year":"2002","unstructured":"Dietterich TG: Machine Learning for Sequential Data: A Review. In Proceedings of the Joint IAPR International Workshop on Structural, Syntactic, and Statistical Pattern Recognition. London, UK: Springer-Verlag; 2002:15\u201330."},{"key":"2192_CR44","first-page":"282","volume-title":"Proc. 18th International Conf. on Machine Learning","author":"J Lafferty","year":"2001","unstructured":"Lafferty J, McCallum A, Pereira F: Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data. In Proc. 18th International Conf. on Machine Learning. Morgan Kaufmann, San Francisco, CA; 2001:282\u2013289."},{"key":"2192_CR45","volume-title":"MALLET: A Machine Learning for Language Toolkit","author":"AK McCallum","year":"2002","unstructured":"McCallum AK: MALLET: A Machine Learning for Language Toolkit. 2002."},{"key":"2192_CR46","volume-title":"Proc. 18th International Conf. on Machine Learning","author":"C Sutton","year":"2006","unstructured":"Sutton C, McCallum A: An Introduction to Conditional Random Fields for Relational Learning. Proc. 18th International Conf. on Machine Learning 2006."},{"key":"2192_CR47","volume-title":"Proceedings of the International Joint Workshop on Natural Language Processing in Biomedicine and its Applications (NLPBA)","author":"B Settles","year":"2004","unstructured":"Settles B: Biomedical Named Entity Recognition Using Conditional Random Fields. Proceedings of the International Joint Workshop on Natural Language Processing in Biomedicine and its Applications (NLPBA) 2004."},{"key":"2192_CR48","doi-asserted-by":"publisher","first-page":"419","DOI":"10.3115\/1219840.1219892","volume-title":"Proceedings of the 43rd Annual Meeting of the Association for Computational Linguistics (ACL'05)","author":"S Zhao","year":"2005","unstructured":"Zhao S, Grishman R: Extracting Relations with Integrated Information Using Kernel Methods. In Proceedings of the 43rd Annual Meeting of the Association for Computational Linguistics (ACL'05). Ann Arbor, Michigan: Association for Computational Linguistics; 2005:419\u2013426."},{"key":"2192_CR49","volume-title":"Proceedings of CoNLL-2003","author":"A McCallum","year":"2003","unstructured":"McCallum A, Li W: Early results for named entity recognition with conditional random fields. Proceedings of CoNLL-2003 2003."}],"container-title":["BMC Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/1471-2105-9-207.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2021,9,1]],"date-time":"2021-09-01T11:01:51Z","timestamp":1630494111000},"score":1,"resource":{"primary":{"URL":"https:\/\/bmcbioinformatics.biomedcentral.com\/articles\/10.1186\/1471-2105-9-207"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2008,4,23]]},"references-count":49,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2008,12]]}},"alternative-id":["2192"],"URL":"https:\/\/doi.org\/10.1186\/1471-2105-9-207","relation":{},"ISSN":["1471-2105"],"issn-type":[{"value":"1471-2105","type":"electronic"}],"subject":[],"published":{"date-parts":[[2008,4,23]]},"assertion":[{"value":"2 October 2007","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"23 April 2008","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"23 April 2008","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}}],"article-number":"207"}}