{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,26]],"date-time":"2025-10-26T22:47:46Z","timestamp":1761518866179},"reference-count":39,"publisher":"Springer Science and Business Media LLC","issue":"1","content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["BMC Bioinformatics"],"published-print":{"date-parts":[[2007,12]]},"abstract":"<jats:title>Abstract<\/jats:title>\n          <jats:sec>\n            <jats:title>Background<\/jats:title>\n            <jats:p>Bioinformatics tools for automatic processing of biomedical literature are invaluable for both the design and interpretation of large-scale experiments. Many information extraction (IE) systems that incorporate natural language processing (NLP) techniques have thus been developed for use in the biomedical field. A key IE task in this field is the extraction of biomedical relations, such as protein-protein and gene-disease interactions. However, most biomedical relation extraction systems usually ignore adverbial and prepositional phrases and words identifying location, manner, timing, and condition, which are essential for describing biomedical relations. Semantic role labeling (SRL) is a natural language processing technique that identifies the semantic roles of these words or phrases in sentences and expresses them as predicate-argument structures. We construct a biomedical SRL system called BIOSMILE that uses a maximum entropy (ME) machine-learning model to extract biomedical relations. BIOSMILE is trained on BioProp, our semi-automatic, annotated biomedical proposition bank. 
Currently, we are focusing on 30 biomedical verbs that are frequently used or considered important for describing molecular events.<\/jats:p>\n          <\/jats:sec>\n          <jats:sec>\n            <jats:title>Results<\/jats:title>\n            <jats:p>To evaluate the performance of BIOSMILE, we conducted two experiments to (1) compare the performance of SRL systems trained on newswire and biomedical corpora; and (2) examine the effects of using biomedical-specific features. The experimental results show that using BioProp improves the F-score of the SRL system by 21.45% over an SRL system that uses a newswire corpus. It is noteworthy that adding automatically generated template features improves the overall F-score by a further 0.52%. Specifically, ArgM-LOC, ArgM-MNR, and Arg2 achieve statistically significant performance improvements of 3.33%, 2.27%, and 1.44%, respectively.<\/jats:p>\n          <\/jats:sec>\n          <jats:sec>\n            <jats:title>Conclusion<\/jats:title>\n            <jats:p>We demonstrate the necessity of using a biomedical proposition bank for training SRL systems in the biomedical domain. Besides the different characteristics of biomedical and newswire sentences, factors such as cross-domain framesets and verb usage variations also influence the performance of SRL systems. For argument classification, we find that NE (named entity) features indicating if the target node matches with NEs are not effective, since NEs may match with a node of the parsing tree that does not have semantic role labels in the training set. We therefore incorporate templates composed of specific words, NE types, and POS tags into the SRL system. 
As a result, the classification accuracy for adjunct arguments, which is especially important for biomedical SRL, is improved significantly.<\/jats:p>\n          <\/jats:sec>","DOI":"10.1186\/1471-2105-8-325","type":"journal-article","created":{"date-parts":[[2007,9,1]],"date-time":"2007-09-01T18:13:24Z","timestamp":1188670404000},"update-policy":"http:\/\/dx.doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":40,"title":["BIOSMILE: A semantic role labeling system for biomedical verbs using a maximum-entropy model with automatically generated template features"],"prefix":"10.1186","volume":"8","author":[{"given":"Richard Tzong-Han","family":"Tsai","sequence":"first","affiliation":[]},{"given":"Wen-Chi","family":"Chou","sequence":"additional","affiliation":[]},{"given":"Ying-Shan","family":"Su","sequence":"additional","affiliation":[]},{"given":"Yu-Chun","family":"Lin","sequence":"additional","affiliation":[]},{"given":"Cheng-Lung","family":"Sung","sequence":"additional","affiliation":[]},{"given":"Hong-Jie","family":"Dai","sequence":"additional","affiliation":[]},{"given":"Irene Tzu-Hsuan","family":"Yeh","sequence":"additional","affiliation":[]},{"given":"Wei","family":"Ku","sequence":"additional","affiliation":[]},{"given":"Ting-Yi","family":"Sung","sequence":"additional","affiliation":[]},{"given":"Wen-Lian","family":"Hsu","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2007,9,1]]},"reference":[{"key":"1697_CR1","volume-title":"Foundations of Statistical Natural Language Processing","author":"CD Manning","year":"1999","unstructured":"Manning CD, Sch\u00fctze H: Foundations of Statistical Natural Language Processing. 
1999, Cambridge, MA , MIT Press"},{"issue":"5701","key":"1697_CR2","doi-asserted-by":"publisher","first-page":"1555","DOI":"10.1126\/science.1099511","volume":"306","author":"I Lee","year":"2004","unstructured":"Lee I, Date SV, Adai AT, Marcotte EM: A probabilistic functional network of yeast genes. Science. 2004, 306 (5701): 1555-1558. 10.1126\/science.1099511.","journal-title":"Science"},{"issue":"5","key":"1697_CR3","doi-asserted-by":"publisher","first-page":"r40","DOI":"10.1186\/gb-2005-6-5-r40","volume":"6","author":"AK Ramani","year":"2005","unstructured":"Ramani AK, Bunescu RC, Mooney RJ, Marcotte EM: Consolidating the set of know human proteinprotein interactions in preparation for large-scale mapping of the human interactome. Genome Biology. 2005, 6 (5): r40-10.1186\/gb-2005-6-5-r40.","journal-title":"Genome Biology"},{"key":"1697_CR4","doi-asserted-by":"crossref","unstructured":"Wren JD: Extending the mutual information measure to rank inferred literature relationships. BMC Bioinformatics. 2004, 5 (145):","DOI":"10.1186\/1471-2105-5-145"},{"key":"1697_CR5","doi-asserted-by":"crossref","unstructured":"Chen H, Sharp BM: Content-rich biological network constructed by mining PubMed abstracts. BMC Bioinformatics. 2004, 5 (147):","DOI":"10.1186\/1471-2105-5-147"},{"key":"1697_CR6","doi-asserted-by":"crossref","unstructured":"Donaldson I, Martin J, Bruijn B, Wolting C, Lay V, Tuekam B, Zhang S, Baskin B, Bader GD, Michalickova K, Pawson T, Hogue CWV: PreBIND and Textomy - mining the biomedical literature for protein-protein interactions using a support vector machine. BMC Bioinformatics. 2003, 4 (11):","DOI":"10.1186\/1471-2105-4-11"},{"key":"1697_CR7","volume-title":"Representing sentence structure in hidden Markov models for information extraction","author":"S Ray","year":"2001","unstructured":"Ray S, Craven M: Representing sentence structure in hidden Markov models for information extraction. 
2001"},{"key":"1697_CR8","doi-asserted-by":"publisher","DOI":"10.3115\/1219840.1219892","volume-title":"Extracting relations with integrated information using kernel methods:","author":"S Zhao","year":"2005","unstructured":"Zhao S, Grishman R: Extracting relations with integrated information using kernel methods: Arbor, Michigan. 2005"},{"key":"1697_CR9","volume-title":"Subsequence kernels for relation extraction:","author":"RC Bunescu","year":"2005","unstructured":"Bunescu RC, Mooney RJ: Subsequence kernels for relation extraction: Vancouver, BC. 2005"},{"key":"1697_CR10","volume-title":"Shallow Semantic Parsing using Support Vector Machines","author":"K Hacioglu","year":"2003","unstructured":"Hacioglu K, Pradhan S, Ward W, Martin JH, Jurafsky D: Shallow Semantic Parsing using Support Vector Machines. 2003"},{"key":"1697_CR11","doi-asserted-by":"publisher","first-page":"113","DOI":"10.1016\/S0166-4115(08)62655-2","volume-title":"The cognitive psychology of knowledge","author":"R H\u00f6rnig","year":"1993","unstructured":"H\u00f6rnig R, Rauh R, Strube G: EVENTS-II: Modeling event recognition. The cognitive psychology of knowledge. Edited by: Strube G, Wender KF. 1993, Amsterdam, Elsevier Science Publishers, 113-138."},{"key":"1697_CR12","volume-title":"The Necessity of Parsing for Predicate Argument Recognition","author":"D Gildea","year":"2002","unstructured":"Gildea D, Palmer M: The Necessity of Parsing for Predicate Argument Recognition. 2002"},{"key":"1697_CR13","doi-asserted-by":"crossref","unstructured":"Gildea D, Jurafsky D: Automatic labeling of semantic roles. Computational Linguistics. 2002, 28 (3):","DOI":"10.1162\/089120102760275983"},{"key":"1697_CR14","volume-title":"Calibrating Features for Semantic Role Labeling","author":"N Xue","year":"2004","unstructured":"Xue N, Palmer M: Calibrating Features for Semantic Role Labeling. 
2004"},{"key":"1697_CR15","doi-asserted-by":"publisher","DOI":"10.3115\/1220355.1220552","volume-title":"Semantic Role Labeling via Integer Linear Programming Inference","author":"V Punyakanok","year":"2004","unstructured":"Punyakanok V, Roth D, Yih W, Zimak D: Semantic Role Labeling via Integer Linear Programming Inference. 2004"},{"key":"1697_CR16","volume-title":"Shallow Semantics for Relation Extraction","author":"P Morarescu","year":"2005","unstructured":"Morarescu P, Bejan C, Harabagiu S: Shallow Semantics for Relation Extraction. 2005"},{"key":"1697_CR17","volume-title":"Journal of Machine Learning","author":"S Pradhan","year":"2004","unstructured":"Pradhan S, Hacioglu K, Kruglery V, Ward W, Martin JH, Jurafsky D: Support Vector Learning for Semantic Argument Classification. Journal of Machine Learning. 2004"},{"key":"1697_CR18","volume-title":"The Necessity of Syntactic Parsing for Semantic Role Labeling","author":"V Punyakanok","year":"2005","unstructured":"Punyakanok V, Roth D, Yih W: The Necessity of Syntactic Parsing for Semantic Role Labeling. 2005"},{"key":"1697_CR19","doi-asserted-by":"publisher","DOI":"10.3115\/1706543.1706589","volume-title":"Exploiting Full Parsing Information to Label Semantic Roles Using an Ensemble of ME and SVM via Integer Linear Programming.","author":"TH Tsai","year":"2005","unstructured":"Tsai TH, Wu CW, Lin YC, Hsu WL: Exploiting Full Parsing Information to Label Semantic Roles Using an Ensemble of ME and SVM via Integer Linear Programming. 2005"},{"key":"1697_CR20","doi-asserted-by":"crossref","unstructured":"Palmer M, Gildea D, Kingsbury P: The Proposition Bank: An Annotated Corpus of Semantic Roles. Computational Linguistics. 
2005, 31 (1):","DOI":"10.1162\/0891201053630264"},{"key":"1697_CR21","doi-asserted-by":"publisher","DOI":"10.3115\/1641991.1641993","volume-title":"A Semi-Automatic Method for Annotating a Biomedical Proposition Bank","author":"WC Chou","year":"2006","unstructured":"Chou WC, Tsai RTH, Su YS, Ku W, Sung TY, Hsu WL: A Semi-Automatic Method for Annotating a Biomedical Proposition Bank. 2006"},{"key":"1697_CR22","doi-asserted-by":"publisher","first-page":"i180","DOI":"10.1093\/bioinformatics\/btg1023","volume":"19 Suppl 1","author":"JD Kim","year":"2003","unstructured":"Kim JD, Ohta T, Tateisi Y, Tsujii J: GENIA corpus--semantically annotated corpus for bio-textmining. Bioinformatics. 2003, 19 Suppl 1: i180-2. 10.1093\/bioinformatics\/btg1023.","journal-title":"Bioinformatics"},{"key":"1697_CR23","volume-title":"Introduction to the Bio-Entity Task at JNLPBA","author":"JD Kim","year":"2004","unstructured":"Kim JD, Ohta T, Tsuruoka Y, Tateisi Y, Collier N: Introduction to the Bio-Entity Task at JNLPBA. 2004"},{"key":"1697_CR24","first-page":"pp. 222","volume-title":"Syntax Annotation for the GENIA corpus","author":"Y Tateisi","year":"2005","unstructured":"Tateisi Y, Yakushiji A, Ohta T, Tsujii J: Syntax Annotation for the GENIA corpus. 2005, Companion volume: pp. 222--227."},{"key":"1697_CR25","volume-title":"Bracketing Guidelines for Treebank II Style Penn Treebank Project","author":"A Bies","year":"1995","unstructured":"Bies A, Ferguson M, Katz K, MacIntyre R, Tredinnick V, Kim G, Marcinkiewicz MA, Schasberger B: Bracketing Guidelines for Treebank II Style Penn Treebank Project . 1995"},{"key":"1697_CR26","doi-asserted-by":"publisher","first-page":"155","DOI":"10.1186\/1471-2105-5-155","volume":"5","author":"T Wattarujeekrit","year":"2004","unstructured":"Wattarujeekrit T, Shah PK, Collier N: PASBio: predicate-argument structures for event extraction in molecular biology. BMC Bioinformatics. 
2004, 5: 155-10.1186\/1471-2105-5-155.","journal-title":"BMC Bioinformatics"},{"key":"1697_CR27","doi-asserted-by":"publisher","DOI":"10.3115\/1706543.1706571","volume-title":"Introduction to the CoNLL-2005 Shared Task: Semantic Role Labeling","author":"X Carreras","year":"2005","unstructured":"Carreras X, M\u00e0rquez L: Introduction to the CoNLL-2005 Shared Task: Semantic Role Labeling. 2005"},{"issue":"3","key":"1697_CR28","doi-asserted-by":"publisher","first-page":"547\u2013619","DOI":"10.1353\/lan.1991.0021","volume":"67","author":"DR Dowty","year":"1991","unstructured":"Dowty DR: Thematic proto-roles and argument selection. Language. 1991, 67 (3): 547\u2013619-10.2307\/415037.","journal-title":"Language"},{"issue":"13","key":"1697_CR29","doi-asserted-by":"publisher","first-page":"1668","DOI":"10.1093\/bioinformatics\/btl159","volume":"22","author":"X Yuan","year":"2006","unstructured":"Yuan X, Hu ZZ, Wu HT, Torii M, Narayanaswamy M, Ravikumar KE, Vijay-Shanker K, Wu CH: An online literature mining tool for protein phosphorylation. Bioinformatics. 2006, 22 (13): 1668-1669. 10.1093\/bioinformatics\/btl159.","journal-title":"Bioinformatics"},{"key":"1697_CR30","volume-title":"Nonparametric statistics for the behavioral sciences","author":"S Siegel","year":"1988","unstructured":"Siegel S, Castellan JN: Nonparametric statistics for the behavioral sciences. 1988, Boston, MA , McGraw Hill"},{"issue":"7","key":"1697_CR31","doi-asserted-by":"publisher","first-page":"857","DOI":"10.1093\/bioinformatics\/btk044","volume":"22","author":"PK Shah","year":"2006","unstructured":"Shah PK, Bork P: LSAT: learning about alternative transcripts in MEDLINE. BMC Bioinformatics. 
2006, 22 (7): 857-865.","journal-title":"BMC Bioinformatics"},{"key":"1697_CR32","series-title":"Springer Series on Computational Biology","volume-title":"Artificial Intelligence and Systems Biology","author":"KB Cohen","year":"2005","unstructured":"Cohen KB, Hunter L: Natural Language Processing and Systems Biology Artificial Intelligence and Systems Biology. Springer Series on Computational Biology Edited by: Dubitzky W, Azuaje F. 2005, Springer,,"},{"key":"1697_CR33","first-page":"410","volume-title":"Towards semantic role labeling & IE in the medical literature","author":"Y Kogan","year":"2005","unstructured":"Kogan Y, Collier N, Pakhomov S, Krauthammer M: Towards semantic role labeling & IE in the medical literature. 2005, 410-414."},{"key":"1697_CR34","volume-title":"BIOSMILE: Adapting Semantic Role Labeling for Biomedical Verbs: An Exponential Model Coupled with Automatically Generated Template Features:","author":"RTH Tsai","year":"2006","unstructured":"Tsai RTH, Chou WC, Lin YC, Ku W, Su YS, Sung TY, Hsu WL: BIOSMILE: Adapting Semantic Role Labeling for Biomedical Verbs: An Exponential Model Coupled with Automatically Generated Template Features: New York.2006, ,"},{"key":"1697_CR35","doi-asserted-by":"publisher","first-page":"195","DOI":"10.1016\/0022-2836(81)90087-5","volume":"147","author":"TF Smith","year":"1981","unstructured":"Smith TF, Waterman MS: Identification of common molecular subsequences. Journal of Molecular Biology. 1981, 147: 195-197. 10.1016\/0022-2836(81)90087-5.","journal-title":"Journal of Molecular Biology"},{"key":"1697_CR36","volume-title":"Semantic Role Labeling by Tagging Syntactic Chunks","author":"K Hacioglu","year":"2004","unstructured":"Hacioglu K, Pradhan S, Ward W, Martin JH, Jurafsky D: Semantic Role Labeling by Tagging Syntactic Chunks. 
2004"},{"key":"1697_CR37","volume-title":"The Annals of Mathematical Statistics","author":"JN Darroch","year":"1972","unstructured":"Darroch JN, Ratcliff D: Generalized Iterative Scaling for Log-Linear Models. The Annals of Mathematical Statistics. 1972"},{"key":"1697_CR38","doi-asserted-by":"publisher","DOI":"10.1007\/b98874","volume-title":"Numerical Optimization","author":"J Nocedal","year":"1999","unstructured":"Nocedal J, Wright SJ: Numerical Optimization. 1999, Springer"},{"key":"1697_CR39","first-page":"8","volume-title":"Using Predicate-Argument Structures for Information Extraction","author":"M Surdeanu","year":"2003","unstructured":"Surdeanu M, Harabagiu SM, Williams J, Aarseth P: Using Predicate-Argument Structures for Information Extraction. 2003, 8-15."}],"container-title":["BMC Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/1471-2105-8-325.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2021,9,1]],"date-time":"2021-09-01T01:55:54Z","timestamp":1630461354000},"score":1,"resource":{"primary":{"URL":"https:\/\/bmcbioinformatics.biomedcentral.com\/articles\/10.1186\/1471-2105-8-325"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2007,9,1]]},"references-count":39,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2007,12]]}},"alternative-id":["1697"],"URL":"https:\/\/doi.org\/10.1186\/1471-2105-8-325","relation":{},"ISSN":["1471-2105"],"issn-type":[{"value":"1471-2105","type":"electronic"}],"subject":[],"published":{"date-parts":[[2007,9,1]]},"assertion":[{"value":"20 November 2006","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"1 September 2007","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"1 September 
2007","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}}],"article-number":"325"}}