{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,9,3]],"date-time":"2025-09-03T10:03:23Z","timestamp":1756893803454},"reference-count":33,"publisher":"Oxford University Press (OUP)","issue":"16","license":[{"start":{"date-parts":[[2016,10,2]],"date-time":"2016-10-02T00:00:00Z","timestamp":1475366400000},"content-version":"vor","delay-in-days":1568,"URL":"http:\/\/creativecommons.org\/licenses\/by-nc\/3.0"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2012,8,15]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>Motivation: Although the amount of data in biology is rapidly increasing, critical information for understanding biological events like phosphorylation or gene expression remains locked in the biomedical literature. Most current text mining (TM) approaches to extract information about biological events are focused on either limited-scale studies and\/or abstracts, with data extracted lacking context and rarely available to support further research.<\/jats:p>\n               <jats:p>Results: Here we present BioContext, an integrated TM system which extracts, extends and integrates results from a number of tools performing entity recognition, biomolecular event extraction and contextualization. Application of our system to 10.9 million MEDLINE abstracts and 234 000 open-access full-text articles from PubMed Central yielded over 36 million mentions representing 11.4 million distinct events. Event participants included over 290 000 distinct genes\/proteins that are mentioned more than 80 million times and linked where possible to Entrez Gene identifiers. Over a third of events contain contextual information such as the anatomical location of the event occurrence or whether the event is reported as negated or speculative.<\/jats:p>\n               <jats:p>Availability: The BioContext pipeline is available for download (under the BSD license) at http:\/\/www.biocontext.org, along with the extracted data which is also available for online browsing.<\/jats:p>\n               <jats:p>Contact: \u00a0martin.gerner@gmail.com<\/jats:p>\n               <jats:p>Supplementary information: \u00a0Supplementary data are available at Bioinformatics online.<\/jats:p>","DOI":"10.1093\/bioinformatics\/bts332","type":"journal-article","created":{"date-parts":[[2012,6,19]],"date-time":"2012-06-19T00:19:39Z","timestamp":1340065179000},"page":"2154-2161","source":"Crossref","is-referenced-by-count":46,"title":["BioContext: an integrated text mining system for large-scale extraction and contextualization of biomolecular events"],"prefix":"10.1093","volume":"28","author":[{"given":"Martin","family":"Gerner","sequence":"first","affiliation":[{"name":"1 Faculty of Life Sciences, University of Manchester, Manchester, M13 9PT and 2School of Computer Science, University of Manchester, Manchester, M13 9PL, UK"}]},{"given":"Farzaneh","family":"Sarafraz","sequence":"additional","affiliation":[{"name":"1 Faculty of Life Sciences, University of Manchester, Manchester, M13 9PT and 2School of Computer Science, University of Manchester, Manchester, M13 9PL, UK"}]},{"given":"Casey M.","family":"Bergman","sequence":"additional","affiliation":[{"name":"1 Faculty of Life Sciences, University of Manchester, Manchester, M13 9PT and 2School of Computer Science, University of Manchester, Manchester, M13 9PL, UK"}]},{"given":"Goran","family":"Nenadic","sequence":"additional","affiliation":[{"name":"1 Faculty of Life Sciences, University of Manchester, Manchester, M13 9PT and 2School of Computer Science, University of Manchester, Manchester, M13 9PL, UK"}]}],"member":"286","published-online":{"date-parts":[[2012,6,17]]},"reference":[{"key":"2023012512524487900_B1","doi-asserted-by":"crossref","first-page":"e24716","DOI":"10.1371\/journal.pone.0024716","article-title":"pubmed2ensembl: a resource for mining the biological literature on genes","volume":"6","author":"Baran","year":"2011","journal-title":"PLoS ONE"},{"key":"2023012512524487900_B2","doi-asserted-by":"crossref","first-page":"10","DOI":"10.3115\/1572340.1572343","article-title":"Extracting complex biological events with rich graph-based feature sets","volume-title":"Proceedings of the Workshop on BioNLP: Shared Task.","author":"Bj\u00f6rne","year":"2009"},{"key":"2023012512524487900_B3","doi-asserted-by":"crossref","first-page":"i382","DOI":"10.1093\/bioinformatics\/btq180","article-title":"Complex event extraction at PubMed scale","volume":"26","author":"Bj\u00f6rne","year":"2010","journal-title":"Bioinformatics"},{"key":"2023012512524487900_B4","first-page":"28","article-title":"Scaling up Biomedical Event Extraction to the Entire PubMed","volume-title":"BioNLP 2010","author":"Bj\u00f6rne","year":"2010"},{"key":"2023012512524487900_B5","doi-asserted-by":"crossref","first-page":"173","DOI":"10.1016\/j.jbi.2009.11.001","article-title":"Beyond genes, proteins, and abstracts: identifying scientific claims from full-text biomedical articles","volume":"43","author":"Blake","year":"2010","journal-title":"J. Biomed. Inform."},{"key":"2023012512524487900_B6","doi-asserted-by":"crossref","first-page":"D532","DOI":"10.1093\/nar\/gkp983","article-title":"MINT, the molecular interaction database: 2009 update","volume":"38","author":"Ceol","year":"2009","journal-title":"Nucleic Acids Res."},{"key":"2023012512524487900_B7","volume-title":"Processing with GATE.","author":"Cunningham","year":"2011"},{"key":"2023012512524487900_B8","doi-asserted-by":"crossref","first-page":"85","DOI":"10.1186\/1471-2105-11-85","article-title":"LINNAEUS: a species name identification system for biomedical literature","volume":"11","author":"Gerner","year":"2010","journal-title":"BMC Bioinformatics"},{"key":"2023012512524487900_B9","first-page":"72","article-title":"An exploration of mining gene expression mentions and their anatomical locations from biomedical text","volume-title":"Proceedings of the BioNLP workshop","author":"Gerner","year":"2010"},{"key":"2023012512524487900_B10","doi-asserted-by":"crossref","first-page":"i126","DOI":"10.1093\/bioinformatics\/btn299","article-title":"Inter-species normalization of gene mentions with GNAT","volume":"24","author":"Hakenberg","year":"2008","journal-title":"Bioinformatics"},{"key":"2023012512524487900_B11","doi-asserted-by":"crossref","first-page":"2769","DOI":"10.1093\/bioinformatics\/btr455","article-title":"The GNAT library for local and remote gene mention normalization","volume":"27","author":"Hakenberg","year":"2011","journal-title":"Bioinformatics"},{"key":"2023012512524487900_B12","doi-asserted-by":"crossref","first-page":"17","DOI":"10.1186\/1758-2946-3-17","article-title":"ChemicalTagger: a tool for semantic text-mining in chemistry","volume":"3","author":"Hawizy","year":"2011","journal-title":"J. Cheminform."},{"key":"2023012512524487900_B13","doi-asserted-by":"crossref","first-page":"1032","DOI":"10.1093\/bioinformatics\/btr042","article-title":"GeneTUKit: a software for document-level gene normalization","volume":"27","author":"Huang","year":"2011","journal-title":"Bioinformatics"},{"key":"2023012512524487900_B14","doi-asserted-by":"crossref","first-page":"481","DOI":"10.1186\/1471-2105-12-481","article-title":"U-Compare bio-event meta-service: compatible BioNLP event extraction services","volume":"12","author":"Kano","year":"2011","journal-title":"BMC Bioinformatics"},{"key":"2023012512524487900_B15","doi-asserted-by":"crossref","first-page":"10","DOI":"10.1186\/1471-2105-9-10","article-title":"Corpus annotation for mining biomedical events from literature","volume":"9","author":"Kim","year":"2008","journal-title":"BMC Bioinformatics"},{"key":"2023012512524487900_B16","doi-asserted-by":"crossref","first-page":"1","DOI":"10.3115\/1572340.1572342","article-title":"Overview of BioNLP'09 shared task on event extraction","volume-title":"Proceedings of the Workshop on BioNLP: Shared Task","author":"Kim","year":"2009"},{"key":"2023012512524487900_B17","first-page":"1","article-title":"Overview of Genia event task in BioNLP Shared Task 2011","volume-title":"BioNLP Shared Task 2011","author":"Kim","year":"2011"},{"key":"2023012512524487900_B18","doi-asserted-by":"crossref","first-page":"S4","DOI":"10.1186\/gb-2008-9-s2-s4","article-title":"Overview of the protein-protein interaction annotation extraction task of BioCreative II","volume":"9","author":"Krallinger","year":"2008","journal-title":"Genome Biol."},{"key":"2023012512524487900_B19","doi-asserted-by":"crossref","first-page":"S8","DOI":"10.1186\/gb-2008-9-s2-s8","article-title":"Linking genes to literature: text mining, information extraction, and retrieval applications for biology","volume":"9","author":"Krallinger","year":"2008","journal-title":"Genome Biol."},{"key":"2023012512524487900_B20","first-page":"652","article-title":"BANNER: an executable survey of advances in biomedical named entity recognition","volume-title":"Pacific Symp. on Biocomputing.","author":"Leaman","year":"2008"},{"key":"2023012512524487900_B21","doi-asserted-by":"crossref","first-page":"baq036","DOI":"10.1093\/database\/baq036","article-title":"PubMed and beyond: a survey of web tools for searching biomedical literature","volume":"2011","author":"Lu","year":"2011","journal-title":"Database"},{"key":"2023012512524487900_B22","first-page":"152","article-title":"Effective self-training for parsing","volume-title":"HLT-NAACL","author":"McClosky","year":"2006"},{"key":"2023012512524487900_B23","first-page":"1626","article-title":"Event extraction as dependency parsing","volume-title":"Association for Computational Linguistics - Human Language Technologies 2011 Conference (ACL-HLT 2011)","author":"McClosky","year":"2011"},{"key":"2023012512524487900_B24","first-page":"779","article-title":"Evaluating dependency representation for event extraction","volume-title":"The 23rd International Conference on Computational Linguistics (COLING 2010).","author":"Miwa","year":"2010"},{"key":"2023012512524487900_B25","first-page":"106","article-title":"Incorporating GENETAG-style annotation to GENIA corpus","volume-title":"BioNLP Workshop.","author":"Ohta","year":"2009"},{"key":"2023012512524487900_B26","first-page":"1044","article-title":"Dependency parsing and domain adaptation with LR models and parser ensembles","volume-title":"CoNLL 2007 Shared Task. Joint Conferences on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL'07).","author":"Sagae","year":"2007"},{"key":"2023012512524487900_B27","first-page":"545","article-title":"Comparative parser performance analysis across grammar frameworks through automatic tree conversion using synchronous grammars","volume-title":"COLING 2008","author":"Sagae","year":"2008"},{"key":"2023012512524487900_B28","first-page":"115","article-title":"Biomedical event detection using rules, conditional random fields and parse tree distances","volume-title":"BioNLP Workshop.","author":"Sarafraz","year":"2009"},{"key":"2023012512524487900_B29","article-title":"Using SVMs with the command relation features to identify negated events in biomedical literature","volume-title":"The Workshop on Negation and Speculation in Natural Language Processing.","author":"Sarafraz","year":"2010"},{"key":"2023012512524487900_B30","doi-asserted-by":"crossref","first-page":"3191","DOI":"10.1093\/bioinformatics\/bti475","article-title":"ABNER: an open source tool for automatically tagging genes, proteins, and other entity names in text","volume":"21","author":"Settles","year":"2005","journal-title":"Bioinformatics"},{"key":"2023012512524487900_B31","first-page":"137","article-title":"Gene mention normalization in full texts using GNAT and LINNAEUS","volume-title":"Proceedings of the BioCreative III Workshop.","author":"Solt","year":"2010"},{"key":"2023012512524487900_B32","doi-asserted-by":"crossref","first-page":"D561","DOI":"10.1093\/nar\/gkq973","article-title":"The STRING database in 2011: functional interaction networks of proteins, globally integrated and scored","volume":"39","author":"Szklarczyk","year":"2011","journal-title":"Nucleic Acids Res."},{"key":"2023012512524487900_B33","doi-asserted-by":"crossref","first-page":"393","DOI":"10.1016\/j.jbi.2007.11.008","article-title":"Extracting interactions between proteins from the literature","volume":"41","author":"Zhou","year":"2008","journal-title":"J. Biomed. Inform."}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/28\/16\/2154\/48869938\/bioinformatics_28_16_2154.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/28\/16\/2154\/48869938\/bioinformatics_28_16_2154.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,1,25]],"date-time":"2023-01-25T17:50:40Z","timestamp":1674669040000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/28\/16\/2154\/323694"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2012,6,17]]},"references-count":33,"journal-issue":{"issue":"16","published-print":{"date-parts":[[2012,8,15]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/bts332","relation":{},"ISSN":["1367-4811","1367-4803"],"issn-type":[{"value":"1367-4811","type":"electronic"},{"value":"1367-4803","type":"print"}],"subject":[],"published-other":{"date-parts":[[2012,8,15]]},"published":{"date-parts":[[2012,6,17]]}}}