{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,9,30]],"date-time":"2025-09-30T04:08:51Z","timestamp":1759205331157,"version":"3.37.3"},"reference-count":42,"publisher":"Oxford University Press (OUP)","issue":"12","license":[{"start":{"date-parts":[[2016,10,2]],"date-time":"2016-10-02T00:00:00Z","timestamp":1475366400000},"content-version":"vor","delay-in-days":2315,"URL":"http:\/\/creativecommons.org\/licenses\/by-nc\/2.0\/uk\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2010,6,15]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Motivation: Ontologies and taxonomies have proven highly beneficial for biocuration. The Open Biomedical Ontology (OBO) Foundry alone lists over 90 ontologies mainly built with OBO-Edit. Creating and maintaining such ontologies is a labour-intensive, difficult, manual process. Automating parts of it is of great importance for the further development of ontologies and for biocuration.<\/jats:p><jats:p>Results: We have developed the Dresden Ontology Generator for Directed Acyclic Graphs (DOG4DAG), a system which supports the creation and extension of OBO ontologies by semi-automatically generating terms, definitions and parent\u2013child relations from text in PubMed, the web and PDF repositories. DOG4DAG is seamlessly integrated into OBO-Edit. It generates terms by identifying statistically significant noun phrases in text. For definitions and parent\u2013child relations it employs pattern-based web searches. We systematically evaluate each generation step using manually validated benchmarks. The term generation leads to high-quality terms also found in manually created ontologies. Up to 78% of definitions are valid and up to 54% of child\u2013ancestor relations can be retrieved. 
There is no other validated system that achieves comparable results.<\/jats:p><jats:p>By combining the prediction of high-quality terms, definitions and parent\u2013child relations with the ontology editor OBO-Edit we contribute a thoroughly validated tool for all OBO ontology engineers.<\/jats:p><jats:p>Availability: DOG4DAG is available within OBO-Edit 2.1 at http:\/\/www.oboedit.org<\/jats:p><jats:p>Contact: \u00a0thomas.waechter@biotec.tu-dresden.de;<\/jats:p><jats:p>Supplementary Information: \u00a0Supplementary data are available at Bioinformatics online.<\/jats:p>","DOI":"10.1093\/bioinformatics\/btq188","type":"journal-article","created":{"date-parts":[[2010,6,7]],"date-time":"2010-06-07T07:28:13Z","timestamp":1275895693000},"page":"i88-i96","source":"Crossref","is-referenced-by-count":26,"title":["Semi-automated ontology generation within OBO-Edit"],"prefix":"10.1093","volume":"26","author":[{"given":"Thomas","family":"W\u00e4chter","sequence":"first","affiliation":[{"name":"Biotechnology Center (BIOTEC), Technische Universit\u00e4t Dresden, 01062 Dresden, Germany"}]},{"given":"Michael","family":"Schroeder","sequence":"additional","affiliation":[{"name":"Biotechnology Center (BIOTEC), Technische Universit\u00e4t Dresden, 01062 Dresden, Germany"}]}],"member":"286","published-online":{"date-parts":[[2010,6,1]]},"reference":[{"issue":"Suppl. 4","key":"2023012508092309300_B1","doi-asserted-by":"crossref","first-page":"S2","DOI":"10.1186\/1471-2105-9-S4-S2","article-title":"Terminologies for text-mining; an experiment in the lipoprotein metabolism domain","volume":"9","author":"Alexopoulou","year":"2008","journal-title":"BMC Bioinformatics"},{"issue":"Suppl. 
5","key":"2023012508092309300_B2","doi-asserted-by":"crossref","first-page":"S1","DOI":"10.1186\/1471-2105-9-S5-S1","article-title":"Ontology design patterns for bio-ontologies: a case study on the cell cycle ontology","volume":"9","author":"Aranguren","year":"2008","journal-title":"BMC Bioinformatics"},{"key":"2023012508092309300_B3","doi-asserted-by":"crossref","first-page":"25","DOI":"10.1038\/75556","article-title":"Gene Ontology: tool for the unification of biology. The Gene Ontology Consortium","volume":"25","author":"Ashburner","year":"2000","journal-title":"Nat. Genet."},{"key":"2023012508092309300_B4","doi-asserted-by":"crossref","first-page":"256","DOI":"10.1093\/bib\/bbl027","article-title":"Bio-ontologies: current trends and future directions","volume":"7","author":"Bodenreider","year":"2006","journal-title":"Brief. Bioinform."},{"key":"2023012508092309300_B5","doi-asserted-by":"crossref","first-page":"224","DOI":"10.3115\/974147.974178","article-title":"TnT: a statistical part-of-speech tagger","volume-title":"Proceedings of the 6th Conference on Applied Natural Language Processing","author":"Brants","year":"2000"},{"key":"2023012508092309300_B6","doi-asserted-by":"crossref","first-page":"31","DOI":"10.1007\/978-3-540-25956-5_3","article-title":"A Prot\u00e9g\u00e9 plug-in for ontology extraction from text based on linguistic analysis","volume-title":"The Semantic Web: Research and Applications","author":"Buitelaar","year":"2004"},{"key":"2023012508092309300_B7","doi-asserted-by":"crossref","first-page":"120","DOI":"10.3115\/1034678.1034705","article-title":"Automatic construction of a hypernym-labeled noun hierarchy from text","volume-title":"Proceedings of the 37th annual meeting of the Association for Computational Linguistics on Computational Linguistics","author":"Caraballo","year":"1999"},{"key":"2023012508092309300_B8","first-page":"227","article-title":"Text2Onto - a framework for ontology learning and data-driven change 
discovery","volume-title":"Proceedings of the 10th International Conference on Applications of Natural Language to Information Systems (NLDB)","author":"Cimiano","year":"2005"},{"key":"2023012508092309300_B9","first-page":"305","article-title":"Learning concept hierarchies from text corpora using formal concept analysis","volume":"24","author":"Cimiano","year":"2005","journal-title":"J. Artif. Int. Res."},{"key":"2023012508092309300_B10","doi-asserted-by":"crossref","first-page":"97","DOI":"10.1186\/1471-2105-7-97","article-title":"The Ontology Lookup Service, a lightweight cross-platform tool for controlled vocabulary queries","volume":"7","author":"Cote","year":"2006","journal-title":"BMC Bioinformatics"},{"key":"2023012508092309300_B11","doi-asserted-by":"crossref","first-page":"2198","DOI":"10.1093\/bioinformatics\/btm112","article-title":"Obo-edit\u2014an ontology editor for biologists","volume":"23","author":"Day-Richter","year":"2007","journal-title":"Bioinformatics"},{"key":"2023012508092309300_B12","article-title":"Definition extraction using a sequential combination of baseline grammars and machine learning classifiers","volume-title":"Proceedings of the Sixth International Language Resources and Evaluation (LREC'08)","author":"Deg\u00f3rski","year":"2008"},{"key":"2023012508092309300_B13","doi-asserted-by":"crossref","first-page":"772","DOI":"10.6028\/NIST.SP.500-255.qa-usc-isi.hermjakob","article-title":"Multiple-engine question answering in TextMap","volume-title":"Proceedings of the 12th Text Retrieval Conference (TREC-2003)","author":"Echihabi","year":"2003"},{"key":"2023012508092309300_B14","doi-asserted-by":"crossref","first-page":"17","DOI":"10.3115\/981863.981866","article-title":"Noun-phrase analysis in unrestricted text for information retrieval","volume-title":"Proceedings of the 34th annual meeting on Association for Computational Linguistics","author":"Evans","year":"1996"},{"key":"2023012508092309300_B15","article-title":"Statistical 
measures for terminological extraction","volume-title":"Technical report","author":"Frantzi","year":"1995"},{"key":"2023012508092309300_B16","doi-asserted-by":"crossref","first-page":"115","DOI":"10.1007\/s007999900023","article-title":"Automatic recognition of multi-word terms: the C-value\/NC-value method","volume":"3","author":"Frantzi","year":"2000","journal-title":"Int J. on Dig. Lib."},{"key":"2023012508092309300_B17","doi-asserted-by":"crossref","first-page":"1601","DOI":"10.1093\/ietisy\/e89-d.4.1601","article-title":"A definitional question answering system based on phrase extraction using syntactic patterns","volume":"E89-D","author":"Han","year":"2006","journal-title":"IEICE - Trans. Inf. Syst."},{"key":"2023012508092309300_B18","doi-asserted-by":"crossref","first-page":"539","DOI":"10.3115\/992133.992154","article-title":"Automatic acquisition of hyponyms from large text corpora","volume-title":"Proceedings of the 14th conference on Computational linguistics","author":"Hearst","year":"1992"},{"key":"2023012508092309300_B19","article-title":"Collaborative creation of communal hierarchical taxonomies in social tagging systems","volume-title":"Technical Report 2006\u201310","author":"Heymann","year":"2006"},{"issue":"Suppl. 
5","key":"2023012508092309300_B20","doi-asserted-by":"crossref","first-page":"S2","DOI":"10.1186\/1471-2105-9-S5-S2","article-title":"Gene ontology annotations: what they mean and where they come from","volume":"9","author":"Hill","year":"2008","journal-title":"BMC Bioinformatics"},{"key":"2023012508092309300_B21","doi-asserted-by":"crossref","first-page":"47","DOI":"10.1038\/455047a","article-title":"Big data: The future of biocuration","volume":"455","author":"Howe","year":"2008","journal-title":"Nature"},{"key":"2023012508092309300_B22","doi-asserted-by":"crossref","first-page":"665","DOI":"10.1093\/bioinformatics\/btl010","article-title":"Automatic extension of Gene Ontology with flexible identification of candidate terms","volume":"22","author":"Lee","year":"2006","journal-title":"Bioinformatics"},{"key":"2023012508092309300_B23","doi-asserted-by":"crossref","first-page":"251","DOI":"10.1145\/775152.775188","article-title":"Mining topic-specific concepts and definitions on the web","volume-title":"WWW '03: Proceedings of the 12th international conference on World Wide Web","author":"Liu","year":"2003"},{"key":"2023012508092309300_B24","doi-asserted-by":"crossref","first-page":"e309","DOI":"10.1371\/journal.pbio.0020309","article-title":"Textpresso: an ontology-based information retrieval and extraction system for biological literature","volume":"2","author":"M\u00fcller","year":"2004","journal-title":"PLoS Biol."},{"key":"2023012508092309300_B25","doi-asserted-by":"crossref","first-page":"509","DOI":"10.1002\/cfg.435","article-title":"Obol: integrating language and meaning in bio-ontologies: conference papers","volume":"5","author":"Mungall","year":"2004","journal-title":"Comp. Funct. 
Genomics"},{"key":"2023012508092309300_B26","doi-asserted-by":"crossref","first-page":"151","DOI":"10.1162\/089120104323093276","article-title":"Learning domain ontologies from document warehouses and dedicated web sites","volume":"30","author":"Navigli","year":"2004","journal-title":"Comput. Linguist."},{"key":"2023012508092309300_B27","first-page":"214","article-title":"The compositional structure of gene ontology terms","author":"Ogren","year":"2004","journal-title":"Pacific Symposium on Biocomputing"},{"key":"2023012508092309300_B28","doi-asserted-by":"crossref","first-page":"610","DOI":"10.1016\/j.ajhg.2008.09.017","article-title":"The human phenotype ontology: a tool for annotating and analyzing human hereditary disease","volume":"83","author":"Robinson","year":"2008","journal-title":"Am. J. Hum. Genet."},{"key":"2023012508092309300_B29","first-page":"41","article-title":"Taxonomy learning using term specificity and similarity","volume-title":"Proceedings of the 2nd Workshop on Ontology Learning and Population: Bridging the Gap between Text and Knowledge","author":"Ryu","year":"2006"},{"key":"2023012508092309300_B30","doi-asserted-by":"crossref","first-page":"206","DOI":"10.1145\/312624.312679","article-title":"Deriving concept hierarchies from text","volume-title":"SIGIR '99: Proceedings of the 22nd annual international ACM SIGIR conference on Research and development in information retrieval","author":"Sanderson","year":"1999"},{"key":"2023012508092309300_B31","doi-asserted-by":"crossref","first-page":"125","DOI":"10.1186\/1471-2105-10-125","article-title":"Survey-based naming conventions for use in obo foundry ontology development","volume":"10","author":"Schober","year":"2009","journal-title":"BMC Bioinformatics"},{"key":"2023012508092309300_B32","doi-asserted-by":"crossref","first-page":"R46","DOI":"10.1186\/gb-2005-6-5-r46","article-title":"Relations in biomedical ontologies","volume":"6","author":"Smith","year":"2005","journal-title":"Genome 
Biol."},{"key":"2023012508092309300_B33","doi-asserted-by":"crossref","first-page":"1251","DOI":"10.1038\/nbt1346","article-title":"The OBO Foundry: coordinated evolution of ontologies to support biomedical data integration","volume":"25","author":"Smith","year":"2007","journal-title":"Nat. Biotechnol."},{"key":"2023012508092309300_B34","first-page":"1297","article-title":"Learning syntactic patterns for automatic hypernym discovery","volume-title":"Advances in Neural Information Processing Systems 17","author":"Snow","year":"2004"},{"key":"2023012508092309300_B35","first-page":"801","article-title":"Semantic taxonomy induction from heterogenous evidence","volume-title":"ACL-44: Proceedings of the 21st International Conference on Computational Linguistics and the 44th annual meeting of the Association for Computational Linguistics","author":"Snow","year":"2006"},{"key":"2023012508092309300_B36","doi-asserted-by":"crossref","first-page":"1095","DOI":"10.1038\/nbt0905-1095","article-title":"Are the current ontologies in biology good ontologies?","volume":"23","author":"Soldatova","year":"2005","journal-title":"Nat. 
Biotechnol."},{"key":"2023012508092309300_B37","doi-asserted-by":"crossref","first-page":"228","DOI":"10.1186\/1471-2105-10-228","article-title":"Semi-automated curation of protein subcellular localization: a text mining-based approach to gene ontology (go) cellular component curation","volume":"10","author":"Van Auken","year":"2009","journal-title":"BMC Bioinformatics"},{"key":"2023012508092309300_B38","first-page":"54","article-title":"Overview of the TREC 2003 Question Answering Track","author":"Voorhees","year":"2003","journal-title":"Proceedings of the 12th Text Retrieval Conference (TREC-2003)"},{"key":"2023012508092309300_B39","first-page":"785","article-title":"You can't beat frequency (unless you use linguistic knowledge): a qualitative evaluation of association measures for collocation and term extraction","volume-title":"ACL-44: Proceedings of the 21st International Conference on Computational Linguistics and the 44th annual meeting of the Association for Computational Linguistics","author":"Wermter","year":"2006"},{"key":"2023012508092309300_B40","doi-asserted-by":"crossref","first-page":"466","DOI":"10.1093\/bib\/bbn043","article-title":"Facts from text: can text mining help to scale-up high-quality manual curation of gene products with ontologies?","volume":"9","author":"Winnenburg","year":"2008","journal-title":"Brief. 
Bioinform."},{"key":"2023012508092309300_B41","first-page":"98","article-title":"Trec 2003 QA at BBN: Answering definitional questions","author":"Xu","year":"2003","journal-title":"Proceedings of the 12th Text Retrieval Conference (TREC-2003)"},{"key":"2023012508092309300_B42","first-page":"480","article-title":"Qualifier in TREC-12 QA main task","author":"Yang","year":"2003","journal-title":"Proceedings of the 12th Text Retrieval Conference (TREC-2003)"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/26\/12\/i88\/48859663\/bioinformatics_26_12_i88.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/26\/12\/i88\/48859663\/bioinformatics_26_12_i88.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,2,21]],"date-time":"2025-02-21T18:29:02Z","timestamp":1740162542000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/26\/12\/i88\/283065"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2010,6,1]]},"references-count":42,"journal-issue":{"issue":"12","published-print":{"date-parts":[[2010,6,15]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btq188","relation":{},"ISSN":["1367-4811","1367-4803"],"issn-type":[{"type":"electronic","value":"1367-4811"},{"type":"print","value":"1367-4803"}],"subject":[],"published-other":{"date-parts":[[2010,6,15]]},"published":{"date-parts":[[2010,6,1]]}}}