{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,5,27]],"date-time":"2025-05-27T17:06:45Z","timestamp":1748365605264,"version":"3.37.3"},"reference-count":36,"publisher":"Oxford University Press (OUP)","issue":"13","license":[{"start":{"date-parts":[[2018,6,27]],"date-time":"2018-06-27T00:00:00Z","timestamp":1530057600000},"content-version":"vor","delay-in-days":0,"URL":"http:\/\/creativecommons.org\/licenses\/by-nc\/4.0\/"}],"funder":[{"DOI":"10.13039\/100000001","name":"National Science Foundation","doi-asserted-by":"publisher","award":["DBI-1458477"],"award-info":[{"award-number":["DBI-1458477"]}],"id":[{"id":"10.13039\/100000001","id-type":"DOI","asserted-by":"publisher"}]},{"name":"Indiana University Precision Health Initiative"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2018,7,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:sec>\n                  <jats:title>Motivation<\/jats:title>\n                  <jats:p>Modern problems of concept annotation associate an object of interest (gene, individual, text document) with a set of interrelated textual descriptors (functions, diseases, topics), often organized in concept hierarchies or ontologies. Most ontology can be seen as directed acyclic graphs (DAGs), where nodes represent concepts and edges represent relational ties between these concepts. Given an ontology graph, each object can only be annotated by a consistent sub-graph; that is, a sub-graph such that if an object is annotated by a particular concept, it must also be annotated by all other concepts that generalize it. Ontologies therefore provide a compact representation of a large space of possible consistent sub-graphs; however, until now we have not been aware of a practical algorithm that can enumerate such annotation spaces for a given ontology.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Results<\/jats:title>\n                  <jats:p>We propose an algorithm for enumerating consistent sub-graphs of DAGs. The algorithm recursively partitions the graph into strictly smaller graphs until the resulting graph becomes a rooted tree (forest), for which a linear-time solution is computed. It then combines the tallies from graphs created in the recursion to obtain the final count. We prove the correctness of this algorithm, propose several practical accelerations, evaluate it on random graphs and then apply it to characterize four major biomedical ontologies. We believe this work provides valuable insights into the complexity of concept annotation spaces and its potential influence on the predictability of ontological annotation.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Availability and implementation<\/jats:title>\n                  <jats:p>https:\/\/github.com\/shawn-peng\/counting-consistent-sub-DAG<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Supplementary information<\/jats:title>\n                  <jats:p>Supplementary data are available at Bioinformatics online.<\/jats:p>\n               <\/jats:sec>","DOI":"10.1093\/bioinformatics\/bty268","type":"journal-article","created":{"date-parts":[[2018,4,16]],"date-time":"2018-04-16T19:11:48Z","timestamp":1523905908000},"page":"i313-i322","source":"Crossref","is-referenced-by-count":9,"title":["Enumerating consistent sub-graphs of directed acyclic graphs: an insight into biomedical ontologies"],"prefix":"10.1093","volume":"34","author":[{"given":"Yisu","family":"Peng","sequence":"first","affiliation":[{"name":"Department of Computer Science, Indiana University, Bloomington, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Yuxiang","family":"Jiang","sequence":"additional","affiliation":[{"name":"Department of Computer Science, Indiana University, Bloomington, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Predrag","family":"Radivojac","sequence":"additional","affiliation":[{"name":"Department of Computer Science, Indiana University, Bloomington, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"286","published-online":{"date-parts":[[2018,6,27]]},"reference":[{"key":"2023051604223389400_bty268-B1","doi-asserted-by":"crossref","first-page":"25","DOI":"10.1038\/75556","article-title":"Gene ontology: tool for the unification of biology. The Gene Ontology Consortium","volume":"25","author":"Ashburner","year":"2000","journal-title":"Nat. Genet"},{"key":"2023051604223389400_bty268-B2","first-page":"101","article-title":"Strength in numbers: exploring redundancy in hierarchical relations across biomedical terminologies","author":"Bodenreider","year":"2003","journal-title":"AMIA Annu. Symp. Proc"},{"key":"2023051604223389400_bty268-B3","first-page":"376","article-title":"A theorem on trees","volume":"23","author":"Cayley","year":"1889","journal-title":"Quart. J. Math"},{"key":"2023051604223389400_bty268-B4","doi-asserted-by":"crossref","first-page":"i53","DOI":"10.1093\/bioinformatics\/btt228","article-title":"Information-theoretic evaluation of predicted ontological annotations","volume":"29","author":"Clark","year":"2013","journal-title":"Bioinformatics"},{"key":"2023051604223389400_bty268-B5","doi-asserted-by":"crossref","first-page":"133","DOI":"10.1007\/978-1-4939-3743-1_10","article-title":"Community-wide evaluation of computational function prediction","volume":"1446","author":"Friedberg","year":"2017","journal-title":"Methods Mol. Biol"},{"key":"2023051604223389400_bty268-B6","doi-asserted-by":"crossref","first-page":"253","DOI":"10.1016\/0012-365X(95)00119-H","article-title":"Counting acyclic digraphs by sources and sinks","volume":"160","author":"Gessel","year":"1996","journal-title":"Discrete Math"},{"volume-title":"Handbook of Graph Theory","year":"2004","author":"Gross","key":"2023051604223389400_bty268-B7"},{"year":"2014","author":"Grosshans","key":"2023051604223389400_bty268-B8"},{"key":"2023051604223389400_bty268-B9","doi-asserted-by":"crossref","first-page":"D1057","DOI":"10.1093\/nar\/gku1113","article-title":"The GOA database: gene Ontology annotation updates for 2015","volume":"43","author":"Huntley","year":"2015","journal-title":"Nucleic Acids Res"},{"key":"2023051604223389400_bty268-B10","doi-asserted-by":"crossref","first-page":"i609","DOI":"10.1093\/bioinformatics\/btu472","article-title":"The impact of incomplete knowledge on the evaluation of protein function prediction: a structured-output learning perspective","volume":"30","author":"Jiang","year":"2014","journal-title":"Bioinformatics"},{"key":"2023051604223389400_bty268-B11","doi-asserted-by":"crossref","first-page":"184.","DOI":"10.1186\/s13059-016-1037-6","article-title":"An expanded evaluation of protein function prediction methods shows an improvement in accuracy","volume":"17","author":"Jiang","year":"2016","journal-title":"Genome Biol"},{"key":"2023051604223389400_bty268-B12","doi-asserted-by":"crossref","first-page":"27","DOI":"10.1007\/s10994-009-5108-8","article-title":"Cutting-plane training of structural SVMs","volume":"77","author":"Joachims","year":"2009","journal-title":"Mach. Learn"},{"key":"2023051604223389400_bty268-B13","doi-asserted-by":"crossref","first-page":"i169","DOI":"10.1093\/bioinformatics\/bth921","article-title":"The gene ontology categorizer","volume":"20","author":"Joslyn","year":"2004","journal-title":"Bioinformatics"},{"key":"2023051604223389400_bty268-B14","doi-asserted-by":"crossref","first-page":"1275","DOI":"10.1093\/bioinformatics\/btg153","article-title":"Investigating semantic similarity measures across the Gene Ontology: the relationship between sequence and annotation","volume":"19","author":"Lord","year":"2003","journal-title":"Bioinformatics"},{"key":"2023051604223389400_bty268-B15","article-title":"Acyclic digraphs and eigenvalues of (0, 1)-matrices","volume":"7","author":"McKay","year":"2004","journal-title":"J. Integer Seq"},{"key":"2023051604223389400_bty268-B16","doi-asserted-by":"crossref","first-page":"523","DOI":"10.1038\/nrg3253","article-title":"Computational tools for prioritizing candidate genes: boosting disease gene discovery","volume":"13","author":"Moreau","year":"2012","journal-title":"Nat. Rev. Genet"},{"year":"2015","author":"Movshovitz-Attias","key":"2023051604223389400_bty268-B17"},{"key":"2023051604223389400_bty268-B18","doi-asserted-by":"crossref","first-page":"161","DOI":"10.1007\/978-1-4939-3743-1_12","article-title":"Semantic similarity in the Gene Ontology","volume":"1446","author":"Pesquita","year":"2017","journal-title":"Methods Mol. Biol"},{"key":"2023051604223389400_bty268-B19","doi-asserted-by":"crossref","first-page":"41","DOI":"10.1007\/978-1-4939-3743-1_4","article-title":"Best practices in manual annotation with the Gene Ontology","volume":"1446","author":"Poux","year":"2017","journal-title":"Methods Mol. Biol"},{"key":"2023051604223389400_bty268-B20","doi-asserted-by":"crossref","first-page":"221","DOI":"10.1038\/nmeth.2340","article-title":"A large-scale evaluation of computational protein function prediction","volume":"10","author":"Radivojac","year":"2013","journal-title":"Nat. Methods"},{"key":"2023051604223389400_bty268-B21","doi-asserted-by":"crossref","DOI":"10.1201\/b10967","volume-title":"Introduction to Bio-Ontologies","author":"Robinson","year":"2011"},{"key":"2023051604223389400_bty268-B22","doi-asserted-by":"crossref","first-page":"525","DOI":"10.1111\/j.1399-0004.2010.01436.x","article-title":"The human phenotype ontology","volume":"77","author":"Robinson","year":"2010","journal-title":"Clin. Genet"},{"year":"1971","author":"Robinson","key":"2023051604223389400_bty268-B23"},{"key":"2023051604223389400_bty268-B24","doi-asserted-by":"crossref","first-page":"319","DOI":"10.1016\/0012-365X(92)90155-9","article-title":"On the number of labeled acyclic digraphs","volume":"105","author":"Rodionov","year":"1992","journal-title":"Discrete Math"},{"key":"2023051604223389400_bty268-B25","doi-asserted-by":"crossref","first-page":"141","DOI":"10.1137\/0210011","article-title":"Listing and counting subtrees of a tree","volume":"10","author":"Ruskey","year":"1981","journal-title":"SIAM J. Comput"},{"key":"2023051604223389400_bty268-B26","doi-asserted-by":"crossref","first-page":"117","DOI":"10.1016\/S0306-4379(00)00012-0","article-title":"Analyzing process models using graph reduction techniques","volume":"25","author":"Sadiq","year":"2000","journal-title":"Inf. Syst. J"},{"key":"2023051604223389400_bty268-B27","doi-asserted-by":"crossref","first-page":"e1000605.","DOI":"10.1371\/journal.pcbi.1000605","article-title":"Annotation error in public databases: misannotation of molecular function in enzyme superfamilies","volume":"5","author":"Schnoes","year":"2009","journal-title":"PLoS Comput. Biol"},{"key":"2023051604223389400_bty268-B28","doi-asserted-by":"crossref","first-page":"e1003063.","DOI":"10.1371\/journal.pcbi.1003063","article-title":"Biases in the experimental annotations of protein function and their effect on our understanding of protein function space","volume":"9","author":"Schnoes","year":"2013","journal-title":"PLoS Comput. Biol"},{"key":"2023051604223389400_bty268-B29","doi-asserted-by":"crossref","first-page":"357","DOI":"10.1142\/S0219720010004744","article-title":"Hierarchical classification of gene ontology terms using the GOstruct method","volume":"8","author":"Sokolov","year":"2010","journal-title":"J. Bioinform. Comput. Biol"},{"key":"2023051604223389400_bty268-B30","doi-asserted-by":"crossref","first-page":"171","DOI":"10.1016\/0012-365X(73)90108-8","article-title":"Acyclic orientations of graphs","volume":"5","author":"Stanley","year":"1973","journal-title":"Discrete Math"},{"key":"2023051604223389400_bty268-B31","doi-asserted-by":"crossref","first-page":"1544","DOI":"10.1110\/ps.062184006","article-title":"A categorization approach to automated ontological function annotation","volume":"15","author":"Verspoor","year":"2006","journal-title":"Protein Sci"},{"key":"2023051604223389400_bty268-B32","doi-asserted-by":"crossref","first-page":"i77","DOI":"10.1093\/bioinformatics\/btp195","article-title":"Ontology quality assurance through analysis of term transformations","volume":"25","author":"Verspoor","year":"2009","journal-title":"Bioinformatics"},{"key":"2023051604223389400_bty268-B33","doi-asserted-by":"crossref","first-page":"356","DOI":"10.1101\/gr.157495.113","article-title":"Variation Ontology for annotation of variation effects and mechanisms","volume":"24","author":"Vihinen","year":"2014","journal-title":"Genome Res"},{"key":"2023051604223389400_bty268-B34","doi-asserted-by":"crossref","first-page":"31.","DOI":"10.1186\/s13040-016-0110-8","article-title":"FEDRR: fast, exhaustive detection of redundant hierarchical relations for quality improvement of large biomedical ontologies","volume":"9","author":"Xing","year":"2016","journal-title":"BioData Min"},{"key":"2023051604223389400_bty268-B35","doi-asserted-by":"crossref","first-page":"256","DOI":"10.1016\/j.tcs.2006.09.002","article-title":"Enumeration of subtrees of trees","volume":"369","author":"Yan","year":"2006","journal-title":"Theor. Comput. Sci"},{"year":"2010","author":"Zaharia","key":"2023051604223389400_bty268-B36"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/34\/13\/i313\/50315580\/bioinformatics_34_13_i313.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/34\/13\/i313\/50315580\/bioinformatics_34_13_i313.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,5,16]],"date-time":"2023-05-16T04:23:27Z","timestamp":1684211007000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/34\/13\/i313\/5045754"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2018,6,27]]},"references-count":36,"journal-issue":{"issue":"13","published-print":{"date-parts":[[2018,7,1]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/bty268","relation":{},"ISSN":["1367-4803","1367-4811"],"issn-type":[{"type":"print","value":"1367-4803"},{"type":"electronic","value":"1367-4811"}],"subject":[],"published-other":{"date-parts":[[2018,7,1]]},"published":{"date-parts":[[2018,6,27]]}}}