{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,14]],"date-time":"2026-02-14T00:04:35Z","timestamp":1771027475850,"version":"3.50.1"},"reference-count":40,"publisher":"Oxford University Press (OUP)","issue":"3","license":[{"start":{"date-parts":[[2022,4,14]],"date-time":"2022-04-14T00:00:00Z","timestamp":1649894400000},"content-version":"vor","delay-in-days":1,"URL":"https:\/\/creativecommons.org\/licenses\/by-nc\/4.0\/"}],"funder":[{"DOI":"10.13039\/100000002","name":"National Institutes of Health","doi-asserted-by":"publisher","award":["R01LM013335"],"award-info":[{"award-number":["R01LM013335"]}],"id":[{"id":"10.13039\/100000002","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000002","name":"National Institutes of Health","doi-asserted-by":"publisher","award":["R01NS116287"],"award-info":[{"award-number":["R01NS116287"]}],"id":[{"id":"10.13039\/100000002","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000002","name":"National Institutes of Health","doi-asserted-by":"publisher","award":["1UL1TR003167"],"award-info":[{"award-number":["1UL1TR003167"]}],"id":[{"id":"10.13039\/100000002","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100004917","name":"Cancer Prevention and Research Institute of Texas","doi-asserted-by":"publisher","award":["RP170668"],"award-info":[{"award-number":["RP170668"]}],"id":[{"id":"10.13039\/100004917","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100008982","name":"National Science Foundation","doi-asserted-by":"publisher","award":["2047001"],"award-info":[{"award-number":["2047001"]}],"id":[{"id":"10.13039\/501100008982","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2022,5,13]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>Gene Ontology (GO) is widely used in 
the biological domain. It is the most comprehensive ontology providing formal representation of gene functions (GO concepts) and relations between them. However, unintentional quality defects (e.g. missing or erroneous relations) in GO may exist due to the large size of GO concepts and complexity of GO structures. Such quality defects would impact the results of GO-based analyses and applications. In this work, we introduce a novel evidence-based lexical pattern approach for quality assurance of GO relations. We leverage two layers of evidence to suggest potentially missing relations in GO as follows. We first utilize related concept pairs (i.e. existing relations) in GO to extract relationship-specific lexical patterns, which serve as the first layer of evidence to automatically suggest potentially missing relations between unrelated concept pairs. For each suggested missing relation, we further identify two other existing relations as the second layer of evidence that resemble the difference between the missing relation and the existing relation based on which the missing relation is suggested. Applied to the 15 December 2021 release of GO, this approach suggested a total of 866 potentially missing relations. Local domain experts evaluated the entire set of potentially missing relations, and identified 821 as missing relations and 45 as indicating erroneous existing relations. We submitted these findings to the GO consortium for further validation and received encouraging feedback. 
These indicate that our evidence-based approach can be utilized to uncover missing relations and erroneous existing relations in GO.<\/jats:p>","DOI":"10.1093\/bib\/bbac122","type":"journal-article","created":{"date-parts":[[2022,3,15]],"date-time":"2022-03-15T20:18:00Z","timestamp":1647375480000},"source":"Crossref","is-referenced-by-count":7,"title":["An evidence-based lexical pattern approach for quality assurance of Gene Ontology relations"],"prefix":"10.1093","volume":"23","author":[{"given":"Rashmie","family":"Abeysinghe","sequence":"first","affiliation":[{"name":"Department of Neurology, The University of Texas Health Science Center at Houston, Houston, TX, 77030, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Yuntao","family":"Yang","sequence":"additional","affiliation":[{"name":"School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX, 77030, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Mason","family":"Bartels","sequence":"additional","affiliation":[{"name":"School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX, 77030, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"W Jim","family":"Zheng","sequence":"additional","affiliation":[{"name":"School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX, 77030, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Licong","family":"Cui","sequence":"additional","affiliation":[{"name":"School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX, 77030, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"286","published-online":{"date-parts":[[2022,4,13]]},"reference":[{"issue":"1","key":"2022051813445601100_ref1","doi-asserted-by":"crossref","first-page":"75","DOI":"10.1093\/bib\/bbm059","article-title":"Biomedical ontologies: a 
functional perspective","volume":"9","author":"Rubin","year":"2008","journal-title":"Brief Bioinform"},{"issue":"D1","key":"2022051813445601100_ref2","doi-asserted-by":"crossref","first-page":"D330","DOI":"10.1093\/nar\/gky1055","article-title":"The Gene Ontology resource: 20 years and still GOing strong","volume":"47","author":"The Gene Ontology Consortium","year":"2019","journal-title":"Nucleic Acids Res"},{"key":"2022051813445601100_ref3","volume-title":"About the GO","author":"The Gene Ontology Consortium"},{"key":"2022051813445601100_ref4","article-title":"GOLink: finding cooccurring terms across Gene Ontology namespaces. Int","volume":"2013","author":"Francis","year":"2013","journal-title":"J Genomics"},{"key":"2022051813445601100_ref5","volume-title":"Relations in the Gene Ontology","author":"The Gene Ontology Consortium"},{"key":"2022051813445601100_ref6","doi-asserted-by":"crossref","first-page":"106","DOI":"10.1016\/j.jbi.2018.09.006","article-title":"Quality assurance of biomedical terminologies and ontologies","volume":"86","author":"Geller","year":"2018","journal-title":"J Biomed Inform"},{"issue":"3","key":"2022051813445601100_ref7","doi-asserted-by":"crossref","first-page":"413","DOI":"10.1016\/j.jbi.2009.03.003","article-title":"A review of auditing methods applied to the content of controlled biomedical terminologies","volume":"42","author":"Zhu","year":"2009","journal-title":"J Biomed Inform"},{"key":"2022051813445601100_ref8","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1016\/j.jbi.2018.02.010","article-title":"Assessing the practice of biomedical ontology evaluation: Gaps and opportunities","volume":"80","author":"Amith","year":"2018","journal-title":"J Biomed Inform"},{"issue":"2","key":"2022051813445601100_ref9","doi-asserted-by":"crossref","first-page":"199","DOI":"10.1016\/j.jbi.2011.10.002","article-title":"Lexically suggest, logically define: quality assurance of the use of qualifiers and expected results of post-coordination in 
SNOMED CT","volume":"45","author":"Rector","year":"2012","journal-title":"J Biomed Inform"},{"key":"2022051813445601100_ref10","doi-asserted-by":"crossref","first-page":"59","DOI":"10.1016\/j.jbi.2018.06.008","article-title":"From lexical regularities to axiomatic patterns for the quality assurance of biomedical terminologies and ontologies","volume":"84","author":"Damme","year":"2018","journal-title":"J Biomed Inform"},{"key":"2022051813445601100_ref11","doi-asserted-by":"crossref","first-page":"476","DOI":"10.1109\/BIBM.2015.7359731","volume-title":"2015 IEEE international conference on bioinformatics and biomedicine (BIBM)","author":"Agrawal","year":"2015"},{"key":"2022051813445601100_ref12","doi-asserted-by":"crossref","first-page":"292","DOI":"10.1109\/BIBM.2017.8217666","volume-title":"2017 IEEE International Conference on Bioinformatics and Biomedicine (BIBM)","author":"Agrawal","year":"2017"},{"issue":"4","key":"2022051813445601100_ref13","first-page":"27","article-title":"Evaluating lexical similarity and modeling discrepancies in the procedure hierarchy of SNOMED CT","volume":"18","author":"Agrawal","year":"2018","journal-title":"BMC Med Inform Decis Mak"},{"key":"2022051813445601100_ref14","doi-asserted-by":"crossref","first-page":"111","DOI":"10.1016\/j.ymeth.2020.05.019","article-title":"Detecting modeling inconsistencies in SNOMED CT using a machine learning technique","volume":"179","author":"Agrawal","year":"2020","journal-title":"Methods"},{"key":"2022051813445601100_ref15","article-title":"Identifying Missing Hierarchical Relations in SNOMED CT from Logical Definitions Based on the Lexical Features of Concept Names","author":"Bodenreider","year":"2016"},{"issue":"1","key":"2022051813445601100_ref16","doi-asserted-by":"crossref","first-page":"38","DOI":"10.1038\/nbt.2463","article-title":"A Gene Ontology inferred from molecular networks","volume":"31","author":"Dutkowski","year":"2013","journal-title":"Nat 
Biotechnol"},{"issue":"1","key":"2022051813445601100_ref17","first-page":"1","article-title":"Gene Ontology enrichment improves performances of functional similarity of genes","volume":"8","author":"Liu","year":"2018","journal-title":"Sci Rep"},{"issue":"8","key":"2022051813445601100_ref18","doi-asserted-by":"crossref","first-page":"1185","DOI":"10.1093\/bioinformatics\/btv712","article-title":"Extending Gene Ontology with gene association networks","volume":"32","author":"Peng","year":"2016","journal-title":"Bioinformatics"},{"issue":"03","key":"2022051813445601100_ref19","doi-asserted-by":"crossref","first-page":"1642001","DOI":"10.1142\/S0219720016420014","article-title":"Quality assurance of the Gene Ontology using abstraction networks","volume":"14","author":"Ochs","year":"2016","journal-title":"J Bioinform Comput Biol"},{"issue":"1","key":"2022051813445601100_ref20","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1016\/j.artmed.2015.03.005","article-title":"Abstraction networks for terminologies: supporting management of \u201cbig knowledge\u201d","volume":"64","author":"Halper","year":"2015","journal-title":"Artif Intell Med"},{"issue":"3","key":"2022051813445601100_ref21","doi-asserted-by":"crossref","first-page":"628","DOI":"10.1136\/amiajnl-2014-003173","article-title":"A tribal abstraction network for SNOMED CT target hierarchies without attribute relationships","volume":"22","author":"Ochs","year":"2015","journal-title":"J Am Med Inform Assoc"},{"key":"2022051813445601100_ref22","first-page":"195","article-title":"Identifying redundant and missing relations in the Gene Ontology","volume":"210","author":"Mougin","year":"2015","journal-title":"Stud Health Technol Inform"},{"key":"2022051813445601100_ref23","doi-asserted-by":"crossref","first-page":"31","DOI":"10.1186\/s13040-016-0110-8","article-title":"FEDRR: fast, exhaustive detection of redundant hierarchical relations for quality improvement of large biomedical 
ontologies","volume":"9","author":"Xing","year":"2016","journal-title":"BioData Min"},{"key":"2022051813445601100_ref24","doi-asserted-by":"crossref","first-page":"1242","DOI":"10.1109\/BIBM.2017.8217835","volume-title":"2017 IEEE International Conference on Bioinformatics and Biomedicine (BIBM)","author":"Abeysinghe","year":"2017"},{"issue":"10","key":"2022051813445601100_ref25","doi-asserted-by":"crossref","first-page":"3207","DOI":"10.1093\/bioinformatics\/btaa106","article-title":"SSIF: Subsumption-based Sub-term Inference Framework to audit Gene Ontology","volume":"36","author":"Abeysinghe","year":"2020","journal-title":"Bioinformatics"},{"issue":"22","key":"2022051813445601100_ref26","doi-asserted-by":"crossref","first-page":"3045","DOI":"10.1093\/bioinformatics\/btp536","article-title":"QuickGO: a web-based tool for Gene Ontology searching","volume":"25","author":"Binns","year":"2009","journal-title":"Bioinformatics"},{"key":"2022051813445601100_ref27","volume-title":"spaCy: Industrial-Strength Natural Language Processing in Python","author":"Explosion"},{"key":"2022051813445601100_ref28","author":"The OBO Foundry. 
Relations Ontology"},{"issue":"1","key":"2022051813445601100_ref29","doi-asserted-by":"crossref","first-page":"10872","DOI":"10.1038\/s41598-018-28948-z","article-title":"GOATOOLS: A Python library for Gene Ontology analyses","volume":"8","author":"Klopfenstein","year":"2018","journal-title":"Sci Rep"},{"issue":"9","key":"2022051813445601100_ref30","doi-asserted-by":"crossref","first-page":"3307","DOI":"10.1210\/en.2016-1500","article-title":"CORT, Cort, B, Corticosterone, and now Cortistatin: Enough Already!","volume":"157","author":"Raff","year":"2016","journal-title":"Endocrinology"},{"key":"2022051813445601100_ref31","volume-title":"Proceedings of the ACL-02 Workshop on Effective tools and methodologies for teaching natural language processing and computational linguistics","author":"Loper"},{"key":"2022051813445601100_ref32","doi-asserted-by":"crossref","first-page":"55","DOI":"10.3115\/v1\/P14-5010","volume-title":"52nd annual meeting of the association for computational linguistics: system demonstrations","author":"Manning","year":"2014"},{"issue":"11","key":"2022051813445601100_ref33","doi-asserted-by":"crossref","DOI":"10.2196\/22333","article-title":"Automatic Structuring of Ontology Terms Based on Lexical Granularity and Machine Learning: Algorithm Development and Validation","volume":"8","author":"Luo","year":"2020","journal-title":"JMIR Med Inform"},{"issue":"4","key":"2022051813445601100_ref34","doi-asserted-by":"crossref","first-page":"788","DOI":"10.1093\/jamia\/ocw175","article-title":"Mining non-lattice subgraphs for detecting missing hierarchical relations and concepts in SNOMED CT","volume":"24","author":"Cui","year":"2017","journal-title":"J Am Med Inform Assoc"},{"key":"2022051813445601100_ref35","doi-asserted-by":"crossref","first-page":"177","DOI":"10.1016\/j.jbi.2017.12.010","article-title":"Auditing SNOMED CT hierarchical relations based on lexical features of concepts in non-lattice 
subgraphs","volume":"78","author":"Cui","year":"2018","journal-title":"J Biomed Inform"},{"key":"2022051813445601100_ref36","article-title":"Quality assurance of NCI Thesaurus by mining structural-lexical patterns. In: AMIA annual symposium proceedings 2017, p. 364","volume":"2017","author":"Abeysinghe","journal-title":"American Medical Informatics Association"},{"key":"2022051813445601100_ref37","first-page":"982","volume-title":"AMIA annual symposium proceedings","author":"Abeysinghe","year":"2019"},{"issue":"suppl_1","key":"2022051813445601100_ref38","doi-asserted-by":"crossref","first-page":"D344","DOI":"10.1093\/nar\/gkm791","article-title":"ChEBI: a database and ontology for chemical entities of biological interest","volume":"36","author":"Degtyarenko","year":"2007","journal-title":"Nucleic Acids Res"},{"key":"2022051813445601100_ref39","volume-title":"GO-ontology tracking system","author":"The Gene Ontology Consortium"},{"issue":"1","key":"2022051813445601100_ref40","doi-asserted-by":"crossref","first-page":"42","DOI":"10.1016\/j.tig.2015.10.007","article-title":"Lateral thinking: how histone modifications regulate gene expression","volume":"32","author":"Lawrence","year":"2016","journal-title":"Trends Genet"}],"container-title":["Briefings in 
Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bib\/article-pdf\/23\/3\/bbac122\/43745610\/bbac122.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bib\/article-pdf\/23\/3\/bbac122\/43745610\/bbac122.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,5,18]],"date-time":"2022-05-18T13:49:26Z","timestamp":1652881766000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bib\/article\/doi\/10.1093\/bib\/bbac122\/6567717"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,4,13]]},"references-count":40,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2022,5,13]]}},"URL":"https:\/\/doi.org\/10.1093\/bib\/bbac122","relation":{},"ISSN":["1467-5463","1477-4054"],"issn-type":[{"value":"1467-5463","type":"print"},{"value":"1477-4054","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2022,5]]},"published":{"date-parts":[[2022,4,13]]},"article-number":"bbac122"}}