{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,26]],"date-time":"2026-02-26T20:34:17Z","timestamp":1772138057632,"version":"3.50.1"},"reference-count":12,"publisher":"Oxford University Press (OUP)","issue":"2","license":[{"start":{"date-parts":[[2023,2,14]],"date-time":"2023-02-14T00:00:00Z","timestamp":1676332800000},"content-version":"vor","delay-in-days":13,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/100000001","name":"National Science Foundation","doi-asserted-by":"publisher","award":["17284077"],"award-info":[{"award-number":["17284077"]}],"id":[{"id":"10.13039\/100000001","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2023,2,3]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:sec>\n                    <jats:title>Motivation<\/jats:title>\n                    <jats:p>Gene annotation is the problem of mapping proteins to their functions represented as Gene Ontology (GO) terms, typically inferred based on the primary sequences. Gene annotation is a multi-label multi-class classification problem, which has generated growing interest for its uses in the characterization of millions of proteins with unknown functions. However, there is no standard GO dataset used for benchmarking the newly developed new machine learning models within the bioinformatics community. Thus, the significance of improvements for these models remains unclear.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Results<\/jats:title>\n                    <jats:p>The Gene Benchmarking database is the first effort to provide an easy-to-use and configurable hub for the learning and evaluation of gene annotation models. It provides easy access to pre-specified datasets and takes the non-trivial steps of preprocessing and filtering all data according to custom presets using a web interface. The GO bench web application can also be used to evaluate and display any trained model on leaderboards for annotation tasks.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Availability and implementation<\/jats:title>\n                    <jats:p>The GO Benchmarking dataset is freely available at www.gobench.org. Code is hosted at github.com\/mofradlab, with repositories for website code, core utilities and examples of usage (Supplementary Section S.7).<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Supplementary information<\/jats:title>\n                    <jats:p>Supplementary data are available at Bioinformatics online.<\/jats:p>\n                  <\/jats:sec>","DOI":"10.1093\/bioinformatics\/btad081","type":"journal-article","created":{"date-parts":[[2023,2,13]],"date-time":"2023-02-13T18:23:59Z","timestamp":1676312639000},"source":"Crossref","is-referenced-by-count":3,"title":["GO Bench: shared hub for universal benchmarking of machine learning-based protein functional annotations"],"prefix":"10.1093","volume":"39","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-1146-6346","authenticated-orcid":false,"given":"Andrew","family":"Dickson","sequence":"first","affiliation":[{"name":"Molecular Cell Biomechanics Laboratory, Departments of Bioengineering and Mechanical Engineering, University of California , Berkeley, CA 94720, USA"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-6518-7238","authenticated-orcid":false,"given":"Ehsaneddin","family":"Asgari","sequence":"additional","affiliation":[{"name":"Molecular Cell Biomechanics Laboratory, Departments of Bioengineering and Mechanical Engineering, University of California , Berkeley, CA 94720, USA"},{"name":"Computational Biology of Infection Research, Helmholtz Centre for Infection Research , 38124 Brunswick, Germany"}]},{"given":"Alice C","family":"McHardy","sequence":"additional","affiliation":[{"name":"Computational Biology of Infection Research, Helmholtz Centre for Infection Research , 38124 Brunswick, Germany"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-7004-4859","authenticated-orcid":false,"given":"Mohammad R K","family":"Mofrad","sequence":"additional","affiliation":[{"name":"Molecular Cell Biomechanics Laboratory, Departments of Bioengineering and Mechanical Engineering, University of California , Berkeley, CA 94720, USA"}]}],"member":"286","published-online":{"date-parts":[[2023,2,14]]},"reference":[{"key":"2023042616104228200_btad081-B1","doi-asserted-by":"crossref","DOI":"10.1371\/journal.pone.0141287","article-title":"Continuous distributed representation of biological sequences for deep proteomics and genomics","volume":"10","author":"Asgari","year":"2015","journal-title":"PLoS One"},{"key":"2023042616104228200_btad081-B2","doi-asserted-by":"crossref","first-page":"25","DOI":"10.1038\/75556","article-title":"Gene ontology: tool for the unification of biology. The Gene Ontology Consortium","volume":"25","author":"Ashburner","year":"2000","journal-title":"Nat. Genet"},{"key":"2023042616104228200_btad081-B3","doi-asserted-by":"crossref","first-page":"719","DOI":"10.1007\/s10994-020-05877-5","article-title":"Learning from positive and unlabeled data: a survey","volume":"109","author":"Bekker","year":"2020","journal-title":"Mach. Learn"},{"key":"2023042616104228200_btad081-B4","doi-asserted-by":"crossref","first-page":"245","DOI":"10.1007\/978-1-4939-3743-1_18","volume-title":"The Gene Ontology Handbook","author":"Chibucos","year":"2017"},{"key":"2023042616104228200_btad081-B5","doi-asserted-by":"crossref","first-page":"i53","DOI":"10.1093\/bioinformatics\/btt228","article-title":"Information-theoretic evaluation of predicted ontological annotations","volume":"29","author":"Clark","year":"2013","journal-title":"Bioinformatics (Oxford, England)"},{"key":"2023042616104228200_btad081-B6","doi-asserted-by":"crossref","first-page":"2503","DOI":"10.1101\/gr.3152604","article-title":"EAnnot: a genome annotation tool using experimental evidence","volume":"14","author":"Ding","year":"2004","journal-title":"Genome Res"},{"key":"2023042616104228200_btad081-B7","doi-asserted-by":"crossref","first-page":"2446","DOI":"10.1093\/nar\/gkz030","article-title":"The y-ome defines the 35% of Escherichia coli genes that lack experimental evidence of function","volume":"47","author":"Ghatak","year":"2019","journal-title":"Nucleic Acids Res"},{"key":"2023042616104228200_btad081-B8","doi-asserted-by":"crossref","first-page":"D1057","DOI":"10.1093\/nar\/gku1113","article-title":"The GOA database: gene ontology annotation updates for 2015","volume":"43","author":"Huntley","year":"2015","journal-title":"Nucleic Acids Res"},{"key":"2023042616104228200_btad081-B9","doi-asserted-by":"crossref","first-page":"422","DOI":"10.1093\/bioinformatics\/btz595","article-title":"DeepGOPlus: improved protein function prediction from sequence","volume":"36","author":"Kulmanov","year":"2020","journal-title":"Bioinformatics"},{"key":"2023042616104228200_btad081-B11","doi-asserted-by":"crossref","first-page":"D480","DOI":"10.1093\/nar\/gkaa1100","article-title":"UniProt: the universal protein knowledgebase in 2021","volume":"49","author":"The UniProt Consortium","year":"2020","journal-title":"Nucleic Acids Res"},{"key":"2023042616104228200_btad081-B12","doi-asserted-by":"crossref","first-page":"i210","DOI":"10.1093\/bioinformatics\/btaa466","article-title":"Benchmarking gene ontology function predictions using negative annotations","volume":"36","author":"Warwick Vesztrocy","year":"2020","journal-title":"Bioinformatics (Oxford, England)"},{"key":"2023042616104228200_btad081-B13","doi-asserted-by":"crossref","first-page":"244","DOI":"10.1186\/s13059-019-1835-8","article-title":"The CAFA challenge reports improved protein function prediction and new functional annotations for hundreds of genes through experimental screens","volume":"20","author":"Zhou","year":"2019","journal-title":"Genome Biol"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/advance-article-pdf\/doi\/10.1093\/bioinformatics\/btad081\/49177347\/btad081.pdf","content-type":"application\/pdf","content-version":"am","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/39\/2\/btad081\/50100014\/btad081.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/39\/2\/btad081\/50100014\/btad081.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,4,26]],"date-time":"2023-04-26T12:20:15Z","timestamp":1682511615000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/doi\/10.1093\/bioinformatics\/btad081\/7036335"}},"subtitle":[],"editor":[{"given":"Lenore","family":"Cowen","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2023,2,1]]},"references-count":12,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2023,2,3]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btad081","relation":{"has-preprint":[{"id-type":"doi","id":"10.1101\/2022.07.19.500685","asserted-by":"object"}]},"ISSN":["1367-4811"],"issn-type":[{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2023,2,1]]},"published":{"date-parts":[[2023,2,1]]},"article-number":"btad081"}}