{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,29]],"date-time":"2026-01-29T17:37:08Z","timestamp":1769708228591,"version":"3.49.0"},"reference-count":51,"publisher":"Frontiers Media SA","license":[{"start":{"date-parts":[[2026,1,29]],"date-time":"2026-01-29T00:00:00Z","timestamp":1769644800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100004263","name":"Funda\u00e7\u00e3o de Amparo \u00e0 Pesquisa do Estado do Rio Grande do Sul","doi-asserted-by":"publisher","award":["24\/2551-0001277-0"],"award-info":[{"award-number":["24\/2551-0001277-0"]}],"id":[{"id":"10.13039\/501100004263","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100002322","name":"Coordena\u00e7\u00e3o de Aperfei\u00e7oamento de Pessoal de N\u00edvel Superior","doi-asserted-by":"publisher","award":["001"],"award-info":[{"award-number":["001"]}],"id":[{"id":"10.13039\/501100002322","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["frontiersin.org"],"crossmark-restriction":true},"short-container-title":["Front. Bioinform."],"abstract":"<jats:p>Functional enrichment analysis (FEA) provides biological meaning from lists of differentially expressed genes and proteins obtained through omics experiments. FEA tools can employ numerous statistical methods and rely on different pathway databases. In this sense, Overrepresentation Analysis (ORA) is one of the most popular methods to perform FEA. Gene Ontology (GO) is arguably the most widely used pathway knowledgebase in FEA. Hence, benchmarking the biological accuracy of ORA-based GO enrichment tools is crucial. Nevertheless, benchmark studies in FEA tend to focus excessively on performance-based metrics rather than on the biological information contained in enrichment results. 
To identify the differences between popular ORA-based GO enrichment tools and provide data that brings insights into the tools\u2019 biological accuracy and, thus, better suits the application of FEA, we tested 12 popular GO enrichment tools (i.e., DAVID, PANTHER, WebGestalt, Enrichr, ShinyGO, limma, topGO, GOstats, clusterProfiler, g:Profiler, ClueGO, and BiNGO) with randomized datasets as negative controls, a target-oriented dataset and a hallmark dataset as positive controls, and an experiment-derived dataset. Gene sets with 500, 200, 100, and 50 genes were built for each dataset to investigate the impact of input sizes. Using the control datasets, we calculated the false positive rate (FPR) and accuracy of the tools based on the semantic similarity between the enriched terms and the target ontologies and assessed overlooked, insightful metrics that reflect the biological informativeness of the results, such as the specificity of enriched GO terms and the prioritization of target ontologies. Additionally, we clustered the FEA results based on term semantic similarity, enabling us to directly compare the biological profiles generated by each tool. Despite employing the same method and functional database, the tools\u2019 results diverged significantly. Our findings reveal considerable variation among tools in terms of informativeness and interpretability of results. Some tools demonstrated strong capabilities in prioritizing target pathways, while others struggled, especially as input size increased. Additionally, we observed that the degree to which the enriched ontologies are related to the expected targets varies across tools, with some being more conservative than others. 
Together, these results provide powerful insights into the performance characteristics of the analyzed GO enrichment tools and yield new, relevant data for benchmarking FEA tools.<\/jats:p>","DOI":"10.3389\/fbinf.2026.1755664","type":"journal-article","created":{"date-parts":[[2026,1,29]],"date-time":"2026-01-29T06:42:27Z","timestamp":1769668947000},"update-policy":"https:\/\/doi.org\/10.3389\/crossmark-policy","source":"Crossref","is-referenced-by-count":0,"title":["Benchmarking multiple gene ontology enrichment tools reveals high biological significance, ranking, and stringency heterogeneity among datasets"],"prefix":"10.3389","volume":"6","author":[{"given":"F\u00e1bio Henrique Schuster","family":"de Oliveira","sequence":"first","affiliation":[{"name":"Laboratory of DNA Repair and Aging, Department of Biophysics, Institute of Biosciences, Federal University of Rio Grande do Sul","place":["Porto Alegre, Brazil"]}]},{"given":"Felipe Acker","family":"Gomes","sequence":"additional","affiliation":[{"name":"Laboratory of DNA Repair and Aging, Department of Biophysics, Institute of Biosciences, Federal University of Rio Grande do Sul","place":["Porto Alegre, Brazil"]}]},{"given":"Bruno C\u00e9sar","family":"Feltes","sequence":"additional","affiliation":[{"name":"Laboratory of DNA Repair and Aging, Department of Biophysics, Institute of Biosciences, Federal University of Rio Grande do Sul","place":["Porto Alegre, Brazil"]}]}],"member":"1965","published-online":{"date-parts":[[2026,1,29]]},"reference":[{"key":"B1","doi-asserted-by":"publisher","first-page":"D679","DOI":"10.1093\/NAR\/GKAD960","article-title":"WikiPathways 2024: next generation pathway database","volume":"52","author":"Agrawal","year":"2024","journal-title":"Nucleic Acids Res."},{"key":"B2","doi-asserted-by":"publisher","DOI":"10.18129\/B9.bioc.topGO","article-title":"topGO: enrichment analysis for gene 
ontology","author":"Alexa","year":"2024"},{"key":"B3","doi-asserted-by":"publisher","first-page":"25","DOI":"10.1038\/75556","article-title":"Gene ontology: tool for the unification of biology","volume":"25","author":"Ashburner","year":"2000","journal-title":"Nat. Genet."},{"key":"B4","doi-asserted-by":"publisher","DOI":"10.1186\/s12859-015-0751-5","article-title":"Comparative study on gene set and pathway topology-based enrichment methods","volume":"16","author":"Bayerlov\u00e1","year":"2015","journal-title":"BMC Bioinforma."},{"key":"B5","doi-asserted-by":"crossref","DOI":"10.1111\/j.2517-6161.1995.tb02031.x","article-title":"Controlling the false discovery rate: a practical and powerful approach to multiple","author":"Benjamini","year":"1995"},{"key":"B6","doi-asserted-by":"publisher","first-page":"1091","DOI":"10.1093\/BIOINFORMATICS\/BTP101","article-title":"ClueGO: a cytoscape plug-in to decipher functionally grouped gene ontology and pathway annotation networks","volume":"25","author":"Bindea","year":"2009","journal-title":"Bioinformatics"},{"key":"B7","doi-asserted-by":"publisher","first-page":"488","DOI":"10.1186\/1471-2105-7-488","article-title":"Evaluation of clustering algorithms for protein-protein interaction networks","volume":"7","author":"Broh\u00e9e","year":"2006","journal-title":"BMC Bioinforma."},{"key":"B8","doi-asserted-by":"publisher","first-page":"bbae069","DOI":"10.1093\/bib\/bbae069","article-title":"Benchmarking enrichment analysis methods with the disease pathway network","volume":"25","author":"Buzzao","year":"2024","journal-title":"Brief. 
Bioinform"},{"key":"B9","doi-asserted-by":"publisher","first-page":"128","DOI":"10.1186\/1471-2105-14-128","article-title":"Enrichr: interactive and collaborative HTML5 gene list enrichment analysis tool","volume":"14","author":"Chen","year":"2013","journal-title":"BMC Bioinforma."},{"key":"B10","doi-asserted-by":"publisher","first-page":"iyad031","DOI":"10.1093\/GENETICS\/IYAD031","article-title":"The gene ontology knowledgebase in 2023","volume":"224","author":"Consortium","year":"2023","journal-title":"Genetics"},{"key":"B11","doi-asserted-by":"publisher","first-page":"18871","DOI":"10.1038\/srep18871","article-title":"LEGO: a novel method for gene set over-representation analysis by incorporating network-based gene weights","volume":"6","author":"Dong","year":"2016","journal-title":"Sci. Rep."},{"key":"B12","doi-asserted-by":"publisher","first-page":"3439","DOI":"10.1093\/BIOINFORMATICS\/BTI525","article-title":"BioMart and bioconductor: a powerful link between biological databases and microarray data analysis","volume":"21","author":"Durinck","year":"2005","journal-title":"Bioinformatics"},{"key":"B13","doi-asserted-by":"publisher","first-page":"1184","DOI":"10.1038\/nprot.2009.97","article-title":"Mapping identifiers for the integration of genomic datasets with the R\/Bioconductor package biomaRt","volume":"4","author":"Durinck","year":"2009","journal-title":"Nat. 
Protoc."},{"key":"B14","doi-asserted-by":"publisher","first-page":"W415","DOI":"10.1093\/NAR\/GKAE456","article-title":"WebGestalt 2024: faster gene set analysis and new support for metabolomics and multi-omics","volume":"52","author":"Elizarraras","year":"2024","journal-title":"Nucleic Acids Res."},{"key":"B15","doi-asserted-by":"publisher","first-page":"257","DOI":"10.1093\/BIOINFORMATICS\/BTL567","article-title":"Using GOstats to test gene lists for GO term association","volume":"23","author":"Falcon","year":"2007","journal-title":"Bioinformatics"},{"key":"B16","doi-asserted-by":"publisher","first-page":"2628","DOI":"10.1093\/BIOINFORMATICS\/BTZ931","article-title":"ShinyGO: a graphical gene-set enrichment tool for animals and plants","volume":"36","author":"Ge","year":"2020","journal-title":"Bioinformatics"},{"key":"B17","doi-asserted-by":"publisher","first-page":"545","DOI":"10.1093\/bib\/bbz158","article-title":"Toward a gold standard for benchmarking gene set enrichment analysis","volume":"22","author":"Geistlinger","year":"2021","journal-title":"Brief. Bioinform"},{"key":"B18","doi-asserted-by":"publisher","first-page":"44","DOI":"10.1038\/NPROT.2008.211","article-title":"Systematic and integrative analysis of large gene lists using DAVID bioinformatics resources","volume":"4","author":"Huang","year":"2009","journal-title":"Nat. Protoc."},{"key":"B19","doi-asserted-by":"publisher","first-page":"281","DOI":"10.1093\/bib\/bbr049","article-title":"Gene set enrichment analysis: performance evaluation and usage guidelines","volume":"13","author":"Hung","year":"2012","journal-title":"Brief. Bioinform"},{"key":"B49","doi-asserted-by":"publisher","first-page":"267","DOI":"10.1186\/1756-0500-4-267","article-title":"GO trimming: systematically reducing redundancy in large Gene Ontology datasets","volume":"4","author":"Jantzen","year":"2011","journal-title":"BMC Res. 
Notes"},{"key":"B20","doi-asserted-by":"publisher","first-page":"354","DOI":"10.1016\/j.ijmedinf.2007.07.004","article-title":"Literature-based concept profiles for gene annotation: the issue of weighting","volume":"77","author":"Jelier","year":"2008","journal-title":"Int. J. Med. Inf."},{"key":"B21","doi-asserted-by":"publisher","first-page":"27","DOI":"10.1093\/NAR\/28.1.27","article-title":"KEGG: kyoto encyclopedia of genes and genomes","volume":"28","author":"Kanehisa","year":"2000","journal-title":"Nucleic Acids Res."},{"key":"B22","doi-asserted-by":"publisher","first-page":"D672","DOI":"10.1093\/NAR\/GKAE909","article-title":"KEGG: biological systems database as a model of the real world","volume":"53","author":"Kanehisa","year":"2025","journal-title":"Nucleic Acids Res."},{"key":"B23","doi-asserted-by":"publisher","first-page":"10872","DOI":"10.1038\/s41598-018-28948-z","article-title":"GOATOOLS: a python library for gene ontology analyses","volume":"8","author":"Klopfenstein","year":"2018","journal-title":"Sci. 
Rep."},{"key":"B24","doi-asserted-by":"publisher","first-page":"W207","DOI":"10.1093\/NAR\/GKAD347","article-title":"g:Profiler\u2014interoperable web service for functional enrichment analysis and gene identifier mapping (2023 update)","volume":"51","author":"Kolberg","year":"2023","journal-title":"Nucleic Acids Res."},{"key":"B25","doi-asserted-by":"publisher","first-page":"426","DOI":"10.1186\/1471-2105-7-426","article-title":"Grouping gene ontology terms to improve the assessment of gene set enrichment in microarray data","volume":"7","author":"Lewin","year":"2006","journal-title":"BMC Bioinforma."},{"key":"B26","doi-asserted-by":"publisher","first-page":"1739","DOI":"10.1093\/BIOINFORMATICS\/BTR260","article-title":"Molecular signatures database (MSigDB) 3.0","volume":"27","author":"Liberzon","year":"2011","journal-title":"Bioinformatics"},{"key":"B27","doi-asserted-by":"publisher","first-page":"417","DOI":"10.1016\/J.CELS.2015.12.004","article-title":"The molecular signatures database (MSigDB) hallmark gene set collection","volume":"1","author":"Liberzon","year":"2015","journal-title":"Cell Syst."},{"key":"B28","doi-asserted-by":"publisher","first-page":"36","DOI":"10.1093\/bib\/bby097","article-title":"Comprehensive and critical evaluation of individualized pathway activity measurement tools on pan-cancer data","volume":"21","author":"Lim","year":"2018","journal-title":"Brief. 
Bioinform"},{"key":"B29","doi-asserted-by":"publisher","first-page":"381","DOI":"10.1186\/s12859-019-2856-8","article-title":"PS-MCL: parallel shotgun coarsened Markov clustering of protein interaction networks","volume":"20","author":"Lim","year":"2019","journal-title":"BMC Bioinforma."},{"key":"B30","doi-asserted-by":"publisher","first-page":"238","DOI":"10.4056\/sigs.561626","article-title":"Quantifying protein function specificity in the gene ontology","volume":"2","author":"Louie","year":"2010","journal-title":"Stand Genomic Sci."},{"key":"B51","doi-asserted-by":"crossref","first-page":"3448","DOI":"10.1093\/bioinformatics\/bti551","article-title":"BiNGO: a cytoscape plugin to assess overrepresentation of gene ontology categories in biological networks","volume":"21","author":"Maere","year":"2005","journal-title":"Bioinformatics"},{"key":"B31","doi-asserted-by":"publisher","first-page":"703","DOI":"10.1038\/s41596-019-0128-8","article-title":"Protocol update for large-scale genome and gene function analysis with the PANTHER classification system (v.14.0)","volume":"14","author":"Mi","year":"2019","journal-title":"Nat. Protoc."},{"key":"B32","doi-asserted-by":"publisher","first-page":"D672","DOI":"10.1093\/NAR\/GKAD1025","article-title":"The reactome pathway knowledgebase 2024","volume":"52","author":"Milacic","year":"2024","journal-title":"Nucleic Acids Res."},{"key":"B33","doi-asserted-by":"publisher","first-page":"203","DOI":"10.1186\/s13059-019-1790-4","article-title":"Identifying significantly impacted pathways: a comprehensive review and assessment","volume":"20","author":"Nguyen","year":"2019","journal-title":"Genome Biol."},{"key":"B34","doi-asserted-by":"publisher","first-page":"e20210077","DOI":"10.1590\/1678-4685-GMB-2021-0077","article-title":"Gene expression analysis platform (GEAP): a highly customizable, fast, versatile and ready-to-use microarray analysis platform","volume":"45","author":"Nunes","year":"2022","journal-title":"Genet. Mol. 
Biol."},{"key":"B50","doi-asserted-by":"publisher","DOI":"10.1186\/s12859-022-04828-2","article-title":"Orsum: a python package for filtering and comparing enrichment analyses using a simple principle","volume":"23","author":"Ozisik","year":"2022","journal-title":"BMC Bioinform."},{"key":"B35","doi-asserted-by":"publisher","first-page":"e47","DOI":"10.1093\/NAR\/GKV007","article-title":"Limma powers differential expression analyses for RNA-sequencing and microarray studies","volume":"43","author":"Ritchie","year":"2015","journal-title":"Nucleic Acids Res."},{"key":"B48","doi-asserted-by":"crossref","first-page":"355","DOI":"10.1002\/ijc.25704","article-title":"Gene expression profiling reveals novel biomarkers in nonsmall cell lung cancer","volume":"129","author":"Sanchez-Palencia","year":"2011","journal-title":"Int. J. Cancer"},{"key":"B36","first-page":"247","article-title":"Markov clustering of protein interaction networks with improved balance and scalability","author":"Satuluri","year":"2010"},{"key":"B37","doi-asserted-by":"publisher","first-page":"W216","DOI":"10.1093\/nar\/gkac194","article-title":"DAVID: a web server for functional enrichment analysis and functional annotation of gene lists (2021 update)","volume":"50","author":"Sherman","year":"2022","journal-title":"Nucleic Acids Res."},{"key":"B38","doi-asserted-by":"publisher","first-page":"15545","DOI":"10.1073\/PNAS.0506580102","article-title":"Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles","volume":"102","author":"Subramanian","year":"2005","journal-title":"Proc. Natl. Acad. Sci. U. S. 
A."},{"key":"B39","doi-asserted-by":"publisher","first-page":"e79217","DOI":"10.1371\/journal.pone.0079217","article-title":"A comparison of gene set analysis methods in terms of sensitivity, prioritization and specificity","volume":"8","author":"Tarca","year":"2013","journal-title":"PLoS One"},{"key":"B40","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1038\/s41598-018-23395-2","article-title":"Interpretation of biological experiments changes with evolution of the gene ontology and its annotations","volume":"8","author":"Tomczak","year":"2018","journal-title":"Sci. Rep. 2018"},{"key":"B41","doi-asserted-by":"publisher","first-page":"121","DOI":"10.1137\/040608635","article-title":"Graph clustering via a discrete uncoupling process","volume":"30","author":"Van Dongen","year":"2008","journal-title":"SIAM J. Matrix Analysis Appl."},{"key":"B42","doi-asserted-by":"publisher","first-page":"1274","DOI":"10.1093\/bioinformatics\/btm087","article-title":"A new method to measure the semantic similarity of GO terms","volume":"23","author":"Wang","year":"2007","journal-title":"Bioinformatics"},{"key":"B43","doi-asserted-by":"publisher","first-page":"e1009935","DOI":"10.1371\/journal.pcbi.1009935","article-title":"Urgent need for consistent standards in functional enrichment analysis","volume":"18","author":"Wijesooriya","year":"2022","journal-title":"PLoS Comput. Biol."},{"key":"B44","doi-asserted-by":"publisher","first-page":"69","DOI":"10.1186\/s12859-022-04594-1","article-title":"MonaGO: a novel gene ontology enrichment analysis visualisation system","volume":"23","author":"Xin","year":"2022","journal-title":"BMC Bioinforma."},{"key":"B45","doi-asserted-by":"publisher","first-page":"3292","DOI":"10.1038\/s41596-024-01020-z","article-title":"Using clusterProfiler to characterize multiomics data","volume":"19","author":"Xu","year":"2024","journal-title":"Nat. 
Protoc."},{"key":"B46","doi-asserted-by":"publisher","first-page":"vbae159","DOI":"10.1093\/BIOADV\/VBAE159","article-title":"Two subtle problems with overrepresentation analysis","volume":"4","author":"Ziemann","year":"2024","journal-title":"Bioinforma. Adv."},{"key":"B47","doi-asserted-by":"publisher","first-page":"5146","DOI":"10.1093\/bioinformatics\/btz447","article-title":"Gene set enrichment for reproducible science: comparison of CERNO and eight other algorithms","volume":"35","author":"Zyla","year":"2019","journal-title":"Bioinformatics"}],"container-title":["Frontiers in Bioinformatics"],"original-title":[],"link":[{"URL":"https:\/\/www.frontiersin.org\/articles\/10.3389\/fbinf.2026.1755664\/full","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,1,29]],"date-time":"2026-01-29T06:42:30Z","timestamp":1769668950000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.frontiersin.org\/articles\/10.3389\/fbinf.2026.1755664\/full"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2026,1,29]]},"references-count":51,"alternative-id":["10.3389\/fbinf.2026.1755664"],"URL":"https:\/\/doi.org\/10.3389\/fbinf.2026.1755664","relation":{},"ISSN":["2673-7647"],"issn-type":[{"value":"2673-7647","type":"electronic"}],"subject":[],"published":{"date-parts":[[2026,1,29]]},"article-number":"1755664"}}