{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,8]],"date-time":"2026-01-08T04:12:15Z","timestamp":1767845535047,"version":"3.49.0"},"reference-count":16,"publisher":"Oxford University Press (OUP)","issue":"9","license":[{"start":{"date-parts":[[2016,10,2]],"date-time":"2016-10-02T00:00:00Z","timestamp":1475366400000},"content-version":"vor","delay-in-days":2757,"URL":"http:\/\/creativecommons.org\/licenses\/by-nc\/2.0\/uk\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2009,5,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>Motivation: The growing availability of genome-scale datasets has attracted increasing attention to the development of computational methods for automated inference of functional similarities among genes and their products. One class of such methods measures the functional similarity of genes based on their distance in the Gene Ontology (GO). To measure the functional relatedness of a gene set, these measures consider every pair of genes in the set, and the average of all pairwise distances is calculated. However, as more data becomes available and gene sets used for analysis become larger, such pair-based calculation becomes prohibitive.<\/jats:p>\n               <jats:p>Results: In this article, we propose GS2 (GO-based similarity of gene sets), a novel GO-based measure of gene set similarity that is computable in linear time in the size of the gene set. The measure quantifies the similarity of the GO annotations among a set of genes by averaging the contribution of each gene's GO terms and their ancestor terms with respect to the GO vocabulary graph. To study the performance of our method, we compared our measure with an established pair-based measure when run on gene sets with varying degrees of functional similarities. In addition to a significant speed improvement, our method produced comparable similarity scores to the established method. Our method is available as a web-based tool and an open-source Python library.<\/jats:p>\n               <jats:p>Availability: The web-based tools and Python code are available at: http:\/\/bioserver.cs.rice.edu\/gs2.<\/jats:p>\n               <jats:p>Contact: \u00a0troy.ruths@rice.edu<\/jats:p>","DOI":"10.1093\/bioinformatics\/btp128","type":"journal-article","created":{"date-parts":[[2009,3,17]],"date-time":"2009-03-17T00:34:13Z","timestamp":1237250053000},"page":"1178-1184","source":"Crossref","is-referenced-by-count":36,"title":["GS2: an efficiently computable measure of GO-based similarity of gene sets"],"prefix":"10.1093","volume":"25","author":[{"given":"Troy","family":"Ruths","sequence":"first","affiliation":[{"name":"Department of Computer Science, Rice University, 6100 Main Street, MS 132, Houston, TX, USA"}]},{"given":"Derek","family":"Ruths","sequence":"additional","affiliation":[{"name":"Department of Computer Science, Rice University, 6100 Main Street, MS 132, Houston, TX, USA"}]},{"given":"Luay","family":"Nakhleh","sequence":"additional","affiliation":[{"name":"Department of Computer Science, Rice University, 6100 Main Street, MS 132, Houston, TX, USA"}]}],"member":"286","published-online":{"date-parts":[[2009,3,16]]},"reference":[{"key":"2023013110280427300_B1","doi-asserted-by":"crossref","first-page":"25","DOI":"10.1038\/75556","article-title":"Gene ontology: tool for the unification of biology","volume":"25","author":"Ashburner","year":"2000","journal-title":"Nat. Genet."},{"key":"2023013110280427300_B2","doi-asserted-by":"crossref","first-page":"470","DOI":"10.1186\/1471-2105-7-470","article-title":"Genetools\u2014application for functional annotation and statistical hypothesis testing","volume":"7","author":"Beisvag","year":"2006","journal-title":"BMC Bioinformatics"},{"key":"2023013110280427300_B3","doi-asserted-by":"crossref","first-page":"W169","DOI":"10.1093\/nar\/gkm415","article-title":"David bioinformatics resources: expanded annotation database and novel algorithms to better extract biology from large gene lists","volume":"35","author":"Huang","year":"2007","journal-title":"Nucleic Acids Res."},{"key":"2023013110280427300_B4","article-title":"Semantic similarity based on corpus statistics and lexical taxonomy","volume-title":"Proceedings of International Conference Research on Computational Linguistics (ROCLING X).","author":"Jiang","year":"1997"},{"key":"2023013110280427300_B5","doi-asserted-by":"crossref","first-page":"27","DOI":"10.1093\/nar\/28.1.27","article-title":"KEGG: Kyoto encyclopedia of genes and genomes","volume":"28","author":"Kanehisa","year":"2000","journal-title":"Nucleic Acids Res."},{"key":"2023013110280427300_B6","doi-asserted-by":"crossref","first-page":"375","DOI":"10.1016\/S0168-9525(97)01223-7","article-title":"A database for post-genome analysis","volume":"13","author":"Kanehisa","year":"1997","journal-title":"Trends Genet."},{"key":"2023013110280427300_B7","doi-asserted-by":"crossref","first-page":"3416","DOI":"10.1093\/bioinformatics\/bti538","article-title":"A semantic analysis of the annotations of the human genome","volume":"21","author":"Khatri","year":"2005","journal-title":"Bioinformatics"},{"key":"2023013110280427300_B8","doi-asserted-by":"crossref","first-page":"1929","DOI":"10.1126\/science.1132939","article-title":"The connectivity map: using gene-expression signatures to connect small molecules, genes, and disease","volume":"313","author":"Lamb","year":"2006","journal-title":"Science"},{"key":"2023013110280427300_B9","doi-asserted-by":"crossref","first-page":"168","DOI":"10.1038\/nature05453","article-title":"Genome-wide atlas of gene expression in the adult mouse brain","volume":"445","author":"Lein","year":"2007","journal-title":"Nature"},{"key":"2023013110280427300_B10","first-page":"296","article-title":"An information-theoretic definition of similarity, semantic similarity based on corpus statistics and lexical taxonomy","volume-title":"Fifteenth International Conference on Machine Learning.","author":"Lin","year":"1998"},{"key":"2023013110280427300_B11","doi-asserted-by":"crossref","first-page":"3448","DOI":"10.1093\/bioinformatics\/bti551","article-title":"Bingo: a cytoscape plugin to assess overrepresentation of gene ontology categories in biological networks","volume":"21","author":"Maere","year":"2005","journal-title":"Bioinformatics"},{"key":"2023013110280427300_B12","doi-asserted-by":"crossref","first-page":"93","DOI":"10.1613\/jair.514","article-title":"Semantic similarity in a taxonomy: an information-based measure and its application to problems of ambiguity in natural language","volume":"11","author":"Resnik","year":"1999","journal-title":"J. Artif. Intell. Res."},{"key":"2023013110280427300_B13","doi-asserted-by":"crossref","first-page":"330","DOI":"10.1109\/TCBB.2005.50","article-title":"Correlation between gene expression and go semantic similarity","volume":"2","author":"Sevilla","year":"2005","journal-title":"IEEE\/ACM Trans. Comput. Biol. Bioinform."},{"key":"2023013110280427300_B14","doi-asserted-by":"crossref","first-page":"4465","DOI":"10.1073\/pnas.012025199","article-title":"Large-scale analysis of the human and mouse transcriptomes","volume":"99","author":"Su","year":"2002","journal-title":"Proc. Natl Acad. Sci."},{"key":"2023013110280427300_B15","doi-asserted-by":"crossref","first-page":"1274","DOI":"10.1093\/bioinformatics\/btm087","article-title":"A new method to measure the semantic similarity of go terms","volume":"23","author":"Wang","year":"2007","journal-title":"Bioinformatics"},{"key":"2023013110280427300_B16","doi-asserted-by":"crossref","first-page":"16","DOI":"10.1186\/1471-2105-5-16","article-title":"GOTree Machine (GOTM): a web-based platform for interpreting sets of interesting genes using gene ontology hierarchies","volume":"5","author":"Zhang","year":"2004","journal-title":"BMC Bioinformatics"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/25\/9\/1178\/48983911\/bioinformatics_25_9_1178.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/25\/9\/1178\/48983911\/bioinformatics_25_9_1178.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,1,31]],"date-time":"2023-01-31T20:33:48Z","timestamp":1675197228000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/25\/9\/1178\/204335"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2009,3,16]]},"references-count":16,"journal-issue":{"issue":"9","published-print":{"date-parts":[[2009,5,1]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btp128","relation":{},"ISSN":["1367-4811","1367-4803"],"issn-type":[{"value":"1367-4811","type":"electronic"},{"value":"1367-4803","type":"print"}],"subject":[],"published-other":{"date-parts":[[2009,5,1]]},"published":{"date-parts":[[2009,3,16]]}}}