{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,12]],"date-time":"2026-03-12T00:28:13Z","timestamp":1773275293289,"version":"3.50.1"},"reference-count":29,"publisher":"Oxford University Press (OUP)","issue":"14","license":[{"start":{"date-parts":[[2016,10,2]],"date-time":"2016-10-02T00:00:00Z","timestamp":1475366400000},"content-version":"vor","delay-in-days":2678,"URL":"http:\/\/creativecommons.org\/licenses\/by-nc\/2.0\/uk\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2009,7,15]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>Motivation: There is a growing interest in improving the cluster analysis of expression data by incorporating into it prior knowledge, such as the Gene Ontology (GO) annotations of genes, in order to improve the biological relevance of the clusters that are subjected to subsequent scrutiny. The structure of the GO is another source of background knowledge that can be exploited through the use of semantic similarity.<\/jats:p>\n               <jats:p>Results: We propose here a novel algorithm that integrates semantic similarities (derived from the ontology structure) into the procedure of deriving clusters from the dendrogram constructed during expression-based hierarchical clustering. Our approach can handle the multiple annotations, from different levels of the GO hierarchy, which most genes have. Moreover, it treats annotated and unannotated genes in a uniform manner. Consequently, the clusters obtained by our algorithm are characterized by significantly enriched annotations. In both cross-validation tests and when using an external index such as protein\u2013protein interactions, our algorithm performs better than previous approaches. 
When applied to human cancer expression data, our algorithm identifies, among others, clusters of genes related to immune response and glucose metabolism. These clusters are also supported by protein\u2013protein interaction data.<\/jats:p>\n               <jats:p>Contact: \u00a0dotna@cs.bgu.ac.il<\/jats:p>\n               <jats:p>Supplementary information: \u00a0Supplementary data are available at Bioinformatics online.<\/jats:p>","DOI":"10.1093\/bioinformatics\/btp327","type":"journal-article","created":{"date-parts":[[2009,6,5]],"date-time":"2009-06-05T00:24:32Z","timestamp":1244161472000},"page":"1789-1795","source":"Crossref","is-referenced-by-count":20,"title":["Seeing the forest for the trees: using the Gene Ontology to restructure hierarchical clustering"],"prefix":"10.1093","volume":"25","author":[{"given":"Dikla","family":"Dotan-Cohen","sequence":"first","affiliation":[{"name":"1 Department of Computer Science, Ben-Gurion University, Beer Sheva, Israel 84105, 2 Department of Biomedical Engineering, 3 Center for Advanced Genomic Technology, 4 Bioinformatics Program, Boston University, MA 02215 and 5 Children's Hospital Boston, Harvard\/MIT Program in Health Sciences and Technology, 300 Longwood Avenue, Boston, MA 02115, USA"}]},{"given":"Simon","family":"Kasif","sequence":"additional","affiliation":[{"name":"1 Department of Computer Science, Ben-Gurion University, Beer Sheva, Israel 84105, 2 Department of Biomedical Engineering, 3 Center for Advanced Genomic Technology, 4 Bioinformatics Program, Boston University, MA 02215 and 5 Children's Hospital Boston, Harvard\/MIT Program in Health Sciences and Technology, 300 Longwood Avenue, Boston, MA 02115, USA"}]},{"given":"Avraham A.","family":"Melkman","sequence":"additional","affiliation":[{"name":"1 Department of Computer Science, Ben-Gurion University, Beer Sheva, Israel 84105, 2 Department of Biomedical Engineering, 3 Center for Advanced Genomic Technology, 4 Bioinformatics Program, Boston University, MA 02215 and 5 Children's Hospital Boston, Harvard\/MIT Program in Health Sciences and Technology, 300 Longwood Avenue, Boston, MA 02115, USA"}]}],"member":"286","published-online":{"date-parts":[[2009,6,3]]},"reference":[{"key":"2023013112044846800_B1","doi-asserted-by":"crossref","first-page":"787","DOI":"10.1016\/j.jbi.2007.06.005","article-title":"Towards knowledge-based gene expression data mining","volume":"6","author":"Bellazzi","year":"2007","journal-title":"J. Biomed. 
Inform."},{"key":"2023013112044846800_B2","doi-asserted-by":"crossref","first-page":"3266","DOI":"10.1093\/bioinformatics\/bth362","article-title":"The CRASSS plug-in for integrating annotation data with hierarchical clustering results","volume":"20","author":"Buehler","year":"2004","journal-title":"Bioinformatics"},{"key":"2023013112044846800_B3","doi-asserted-by":"crossref","first-page":"687","DOI":"10.1081\/BIP-200025659","article-title":"A knowledge-based clustering algorithm driven by Gene Ontology","volume":"14","author":"Cheng","year":"2004","journal-title":"J. Biopharm. Stat."},{"key":"2023013112044846800_B4","doi-asserted-by":"crossref","first-page":"255","DOI":"10.1038\/nri2056","article-title":"Siglecs and their roles in the immune system","volume":"7","author":"Crocker","year":"2007","journal-title":"Nat. Rev. Immunol."},{"key":"2023013112044846800_B5","doi-asserted-by":"crossref","first-page":"429","DOI":"10.1016\/j.tibtech.2005.05.011","article-title":"Pathways to the analysis of microarray data","volume":"23","author":"Curtis","year":"2005","journal-title":"Trends Biotechnol."},{"key":"2023013112044846800_B6","doi-asserted-by":"crossref","first-page":"151","DOI":"10.1186\/1471-2105-7-151","article-title":"GOurmet: a tool for quantitative comparison and visualization of gene expression profiles based on gene ontology (GO) distributions","volume":"7","author":"Doherty","journal-title":"BMC Bioinformatics"},{"key":"2023013112044846800_B7","doi-asserted-by":"crossref","first-page":"3335","DOI":"10.1093\/bioinformatics\/btm526","article-title":"Hierarchical tree snipping: clustering guided by prior knowledge","volume":"23","author":"Dotan-Cohen","year":"2007","journal-title":"Bioinformatics"},{"key":"2023013112044846800_B8","doi-asserted-by":"crossref","first-page":"401","DOI":"10.1016\/j.jbi.2005.08.004","article-title":"Knowledge guided analysis of microarray data","volume":"39","author":"Fang","year":"2006","journal-title":"J. Biomed. 
Inform."},{"key":"2023013112044846800_B9","doi-asserted-by":"crossref","first-page":"891","DOI":"10.1038\/nrc1478","article-title":"Why do cancers have high aerobic glycolysis?","volume":"4","author":"Gatenby","year":"2004","journal-title":"Nat. Rev. Cancer"},{"key":"2023013112044846800_B10","doi-asserted-by":"crossref","first-page":"145","DOI":"10.1093\/bioinformatics\/18.suppl_1.S145","article-title":"Co-clustering of biological networks and gene expression data","volume":"18","author":"Hanisch","year":"2002","journal-title":"Bioinformatics"},{"key":"2023013112044846800_B11","doi-asserted-by":"crossref","first-page":"1259","DOI":"10.1093\/bioinformatics\/btl065","article-title":"Incorporating biological knowledge into distance-based clustering analysis of microarray gene expression data","volume":"22","author":"Huang","year":"2006","journal-title":"Bioinformatics"},{"key":"2023013112044846800_B12","article-title":"Semantic similarity based on corpus statistics and lexical taxonomy","volume-title":"Proceedings of the International Conference on Research in Computational Linguistics, ROCLING X","author":"Jiang","year":"1997"},{"key":"2023013112044846800_B13","doi-asserted-by":"crossref","first-page":"3587","DOI":"10.1093\/bioinformatics\/bti565","article-title":"Ontological analysis of gene expression data: current tools, limitations, and open problems","volume":"21","author":"Khatri","year":"2005","journal-title":"Bioinformatics"},{"key":"2023013112044846800_B14","doi-asserted-by":"crossref","first-page":"216","DOI":"10.1186\/1471-2105-7-216","article-title":"A factor analysis model for functional genomics","volume":"7","author":"Kustra","year":"2006","journal-title":"BMC Bioinformatics"},{"key":"2023013112044846800_B15","first-page":"1","article-title":"Data-fusion in clustering microarray data: balancing discovery and interpretability","volume":"1","author":"Kustra","year":"2007","journal-title":"IEEE\/ACM Trans. Comput. Biol. 
Bioinform."},{"key":"2023013112044846800_B16","first-page":"296","article-title":"An information-theoretic definition of similarity","volume-title":"Proceedings of the 15th International Conference on Machine Learning","author":"Lin","year":"1998"},{"key":"2023013112044846800_B17","doi-asserted-by":"crossref","first-page":"1275","DOI":"10.1093\/bioinformatics\/btg153","article-title":"Investigating semantic similarity measures across the Gene Ontology: the relationship between sequence and annotation","volume":"19","author":"Lord","year":"2003","journal-title":"Bioinformatics"},{"key":"2023013112044846800_B18","doi-asserted-by":"crossref","first-page":"697","DOI":"10.1007\/s00251-005-0036-7","article-title":"Polymorphism of the mouse gene for the interleukin 10 receptor alpha chain (Il10ra) and its association with the autoimmune phenotype","volume":"57","author":"Qi","year":"2005","journal-title":"Immunogenetics"},{"key":"2023013112044846800_B19","doi-asserted-by":"crossref","first-page":"95","DOI":"10.1613\/jair.514","article-title":"Semantic similarity in a taxonomy: an information-based measure and its application to problems of ambiguity in natural language","volume":"11","author":"Resnik","year":"1999","journal-title":"J. Artif. Intell. Res."},{"key":"2023013112044846800_B20","doi-asserted-by":"crossref","first-page":"302","DOI":"10.1186\/1471-2105-7-302","article-title":"A new measure for functional similarity of gene products based on Gene Ontology","volume":"7","author":"Schlicker","year":"2006","journal-title":"BMC Bioinformatics"},{"key":"2023013112044846800_B21","doi-asserted-by":"crossref","first-page":"2797","DOI":"10.4049\/jimmunol.141.8.2797","article-title":"Isolation of a cDNA encoding CD33, a differentiation antigen of myeloid progenitor cells","volume":"141","author":"Simmons","year":"1988","journal-title":"J. 
Immunol."},{"key":"2023013112044846800_B22","first-page":"1631","article-title":"A memetic co-clustering algorithm for gene expression profiles and biological annotation","volume":"2","author":"Speer","year":"2004","journal-title":"CIBCB"},{"key":"2023013112044846800_B23","doi-asserted-by":"crossref","first-page":"3273","DOI":"10.1091\/mbc.9.12.3273","article-title":"Comprehensive identification of cell cycle-regulated genes of the yeast Saccharomyces cerevisiae by microarray hybridization","volume":"9","author":"Spellman","year":"1998","journal-title":"Mol. Biol. Cell."},{"key":"2023013112044846800_B24","doi-asserted-by":"crossref","first-page":"R157","DOI":"10.1186\/gb-2007-8-8-r157","article-title":"An immune response gene expression module identifies a good prognosis subtype in estrogen receptor negative breast cancer","volume":"8","author":"Teschendorff","year":"2007","journal-title":"Genome Biol."},{"key":"2023013112044846800_B25","doi-asserted-by":"crossref","first-page":"32","DOI":"10.1186\/1471-2105-5-32","article-title":"Selection of informative clusters from hierarchical cluster tree with gene classes","volume":"5","author":"Toronen","year":"2004","journal-title":"BMC Bioinformatics"},{"key":"2023013112044846800_B26","doi-asserted-by":"crossref","first-page":"530","DOI":"10.1038\/415530a","article-title":"Gene expression profiling predicts clinical outcome of breast cancer","volume":"415","author":"van 't Veer","year":"2002","journal-title":"Nature"},{"key":"2023013112044846800_B27","first-page":"25","article-title":"Gene expression correlation and Gene Ontology-based similarity: an assessment of quantitative relationships","volume-title":"Proceedings of the 2004 IEEE Symposium on Computational Intelligence in Bioinformatics and Computational Biology (CIBCB 2004)","author":"Wang","year":"2004"},{"key":"2023013112044846800_B28","doi-asserted-by":"crossref","first-page":"454","DOI":"10.1152\/ajpregu.00060.2004","article-title":"Modulation of red cell 
glycolysis: interactions between vertebrate hemoglobins and cytoplasmic domains of band 3 red cell membrane proteins","volume":"287","author":"Weber","year":"2004","journal-title":"Am. J. Physiol. Regul. Integr. Comp Physiol."},{"key":"2023013112044846800_B29","article-title":"Comparing algorithms for clustering of expression data - how to assess gene clusters","volume-title":"Computational Systems Biology","author":"Yona","year":"2007"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/25\/14\/1789\/48992411\/bioinformatics_25_14_1789.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/25\/14\/1789\/48992411\/bioinformatics_25_14_1789.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,1,31]],"date-time":"2023-01-31T21:18:31Z","timestamp":1675199911000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/25\/14\/1789\/225734"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2009,6,3]]},"references-count":29,"journal-issue":{"issue":"14","published-print":{"date-parts":[[2009,7,15]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btp327","relation":{},"ISSN":["1367-4811","1367-4803"],"issn-type":[{"value":"1367-4811","type":"electronic"},{"value":"1367-4803","type":"print"}],"subject":[],"published-other":{"date-parts":[[2009,7,15]]},"published":{"date-parts":[[2009,6,3]]}}}