{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,12]],"date-time":"2026-03-12T04:19:18Z","timestamp":1773289158194,"version":"3.50.1"},"reference-count":40,"publisher":"Oxford University Press (OUP)","issue":"1","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2006,1,1]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Motivation: Hierarchical and relocation clustering (e.g. K-means and self-organizing maps) have been successful tools in the display and analysis of whole genome DNA microarray expression data. However, the results of hierarchical clustering are sensitive to outliers, and most relocation methods give results which are dependent on the initialization of the algorithm. Therefore, it is difficult to assess the significance of the results. We have developed a consensus clustering algorithm, where the final result is averaged over multiple clustering runs, giving a robust and reproducible clustering, capable of capturing small signal variations. The algorithm preserves valuable properties of hierarchical clustering, which is useful for visualization and interpretation of the results.<\/jats:p><jats:p>Results: We show for the first time that one can take advantage of multiple clustering runs in DNA microarray analysis by collecting re-occurring clustering patterns in a co-occurrence matrix. The results show that consensus clustering obtained from clustering multiple times with Variational Bayes Mixtures of Gaussians or K-means significantly reduces the classification error rate for a simulated dataset. The method is flexible and it is possible to find consensus clusters from different clustering algorithms. Thus, the algorithm can be used as a framework to test in a quantitative manner the homogeneity of different clustering algorithms. We compare the method with a number of state-of-the-art clustering methods. 
It is shown that the method is robust and gives low classification error rates for a realistic, simulated dataset. The algorithm is also demonstrated for real datasets. It is shown that more biological meaningful transcriptional patterns can be found without conservative statistical or fold-change exclusion of data.<\/jats:p><jats:p>Availability: \u00a0Matlab source code for the clustering algorithm ClusterLustre, and the simulated dataset for testing are available upon request from T.G. and O.W.<\/jats:p><jats:p>Contact: \u00a0tg@biocentrum.dtu.dk and owi@imm.dtu.dk<\/jats:p><jats:p>Supplementary information: \u00a0<\/jats:p>","DOI":"10.1093\/bioinformatics\/bti746","type":"journal-article","created":{"date-parts":[[2005,10,29]],"date-time":"2005-10-29T00:13:06Z","timestamp":1130544786000},"page":"58-67","source":"Crossref","is-referenced-by-count":40,"title":["Robust multi-scale clustering of large DNA microarray datasets with the consensus algorithm"],"prefix":"10.1093","volume":"22","author":[{"given":"Thomas","family":"Grotkj\u00e6r","sequence":"first","affiliation":[{"name":"Center for Microbial Biotechnology, BioCentrum-DTU, Building 223, Technical University of Denmark, DK-2800 Kgs. Lyngby, Denmark"}]},{"given":"Ole","family":"Winther","sequence":"additional","affiliation":[{"name":"Informatics and Mathematical Modelling, Building 321, Technical University of Denmark, DK-2800 Kgs. Lyngby, Denmark"}]},{"given":"Birgitte","family":"Regenberg","sequence":"additional","affiliation":[{"name":"Center for Microbial Biotechnology, BioCentrum-DTU, Building 223, Technical University of Denmark, DK-2800 Kgs. Lyngby, Denmark"}]},{"given":"Jens","family":"Nielsen","sequence":"additional","affiliation":[{"name":"Center for Microbial Biotechnology, BioCentrum-DTU, Building 223, Technical University of Denmark, DK-2800 Kgs. 
Lyngby, Denmark"}]},{"given":"Lars Kai","family":"Hansen","sequence":"additional","affiliation":[{"name":"Informatics and Mathematical Modelling, Building 321, Technical University of Denmark, DK-2800 Kgs. Lyngby, Denmark"}]}],"member":"286","published-online":{"date-parts":[[2005,10,27]]},"reference":[{"key":"2023012408334145900_b1","doi-asserted-by":"crossref","first-page":"25","DOI":"10.1038\/75556","article-title":"Gene Ontology: tool for the unification of biology. the gene ontology consortium","volume":"25","author":"Ashburner","year":"2000","journal-title":"Nat. Genet."},{"key":"2023012408334145900_b2","doi-asserted-by":"crossref","first-page":"1425","DOI":"10.1101\/gr.180801","article-title":"Creating the gene ontology resource: design and implementation\u2014the gene ontology consortium","volume":"11","author":"Ashburner","year":"2001","journal-title":"Genome Res."},{"key":"2023012408334145900_b3","first-page":"209","article-title":"A variational Bayesian framework for graphical models","volume-title":"Adv. Neur. Info. Proc. Sys.","author":"Attias","year":"2000"},{"issue":"Suppl. 1","key":"2023012408334145900_b4","doi-asserted-by":"crossref","first-page":"S22","DOI":"10.1093\/bioinformatics\/17.suppl_1.S22","article-title":"Fast optimal leaf ordering for hierarchical clustering","volume":"17","author":"Bar-Joseph","year":"2001","journal-title":"Bioinformatics"},{"key":"2023012408334145900_b5","doi-asserted-by":"crossref","first-page":"289","DOI":"10.1111\/j.2517-6161.1995.tb02031.x","article-title":"Controlling the false discovery rate: a practical and powerful approach to multiple testing","volume":"57","author":"Benjamini","year":"1995","journal-title":"J. R. Stat. 
Soc."},{"key":"2023012408334145900_b6","doi-asserted-by":"crossref","first-page":"32141","DOI":"10.1074\/jbc.M304478200","article-title":"Transcriptional, proteomic, and metabolic responses to lithium in galactose-grown yeast cells","volume":"278","author":"Bro","year":"2003","journal-title":"J. Biol. Chem."},{"key":"2023012408334145900_b7","doi-asserted-by":"crossref","first-page":"680","DOI":"10.1126\/science.278.5338.680","article-title":"Exploring the metabolic and genetic control of gene expression on a genomic scale","volume":"278","author":"DeRisi","year":"1997","journal-title":"Science"},{"key":"2023012408334145900_b8","first-page":"399","article-title":"Clustering protein sequence and structure space with infinite Gaussian mixture models","author":"Dubey","year":"2004"},{"key":"2023012408334145900_b9","doi-asserted-by":"crossref","first-page":"14863","DOI":"10.1073\/pnas.95.25.14863","article-title":"Cluster analysis and display of genome-wide expression patterns","volume":"95","author":"Eisen","year":"1998","journal-title":"Proc. Natl Acad. Sci. USA"},{"key":"2023012408334145900_b10","doi-asserted-by":"crossref","first-page":"219","DOI":"10.1016\/B978-155860797-2\/50012-3","article-title":"Clustering microarray data with evolutionary algorithms","volume-title":"Evolutionary Computation in Bioinformatics","author":"Falkenauer","year":"2003","edition":"1st edn."},{"key":"2023012408334145900_b11","first-page":"276","article-title":"Data clustering using evidence accumulation","author":"Fred","year":"2002"},{"key":"2023012408334145900_b12","first-page":"128","article-title":"Robust data clustering","author":"Fred","year":"2003"},{"key":"2023012408334145900_b13","doi-asserted-by":"crossref","first-page":"4241","DOI":"10.1091\/mbc.11.12.4241","article-title":"Genomic expression programs in the response of yeast cells to environmental changes","volume":"11","author":"Gasch","year":"2000","journal-title":"Mol. Biol. 
Cell"},{"key":"2023012408334145900_b14","doi-asserted-by":"crossref","first-page":"1817","DOI":"10.1093\/bioinformatics\/btg245","article-title":"Transformation and normalization of oligonucleotide microarray data","volume":"19","author":"Geller","year":"2003","journal-title":"Bioinformatics"},{"key":"2023012408334145900_b15","doi-asserted-by":"crossref","first-page":"275","DOI":"10.1093\/bioinformatics\/18.2.275","article-title":"Mixture modelling of gene expression data from microarray experiments","volume":"18","author":"Ghosh","year":"2002","journal-title":"Bioinformatics"},{"key":"2023012408334145900_b16","doi-asserted-by":"crossref","first-page":"1574","DOI":"10.1101\/gr.397002","article-title":"Judging the quality of gene expression-based clustering methods using gene annotation","volume":"12","author":"Gibbons","year":"2002","journal-title":"Genome Res."},{"key":"2023012408334145900_b17","first-page":"219","article-title":"Statistical issues in the clustering of gene expression data","volume":"12","author":"Goldstein","year":"2002","journal-title":"Stat. Sin."},{"key":"2023012408334145900_b18","doi-asserted-by":"crossref","first-page":"673","DOI":"10.2174\/1389202043348472","article-title":"Enhancing yeast transcription analysis through integration of heterogenous data","volume":"4","author":"Grotkj\u00e6r","year":"2004","journal-title":"Curr. 
Genomics"},{"key":"2023012408334145900_b19","first-page":"3494","article-title":"Modeling text with generalizable Gaussian mixtures","author":"Hansen","year":"2000"},{"key":"2023012408334145900_b20","article-title":"The Elements of Statistical Learning \u2014 Data Mining, Inference, and Prediction","volume-title":"Springer Series in Statistics","author":"Hastie","year":"2001"},{"key":"2023012408334145900_b21","doi-asserted-by":"crossref","first-page":"109","DOI":"10.1016\/S0092-8674(00)00015-5","article-title":"Functional discovery via a compendium of expression profiles","volume":"102","author":"Hughes","year":"2000","journal-title":"Cell"},{"key":"2023012408334145900_b22","doi-asserted-by":"crossref","first-page":"107","DOI":"10.1152\/physiolgenomics.00139.2003","article-title":"Transcriptome profiling of a Saccharomyces cerevisiae mutant with a constitutively activated Ras\/cAMP pathway","volume":"16","author":"Jones","year":"2003","journal-title":"Physiol. Genomics"},{"key":"2023012408334145900_b23","doi-asserted-by":"crossref","first-page":"125","DOI":"10.1165\/ajrcmb.27.2.f247","article-title":"Practical approaches to analyzing results of microarray experiments","volume":"27","author":"Kaminski","year":"2002","journal-title":"Am. J. Respir. Cell Mol. 
Biol."},{"key":"2023012408334145900_b24","first-page":"283","article-title":"Interpreting and extending classical agglomerative clustering algorithms using a model-based approach","author":"Kamvar","year":"2002"},{"key":"2023012408334145900_b25","article-title":"Information Theory, Inference and Learning Algorithms","author":"MacKay","year":"2003"},{"key":"2023012408334145900_b26","doi-asserted-by":"crossref","first-page":"413","DOI":"10.1093\/bioinformatics\/18.3.413","article-title":"A mixture model-based approach to the clustering of microarray expression data","volume":"18","author":"McLachlan","year":"2002","journal-title":"Bioinformatics"},{"key":"2023012408334145900_b27","doi-asserted-by":"crossref","first-page":"91","DOI":"10.1023\/A:1023949509487","article-title":"Consensus clustering: a resampling-based method for class discovery and visualization of gene expression microarray data","volume":"52","author":"Monti","year":"2003","journal-title":"Mach. Learn."},{"key":"2023012408334145900_b28","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1186\/gb-2002-3-2-research0009","article-title":"Model-based cluster analysis of microarray gene-expression data","volume":"3","author":"Pan","year":"2002","journal-title":"Genome Biol."},{"key":"2023012408334145900_b29","doi-asserted-by":"crossref","first-page":"9121","DOI":"10.1073\/pnas.132656399","article-title":"Cluster analysis of gene expression dynamics","volume":"99","author":"Ramoni","year":"2002","journal-title":"Proc. Natl Acad. Sci. 
USA"},{"key":"2023012408334145900_b30","doi-asserted-by":"crossref","first-page":"368","DOI":"10.1093\/bioinformatics\/btf877","article-title":"Identifying differentially expressed genes using false discovery rate controlling procedures","volume":"19","author":"Reiner","year":"2003","journal-title":"Bioinformatics"},{"key":"2023012408334145900_b31","doi-asserted-by":"crossref","first-page":"557","DOI":"10.1089\/106652701753307485","article-title":"A model for measurement error for gene expression arrays","volume":"8","author":"Rocke","year":"2001","journal-title":"J. Comput. Biol."},{"key":"2023012408334145900_b32","doi-asserted-by":"crossref","first-page":"966","DOI":"10.1093\/bioinformatics\/btg107","article-title":"Approximate variance-stabilising transformations for gene-expression microarray data","volume":"19","author":"Rocke","year":"2003","journal-title":"Bioinformatics"},{"key":"2023012408334145900_b33","doi-asserted-by":"crossref","first-page":"1787","DOI":"10.1093\/bioinformatics\/btg232","article-title":"CLICK and EXPANDER: a system for clustering and visualizing gene expression data","volume":"19","author":"Sharan","year":"2003","journal-title":"Bioinformatics"},{"key":"2023012408334145900_b34","doi-asserted-by":"crossref","first-page":"735","DOI":"10.1093\/bioinformatics\/18.5.735","article-title":"Adaptive quality-based clustering of gene expression profiles","volume":"18","author":"Smet","year":"2002","journal-title":"Bioinformatics"},{"key":"2023012408334145900_b35","first-page":"583","article-title":"Cluster ensembles\u2014a knowledge reuse framework for combining multiple partitions","volume":"3","author":"Strehl","year":"2002","journal-title":"J. Mach. Learn. 
Res."},{"key":"2023012408334145900_b36","doi-asserted-by":"crossref","first-page":"2907","DOI":"10.1073\/pnas.96.6.2907","article-title":"Interpreting patterns of gene expression with self-organizing maps: methods and application to hematopoietic differentiation","volume":"96","author":"Tamayo","year":"1999","journal-title":"Proc. Natl Acad. Sci. USA"},{"key":"2023012408334145900_b37","doi-asserted-by":"crossref","first-page":"281","DOI":"10.1038\/10343","article-title":"Systematic determination of genetic network architecture","volume":"22","author":"Tavazoie","year":"1999","journal-title":"Nat. Genet."},{"key":"2023012408334145900_b38","doi-asserted-by":"crossref","first-page":"5116","DOI":"10.1073\/pnas.091062498","article-title":"Significance analysis of microarrays applied to the ionizing radiation response","volume":"98","author":"Tusher","year":"2001","journal-title":"Proc. Natl Acad. Sci. USA"},{"key":"2023012408334145900_b39","doi-asserted-by":"crossref","first-page":"659","DOI":"10.1093\/bioinformatics\/btg046","article-title":"MatArray: a Matlab toolbox for microarray data","volume":"19","author":"Venet","year":"2003","journal-title":"Bioinformatics"},{"key":"2023012408334145900_b40","doi-asserted-by":"crossref","first-page":"236","DOI":"10.1080\/01621459.1963.10500845","article-title":"Hierarchical grouping to optimize an objective function","volume":"58","author":"Ward","year":"1963","journal-title":"J. Am. Stat. 
Assoc."}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/22\/1\/58\/48838970\/bioinformatics_22_1_58.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/22\/1\/58\/48838970\/bioinformatics_22_1_58.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,1,5]],"date-time":"2025-01-05T07:22:09Z","timestamp":1736061729000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/22\/1\/58\/218897"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2005,10,27]]},"references-count":40,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2006,1,1]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/bti746","relation":{},"ISSN":["1367-4811","1367-4803"],"issn-type":[{"value":"1367-4811","type":"electronic"},{"value":"1367-4803","type":"print"}],"subject":[],"published-other":{"date-parts":[[2006,1,1]]},"published":{"date-parts":[[2005,10,27]]}}}