{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2024,8,3]],"date-time":"2024-08-03T11:24:24Z","timestamp":1722684264671},"reference-count":39,"publisher":"Oxford University Press (OUP)","issue":"3","license":[{"start":{"date-parts":[[2016,10,2]],"date-time":"2016-10-02T00:00:00Z","timestamp":1475366400000},"content-version":"vor","delay-in-days":2489,"URL":"http:\/\/creativecommons.org\/licenses\/by-nc\/2.0\/uk\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2010,2,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>Motivation: Various clustering methods have been applied to microarray gene expression data for identifying genes with similar expression profiles. As the biological annotation data accumulated, more and more genes have been organized into functional categories. Functionally related genes may be regulated by common cellular signals, thus likely to be co-expressed. Consequently, utilizing the rapidly increasing functional annotation resources such as Gene Ontology (GO) to improve the performance of clustering methods is of great interest. On the opposite side of clustering, there are genes that have distinct expression profiles and do not co-express with other genes. Identification of these scattered genes could enhance the performance of clustering methods.<\/jats:p>\n               <jats:p>Results: We developed a new clustering algorithm, Dynamically Weighted Clustering with Noise set (DWCN), which makes use of gene annotation information and allows for a set of scattered genes, the noise set, to be left out of the main clusters. 
We tested the DWCN method and contrasted its results with those obtained using several common clustering techniques on a simulated dataset as well as on two public datasets: the Stanford yeast cell-cycle gene expression data, and a gene expression dataset for a group of genetically different yeast segregants.<\/jats:p>\n               <jats:p>Conclusion: Our method produces clusters with more consistent functional annotations and more coherent expression patterns than existing clustering techniques.<\/jats:p>\n               <jats:p>Contact: \u00a0yshen@stat.ucla.edu<\/jats:p>\n               <jats:p>Supplementary information: \u00a0Supplementary data are available at Bioinformatics online.<\/jats:p>","DOI":"10.1093\/bioinformatics\/btp671","type":"journal-article","created":{"date-parts":[[2009,12,10]],"date-time":"2009-12-10T01:46:35Z","timestamp":1260409595000},"page":"341-347","source":"Crossref","is-referenced-by-count":6,"title":["Dynamically weighted clustering with noise set"],"prefix":"10.1093","volume":"26","author":[{"given":"Yijing","family":"Shen","sequence":"first","affiliation":[{"name":"1 Department of Statistics at University of California, Los Angeles, CA 90095, 2 Department of Biostatistics, Genetics, University of North Carolina, NC 27516, USA and 3 Institute of Statistical Science, Academia Sinica, Taipei, Taiwan, Republic of China"}]},{"given":"Wei","family":"Sun","sequence":"additional","affiliation":[{"name":"1 Department of Statistics at University of California, Los Angeles, CA 90095, 2 Department of Biostatistics, Genetics, University of North Carolina, NC 27516, USA and 3 Institute of Statistical Science, Academia Sinica, Taipei, Taiwan, Republic of China"}]},{"given":"Ker-Chau","family":"Li","sequence":"additional","affiliation":[{"name":"1 Department of Statistics at University of California, Los Angeles, CA 90095, 2 Department of Biostatistics, Genetics, University of North Carolina, NC 27516, USA and 3 Institute of Statistical 
Science, Academia Sinica, Taipei, Taiwan, Republic of China"},{"name":"1 Department of Statistics at University of California, Los Angeles, CA 90095, 2 Department of Biostatistics, Genetics, University of North Carolina, NC 27516, USA and 3 Institute of Statistical Science, Academia Sinica, Taipei, Taiwan, Republic of China"}]}],"member":"286","published-online":{"date-parts":[[2009,12,9]]},"reference":[{"key":"2023012511001054100_B1","doi-asserted-by":"crossref","first-page":"59","DOI":"10.1145\/1014052.1014062","article-title":"A probabilistic framework for semi-supervised clustering","volume-title":"Proceedings of the Tenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining","author":"Basu","year":"2004"},{"issue":"Suppl. 2","key":"2023012511001054100_B2","doi-asserted-by":"crossref","first-page":"S7","DOI":"10.1186\/1471-2105-8-S2-S7","article-title":"Model order selection for bio-molecular data clustering","volume":"8","author":"Bertoni","year":"2007","journal-title":"BMC Bioinformatics"},{"issue":"Suppl. 2","key":"2023012511001054100_B3","doi-asserted-by":"crossref","first-page":"S4","DOI":"10.1186\/1471-2105-9-S2-S4","article-title":"Discovering multi-level structures in bio-molecular data through the Bernstein inequality","volume":"9","author":"Bertoni","year":"2008","journal-title":"BMC Bioinformatics"},{"key":"2023012511001054100_B4","doi-asserted-by":"crossref","first-page":"1572","DOI":"10.1073\/pnas.0408709102","article-title":"The landscape of genetic complexity across 5,700 gene expression traits in yeast","volume":"102","author":"Brem","year":"2005","journal-title":"Proc. Natl Acad. Sci. 
USA"},{"key":"2023012511001054100_B5","doi-asserted-by":"crossref","first-page":"701","DOI":"10.1038\/nature03865","article-title":"Genetic interactions between polymorphisms that affect gene expression in yeast","volume":"436","author":"Brem","year":"2005","journal-title":"Nature"},{"key":"2023012511001054100_B6","doi-asserted-by":"crossref","first-page":"519","DOI":"10.1109\/TITB.2006.872073","article-title":"Application of simulated annealing to the biclustering of gene expression data","volume":"10","author":"Bryan","year":"2006","journal-title":"IEEE Trans. Inf. Technol. Biomed."},{"key":"2023012511001054100_B7","doi-asserted-by":"crossref","DOI":"10.1109\/HPCASIA.2005.25","article-title":"Biclustering of gene expression data by simulated annealing","volume-title":"Proceedings of the Eighth International Conference on High-Performance Computing in Asia-Pacific Region.","author":"Chakraborty","year":"2005"},{"key":"2023012511001054100_B8","doi-asserted-by":"crossref","first-page":"687","DOI":"10.1081\/BIP-200025659","article-title":"A knowledge-based clustering algorithm driven by Gene Ontology","volume":"14","author":"Cheng","year":"2004","journal-title":"J. Biopharmaceut. Statist."},{"key":"2023012511001054100_B9","doi-asserted-by":"crossref","first-page":"411","DOI":"10.1586\/14737159.3.4.411","article-title":"Cancer diagnosis using proteomic patterns","volume":"3","author":"Conrads","year":"2003","journal-title":"Expert Rev. Mol. 
Diagnost."},{"key":"2023012511001054100_B10","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1186\/gb-2002-3-7-research0036","article-title":"A prediction-based resampling method for estimating the number of clusters in a dataset","volume":"3","author":"Dudoit","year":"2002","journal-title":"Genome Biol."},{"key":"2023012511001054100_B11","doi-asserted-by":"crossref","first-page":"14863","DOI":"10.1073\/pnas.95.25.14863","article-title":"Cluster analysis and display of genome-wide expression patterns","volume":"95","author":"Eisen","year":"1998","journal-title":"Proc. Natl Acad. Sci. USA"},{"key":"2023012511001054100_B12","doi-asserted-by":"crossref","first-page":"906","DOI":"10.1093\/bioinformatics\/16.10.906","article-title":"Support vector machine classification and validation of cancer tissue samples using microarray expression data","volume":"16","author":"Furey","year":"2000","journal-title":"Bioinformatics"},{"key":"2023012511001054100_B13","first-page":"18","article-title":"Singular value decomposition regression models for classification of tumors from microarray experiments","volume":"98","author":"Ghosh","year":"2002","journal-title":"Pac. Symp. Biocomput."},{"key":"2023012511001054100_B14","first-page":"1001","article-title":"A unified framework for model-based clustering","volume":"4","author":"Ghosh","year":"2003","journal-title":"J. Machine Learn. Res."},{"key":"2023012511001054100_B15","doi-asserted-by":"crossref","first-page":"531","DOI":"10.1126\/science.286.5439.531","article-title":"Molecular classification of cancer: class discovery and class prediction by gene expression monitoring","volume":"286","author":"Golub","year":"1999","journal-title":"Science"},{"issue":"Suppl. 
1","key":"2023012511001054100_B16","doi-asserted-by":"crossref","first-page":"S145","DOI":"10.1093\/bioinformatics\/18.suppl_1.S145","article-title":"Co-clustering of biological networks and gene expression data","volume":"18","author":"Hanisch","year":"2002","journal-title":"Bioinformatics"},{"key":"2023012511001054100_B17","doi-asserted-by":"crossref","DOI":"10.1186\/gb-2000-1-2-research0003","article-title":"\u2018Gene shaving\u2019 as a method for identifying distinct sets of genes with similar expression patterns","volume":"1","author":"Hastie","year":"2000","journal-title":"Genome Biol."},{"key":"2023012511001054100_B18","doi-asserted-by":"crossref","first-page":"832","DOI":"10.1109\/34.709601","article-title":"The random subspace method for constructing decision forests","volume":"20","author":"Ho","year":"1998","journal-title":"IEEE Trans. Pattern Anal. Machine Intell."},{"key":"2023012511001054100_B19","doi-asserted-by":"crossref","first-page":"193","DOI":"10.1007\/BF01908075","article-title":"Comparing partitions","volume":"2","author":"Hubert","year":"1985","journal-title":"J. Classif."},{"key":"2023012511001054100_B20","doi-asserted-by":"crossref","first-page":"264","DOI":"10.1145\/331499.331504","article-title":"Data clustering: a review","volume":"31","author":"Jain","year":"1999","journal-title":"ACM Comput. 
Surveys"},{"key":"2023012511001054100_B21","doi-asserted-by":"crossref","first-page":"719","DOI":"10.1093\/bioinformatics\/btm563","article-title":"Defining clusters from a hierarchical cluster tree: the Dynamic Tree Cut package for R","volume":"24","author":"Langfelder","year":"2008","journal-title":"Bioinformatics"},{"key":"2023012511001054100_B22","doi-asserted-by":"crossref","first-page":"526","DOI":"10.1093\/nar\/gkn972","article-title":"Patterns of co-expression for protein complexes by size in Saccharomyces cerevisiae","volume":"37","author":"Liu","year":"2008","journal-title":"Nucleic Acids Res."},{"key":"2023012511001054100_B23","volume-title":"Mixture Models: Inference and Applications to Clustering.","author":"McLachlan","year":"1988"},{"key":"2023012511001054100_B24","doi-asserted-by":"crossref","first-page":"83","DOI":"10.1038\/47048","article-title":"A combined algorithm for genome-wide prediction of protein function","volume":"402","author":"Marcotte","year":"1999","journal-title":"Nature"},{"key":"2023012511001054100_B25","doi-asserted-by":"crossref","first-page":"795","DOI":"10.1093\/bioinformatics\/btl011","article-title":"Incorporating gene functions as priors in model-based clustering of microarray gene expression data","volume":"22","author":"Pan","year":"2006","journal-title":"Bioinformatics"},{"key":"2023012511001054100_B26","doi-asserted-by":"crossref","first-page":"846","DOI":"10.1080\/01621459.1971.10482356","article-title":"Objective criteria for the evaluation of clustering methods","volume":"66","author":"Rand","year":"1971","journal-title":"J. Am. Statist. Assoc."},{"key":"2023012511001054100_B27","doi-asserted-by":"crossref","first-page":"166","DOI":"10.1038\/ng1165","article-title":"Module networks: identifying regulatory modules and their condition-specific regulators from gene expression data","volume":"34","author":"Segal","year":"2003","journal-title":"Nat. 
Genet."},{"key":"2023012511001054100_B28","doi-asserted-by":"crossref","first-page":"68","DOI":"10.1038\/nm0102-68","article-title":"Diffuse large B-cell lymphoma outcome prediction by gene-expression profiling and supervised machine learning","volume":"8","author":"Shipp","year":"2002","journal-title":"Nat. Med."},{"key":"2023012511001054100_B29","doi-asserted-by":"crossref","first-page":"203","DOI":"10.1016\/S1535-6108(02)00030-2","article-title":"Gene expression correlates of clinical prostate cancer behavior","volume":"1","author":"Singh","year":"2002","journal-title":"Cancer Cell"},{"key":"2023012511001054100_B30","doi-asserted-by":"crossref","first-page":"36","DOI":"10.1186\/1471-2105-4-36","article-title":"Cluster stability scores for microarray data in cancer studies","volume":"4","author":"Smolkin","year":"2003","journal-title":"BMC Bioinformatics"},{"key":"2023012511001054100_B31","doi-asserted-by":"crossref","first-page":"3273","DOI":"10.1091\/mbc.9.12.3273","article-title":"Comprehensive identification of cell cycle\u2014regulated genes of the yeast Saccharomyces cerevisiae by microarray hybridization","volume":"9","author":"Spellman","year":"1998","journal-title":"Mol. Biol. Cell"},{"key":"2023012511001054100_B32","doi-asserted-by":"crossref","first-page":"2907","DOI":"10.1073\/pnas.96.6.2907","article-title":"Interpreting patterns of gene expression with self-organizing maps: Methods and application to hematopoietic differentiation","volume":"96","author":"Tamayo","year":"1999","journal-title":"Proc. Natl Acad. Sci. 
USA"},{"key":"2023012511001054100_B33","doi-asserted-by":"crossref","first-page":"281","DOI":"10.1038\/10343","article-title":"Systematic determination of genetic network architecture","volume":"22","author":"Tavazoie","year":"1999","journal-title":"Nature Genet."},{"key":"2023012511001054100_B34","doi-asserted-by":"crossref","first-page":"2405","DOI":"10.1093\/bioinformatics\/btl406","article-title":"Evaluation and comparison of gene clustering methods in microarray analysis","volume":"22","author":"Thalamuthu","year":"2006","journal-title":"Bioinformatics"},{"key":"2023012511001054100_B35","article-title":"Cluster validation by prediction strength","volume-title":"Technical Report.","author":"Tibshirani","year":"2001"},{"key":"2023012511001054100_B36","doi-asserted-by":"crossref","first-page":"520","DOI":"10.1093\/bioinformatics\/17.6.520","article-title":"Missing value estimation methods for DNA microarrays","volume":"17","author":"Troyanskaya","year":"2001","journal-title":"Bioinformatics"},{"key":"2023012511001054100_B37","doi-asserted-by":"crossref","first-page":"2247","DOI":"10.1093\/bioinformatics\/btm320","article-title":"Penalized and weighted K-means for clustering with scattered objects and prior information in high-throughput biological data","volume":"23","author":"Tseng","year":"2007","journal-title":"Bioinformatics"},{"key":"2023012511001054100_B38","doi-asserted-by":"crossref","first-page":"10","DOI":"10.1111\/j.0006-341X.2005.031032.x","article-title":"Tight clustering: a resampling-based approach for identifying stable and tight patterns in data","volume":"61","author":"Tseng","year":"2005","journal-title":"Biometrics"},{"key":"2023012511001054100_B39","first-page":"997","article-title":"Model-based clustering and data transformations for gene expression 
data","volume":"17","author":"Yeung","year":"2001","journal-title":"Bioinformatics"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/26\/3\/341\/48860418\/bioinformatics_26_3_341.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/26\/3\/341\/48860418\/bioinformatics_26_3_341.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,1,25]],"date-time":"2023-01-25T11:01:21Z","timestamp":1674644481000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/26\/3\/341\/214679"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2009,12,9]]},"references-count":39,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2010,2,1]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btp671","relation":{},"ISSN":["1367-4811","1367-4803"],"issn-type":[{"value":"1367-4811","type":"electronic"},{"value":"1367-4803","type":"print"}],"subject":[],"published-other":{"date-parts":[[2010,2,1]]},"published":{"date-parts":[[2009,12,9]]}}}