{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,28]],"date-time":"2026-03-28T10:34:00Z","timestamp":1774694040505,"version":"3.50.1"},"reference-count":22,"publisher":"Oxford University Press (OUP)","issue":"22","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2005,11,15]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>Motivation: Classification of biological samples by microarrays is a topic of much interest. A number of methods have been proposed and successfully applied to this problem. It has recently been shown that classification by nearest centroids provides an accurate predictor that may outperform much more complicated methods. The \u2018Prediction Analysis of Microarrays\u2019 (PAM) approach is one such example, which the authors strongly motivate by its simplicity and interpretability. In this spirit, I seek to assess the performance of classifiers simpler than even PAM.<\/jats:p>\n               <jats:p>Results: I surprisingly show that the modified t-statistics and shrunken centroids employed by PAM tend to increase misclassification error when compared with their simpler counterparts. Based on these observations, I propose a classification method called \u2018Classification to Nearest Centroids\u2019 (ClaNC). ClaNC ranks genes by standard t-statistics, does not shrink centroids and uses a class-specific gene-selection procedure. Because of these modifications, ClaNC is arguably simpler and easier to interpret than PAM, and it can be viewed as a traditional nearest centroid classifier that uses specially selected genes. 
I demonstrate that ClaNC error rates tend to be significantly less than those for PAM, for a given number of active genes.<\/jats:p>\n               <jats:p>Availability: Point-and-click software is freely available at<\/jats:p>\n               <jats:p>Contact: \u00a0adabney@u.washington.edu<\/jats:p>\n               <jats:p>Supplementary Information: \u00a0<\/jats:p>","DOI":"10.1093\/bioinformatics\/bti681","type":"journal-article","created":{"date-parts":[[2005,9,21]],"date-time":"2005-09-21T03:03:34Z","timestamp":1127271814000},"page":"4148-4154","source":"Crossref","is-referenced-by-count":104,"title":["Classification of microarrays to nearest centroids"],"prefix":"10.1093","volume":"21","author":[{"given":"Alan R.","family":"Dabney","sequence":"first","affiliation":[{"name":"Department of Biostatistics, University of Washington \u00a0 Seattle 98195, USA"}]}],"member":"286","published-online":{"date-parts":[[2005,9,20]]},"reference":[{"key":"2023061007105057900_b1","doi-asserted-by":"crossref","first-page":"503","DOI":"10.1038\/35000501","article-title":"Distinct types of diffuse large B-cell lymphoma identified by gene expression profiling","volume":"403","author":"Alizadeh","year":"2000","journal-title":"Nature"},{"key":"2023061007105057900_b2","doi-asserted-by":"crossref","first-page":"6562","DOI":"10.1073\/pnas.102102699","article-title":"Selection bias in gene extraction on the basis of microarray gene-expression data","volume":"99","author":"Ambroise","year":"2002","journal-title":"Proc. Natl Acad. Sci. 
USA"},{"key":"2023061007105057900_b3","doi-asserted-by":"crossref","first-page":"5","DOI":"10.1023\/A:1010933404324","article-title":"Random forests","volume":"45","author":"Breiman","year":"2001","journal-title":"Machine Learning"},{"key":"2023061007105057900_b4","volume-title":"Classification and Regression Trees","author":"Breiman","year":"1984"},{"key":"2023061007105057900_b5","doi-asserted-by":"crossref","first-page":"425","DOI":"10.1093\/biomet\/81.3.425","article-title":"Ideal spatial adaptation by wavelet shrinkage","volume":"81","author":"Donoho","year":"1994","journal-title":"Biometrika"},{"key":"2023061007105057900_b6","article-title":"Comparison of discriminant methods for the classification of tumors using gene expression data","author":"Dudoit","year":"2000"},{"key":"2023061007105057900_b7","doi-asserted-by":"crossref","first-page":"77","DOI":"10.1198\/016214502753479248","article-title":"Comparison of discriminant methods for the classification of tumors using gene expression data","volume":"97","author":"Dudoit","year":"2002","journal-title":"J. Am. Stat. Assoc."},{"key":"2023061007105057900_b8","doi-asserted-by":"crossref","first-page":"531","DOI":"10.1126\/science.286.5439.531","article-title":"Molecular classification of cancer: Class discovery and class prediction by gene expression monitoring","volume":"286","author":"Golub","year":"1999","journal-title":"Science"},{"key":"2023061007105057900_b9","doi-asserted-by":"crossref","DOI":"10.1007\/978-0-387-21606-5","volume-title":"The Elements of Statistical Learning: Data Mining, Inference and Prediction","author":"Hastie","year":"2001"},{"key":"2023061007105057900_b10","doi-asserted-by":"crossref","first-page":"673","DOI":"10.1038\/89044","article-title":"Classification and diagnostic prediction of cancers using gene expression profiling and artificial neural networks","volume":"7","author":"Khan","year":"2001","journal-title":"Nat. 
Med."},{"key":"2023061007105057900_b11","doi-asserted-by":"crossref","first-page":"869","DOI":"10.1016\/j.csda.2004.03.017","article-title":"An extensive comparison of recent classification tools applied to microarray data","volume":"48","author":"Lee","year":"2005","journal-title":"Comput. Stat. Data Anal."},{"key":"2023061007105057900_b12","volume-title":"Multivariate Analysis","author":"Mardia","year":"1979"},{"key":"2023061007105057900_b13","doi-asserted-by":"crossref","first-page":"407","DOI":"10.1016\/j.csda.2003.08.001","article-title":"On partial least squares dimension reduction for microarray-based classification: a simulation study","volume":"46","author":"Nguyen","year":"2004","journal-title":"Comput. Stat. Data Anal."},{"key":"2023061007105057900_b14","doi-asserted-by":"crossref","first-page":"15149","DOI":"10.1073\/pnas.211566398","article-title":"Multiclass cancer diagnosis using tumor gene expression signature","volume":"98","author":"Ramaswamy","year":"2001","journal-title":"Proc. Natl Acad. Sci. USA"},{"key":"2023061007105057900_b15","doi-asserted-by":"crossref","first-page":"227","DOI":"10.1038\/73432","article-title":"Systematic variation in gene expression patterns in human cancer cell lines","volume":"24","author":"Ross","year":"2000","journal-title":"Nat. Genet."},{"key":"2023061007105057900_b16","doi-asserted-by":"crossref","first-page":"467","DOI":"10.1126\/science.270.5235.467","article-title":"Quantitative monitoring of gene expression patterns with a complementary DNA microarray","volume":"270","author":"Schena","year":"1995","journal-title":"Science"},{"key":"2023061007105057900_b17","doi-asserted-by":"crossref","first-page":"236","DOI":"10.1038\/73439","article-title":"A gene expression database for the molecular pharmacology of cancer","volume":"24","author":"Scherf","year":"2000","journal-title":"Nat. 
Genet."},{"key":"2023061007105057900_b18","first-page":"197","article-title":"Inadmissibility of the usual estimator for the mean of a multivariate distribution","volume":"1","author":"Stein","year":"1956","journal-title":"Proc. Third Berkeley Symp. Math. Statist. Prob."},{"key":"2023061007105057900_b19","doi-asserted-by":"crossref","first-page":"6567","DOI":"10.1073\/pnas.082099299","article-title":"Diagnosis of multiple cancer types by shrunken centroids of gene expression","volume":"99","author":"Tibshirani","year":"2002","journal-title":"Proc. Natl Acad. Sci."},{"key":"2023061007105057900_b20","doi-asserted-by":"crossref","first-page":"104","DOI":"10.1214\/ss\/1056397488","article-title":"Class prediction by nearest shrunken centroids, with applications to DNA microarrays","volume":"18","author":"Tibshirani","year":"2003","journal-title":"Stat. Sci."},{"key":"2023061007105057900_b21","doi-asserted-by":"crossref","first-page":"520","DOI":"10.1093\/bioinformatics\/17.6.520","article-title":"Missing value estimation methods for DNA microarrays","volume":"17","author":"Troyanskaya","year":"2001","journal-title":"Bioinformatics"},{"key":"2023061007105057900_b22","doi-asserted-by":"crossref","first-page":"427","DOI":"10.1093\/biostatistics\/kxg046","article-title":"Classification of gene microarrays by penalized logistic 
regression","volume":"5","author":"Zhu","year":"2004","journal-title":"Biostatistics"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/21\/22\/4148\/50566161\/bioinformatics_21_22_4148.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/21\/22\/4148\/50566161\/bioinformatics_21_22_4148.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,6,10]],"date-time":"2023-06-10T07:11:23Z","timestamp":1686381083000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/21\/22\/4148\/194954"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2005,9,20]]},"references-count":22,"journal-issue":{"issue":"22","published-print":{"date-parts":[[2005,11,15]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/bti681","relation":{},"ISSN":["1367-4811","1367-4803"],"issn-type":[{"value":"1367-4811","type":"electronic"},{"value":"1367-4803","type":"print"}],"subject":[],"published-other":{"date-parts":[[2005,11,15]]},"published":{"date-parts":[[2005,9,20]]}}}